Kaggle is a data science community website that hosts numerous guides and datasets and runs online competitions. Recently they added Community Competitions, a feature that lets anybody host a competition on Kaggle's platform.
Kaggle is a fantastic resource and I encourage you to explore it. This assignment will be run on Kaggle's platform.
The assignment is semi-private in that you need a private key to access the competition. This key is listed on the main page on Moodle for this module.
Your task is to analyse the given dataset and construct a suitable classifier. The performance of your classifier is computed based on your uploaded predictions.
The steps are as follows.
Download train.csv and test.csv and place them into the subfolder orig.
Split train.csv into your own train and test datasets.
Generate predictions for test.csv and upload them to Kaggle.
Since your Kaggle and Moodle accounts are separate, you will also need to upload to Moodle:
submission.csv: your predictions in the format required by Kaggle.
The grade for this assignment consists of 60% on model performance and 40% on the workflow in the uploaded notebooks. This split is meant to reward people for trying reasonable techniques that might not improve the final model score.
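The workflow described above (split train.csv locally, fit a model, predict on test.csv, write submission.csv) might look roughly like the sketch below. It is only an illustration under several assumptions: the column names id and target are hypothetical, the tiny in-memory CSV data stands in for the real files in orig, and the majority-class "model" is a placeholder for whatever classifier you actually build. It uses only the Python standard library so that it runs anywhere.

```python
import csv
import io
import random
from collections import Counter

# Hypothetical column names; check the real headers in orig/train.csv.
ID_COL = "id"
TARGET_COL = "target"


def read_rows(handle):
    """Read a CSV file object into a list of dicts, one per row."""
    return list(csv.DictReader(handle))


def split_rows(rows, holdout_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and split off a local hold-out set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def fit_majority(rows):
    """Placeholder 'model': remember the most common training label."""
    return Counter(r[TARGET_COL] for r in rows).most_common(1)[0][0]


def accuracy(label, rows):
    """Fraction of rows whose true label equals the constant prediction."""
    return sum(r[TARGET_COL] == label for r in rows) / len(rows)


def write_submission(label, test_rows, handle):
    """Write a header plus one (id, prediction) row per test row."""
    writer = csv.writer(handle)
    writer.writerow([ID_COL, TARGET_COL])
    for r in test_rows:
        writer.writerow([r[ID_COL], label])


# Tiny in-memory stand-ins for orig/train.csv and orig/test.csv.
train_csv = io.StringIO("id,x,target\n" + "".join(
    f"{i},{i * 0.5},{1 if i % 3 else 0}\n" for i in range(30)))
test_csv = io.StringIO("id,x\n100,1.0\n101,2.0\n102,3.0\n")

# Split the labelled data, fit on one part, check on the held-out part.
train_rows, holdout_rows = split_rows(read_rows(train_csv))
model = fit_majority(train_rows)
holdout_accuracy = accuracy(model, holdout_rows)

# Predict for the unlabelled test rows and write the submission.
submission = io.StringIO()
write_submission(model, read_rows(test_csv), submission)
```

With the real data you would replace the StringIO objects with files, for example open("orig/train.csv", newline="") for reading and open("submission.csv", "w", newline="") for writing, and adjust the header to whatever the competition page specifies. The local hold-out score gives you an estimate of performance before you spend a Kaggle submission.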