This directory contains the Python source code, requirements, and data files for training and evaluating a machine learning model using scikit-learn.
The ./src directory contains the Python code and data files, and the ./requirements.txt file contains the Python dependencies.
Using your preferred Python environment isolation mechanism, install the dependencies listed in ./requirements.txt.
The ./src directory contains a file called exercise.py, which trains a machine learning model for evaluation against a test dataset.
This file contains a number of bugs, and the CSV files may contain data quality issues which prevent the model from being trained and evaluated successfully.
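One way to start looking for data quality issues in the CSV files is a quick structural summary before any training code runs. The sketch below is only an illustration, not part of the exercise code: the function name summarise_csv is hypothetical, it uses only the standard library csv module, and it assumes each file has a header row. It makes no assumptions about the column names in the provided data.

```python
import csv
from collections import Counter


def summarise_csv(path):
    """Return basic data quality counts for one CSV file.

    Hypothetical helper: assumes the first row is a header.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            raise ValueError(f"{path} is empty")
        rows = list(reader)

    # Rows whose field count differs from the header are likely malformed.
    ragged = sum(1 for row in rows if len(row) != len(header))

    # Count empty or whitespace-only cells per column.
    missing = Counter()
    for row in rows:
        for name, value in zip(header, row):
            if value.strip() == "":
                missing[name] += 1

    # Count rows that exactly repeat an earlier row.
    counts = Counter(tuple(row) for row in rows)
    duplicates = sum(n - 1 for n in counts.values() if n > 1)

    return {
        "rows": len(rows),
        "ragged_rows": ragged,
        "missing_by_column": dict(missing),
        "duplicate_rows": duplicates,
    }
```

Calling it on each data file, for example summarise_csv("src/test.csv") if test.csv sits directly under ./src (an assumption about the layout), gives a first indication of which files need cleaning before exercise.py can train and evaluate the model.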
Your task is to:
- correct bugs if the fixes are quick to implement
- if the fixes are not quick to implement, document how you would fix them / note what issues you have identified
- run the trained model against the test.csv data if possible; if not, document what changes would be required to do so
NB: You may recognise the dataset used in this exercise. You should use the versions of the data provided in this repository, rather than downloading from another source.