To keep the process flexible across all datasets, we do only minimal preprocessing: fill missing values (NA) with zero and, for XGBoost, convert multi-class targets to binary.
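The original code for this step is not reproduced here, so the following is only a minimal sketch of what such preprocessing could look like with pandas. The helper name `minimal_preprocess`, the toy DataFrame, and the one-vs-rest rule for binarizing the target are all assumptions made for illustration. The sketch also assumes every feature column is already numeric.

```python
import numpy as np
import pandas as pd

def minimal_preprocess(df, target_col, positive_class=None):
    """Hypothetical helper: fill missing features with 0, optionally binarize the target."""
    # Replace every NA in the feature columns with zero.
    X = df.drop(columns=[target_col]).fillna(0)
    y = df[target_col]
    if positive_class is not None:
        # Assumed one-vs-rest rule: 1 for the chosen class, 0 for all others.
        y = (y == positive_class).astype(int)
    return X.to_numpy(dtype=float), y.to_numpy()

# Toy data standing in for a real dataset (illustrative only).
toy = pd.DataFrame({
    "f1": [1.0, np.nan, 3.0, 4.0],
    "f2": [np.nan, 2.0, 2.5, 0.5],
    "label": ["a", "b", "c", "a"],
})
X, y = minimal_preprocess(toy, target_col="label", positive_class="a")
print(X)
print(y)
```

Which class counts as positive depends on the dataset, so `positive_class` is left as an explicit argument rather than guessed.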
Split the dataset into training, validation, and test sets.
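One common way to get a three-way split is to call scikit-learn's `train_test_split` twice. This is a sketch under assumptions, not necessarily the split used here: the 60/20/20 proportions, the random seeds, and the synthetic `X` and `y` are all illustrative choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 100 samples, 5 features, binary target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = rng.integers(0, 2, size=100)

# Step 1: hold out 20% of the data as the test set.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# Step 2: take 25% of the remainder (20% of the total) as the validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=42
)

print(X_train.shape, X_val.shape, X_test.shape)
```

For imbalanced targets, passing `stratify=y` in the first call and `stratify=y_trainval` in the second keeps the class proportions similar across the three sets.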
Visualize the first element in X_train
and the first target
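A quick look at both can be sketched as follows, assuming `X_train` and `y_train` are NumPy arrays. The name `y_train` and the small arrays below are placeholders for illustration, not the actual data.

```python
import numpy as np

# Placeholder arrays (assumption) standing in for the real training split.
X_train = np.array([[1.0, 0.0, 3.5],
                    [0.0, 2.0, 1.5]])
y_train = np.array([1, 0])

first_x = X_train[0]  # feature vector of the first training sample
first_y = y_train[0]  # its corresponding target

print("first sample:", first_x, "shape:", first_x.shape)
print("first target:", first_y)
```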