Module Introduction

Since this module is a continuation of last semester's Data Mining 1 module, we will start with a quick review of what you have covered to date and an overview of what we will be doing this semester.

Feature Engineering

Last semester you learnt some data preparation techniques, such as scaling and transformations. Now we will build on that by covering techniques for generating new features from existing ones.
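
As a rough illustration of deriving new features from existing ones, the sketch below adds a ratio feature and an interaction feature to each record. The records, the column names and the add_derived_features helper are all hypothetical and chosen only for this example; they are not taken from the module material.

```python
# Hypothetical records; the column names are illustrative only.
rows = [
    {"height_m": 1.80, "weight_kg": 81.0},
    {"height_m": 1.60, "weight_kg": 64.0},
]


def add_derived_features(row):
    """Return a copy of row with two features derived from existing ones."""
    out = dict(row)  # copy, so the original record is left unchanged
    # Ratio feature: body mass index = weight / height squared.
    out["bmi"] = row["weight_kg"] / row["height_m"] ** 2
    # Interaction feature: product of the two original columns.
    out["height_x_weight"] = row["height_m"] * row["weight_kg"]
    return out


engineered = [add_derived_features(r) for r in rows]
```

Each engineered record keeps its original columns and gains the two derived ones. In practice the same idea is usually applied to whole columns of a data frame rather than to one record at a time.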

Hyperparameter Tuning

Hyperparameter tuning is an essential part of the machine learning process, but it is time-consuming. We will look at the standard techniques (grid and random search) and at more advanced strategies based on Bayesian optimisation, using libraries such as hyperopt.
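
To make the difference between the two standard techniques concrete, here is a minimal sketch using only the Python standard library. The validation_score function is a toy stand-in for a cross-validated model score, and the hyperparameter names (learning_rate, depth) and their ranges are assumptions made for this example. Bayesian optimisation is not shown here.

```python
import itertools
import random


def validation_score(learning_rate, depth):
    """Toy stand-in for a cross-validated score; it peaks at (0.1, 5)."""
    return -((learning_rate - 0.1) ** 2) - 0.01 * (depth - 5) ** 2


def grid_search(grid):
    """Evaluate every combination in the grid and return the best one."""
    names = list(grid)
    candidates = [
        dict(zip(names, values)) for values in itertools.product(*grid.values())
    ]
    return max(candidates, key=lambda params: validation_score(**params))


def random_search(n_iter, seed=0):
    """Sample n_iter random configurations and return the best one."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    candidates = [
        {"learning_rate": rng.uniform(0.001, 1.0), "depth": rng.randint(1, 10)}
        for _ in range(n_iter)
    ]
    return max(candidates, key=lambda params: validation_score(**params))


best_grid = grid_search({"learning_rate": [0.01, 0.1, 1.0], "depth": [3, 5, 7]})
best_random = random_search(n_iter=20)
```

Grid search evaluates all 9 combinations here, and its cost grows multiplicatively with every hyperparameter added. Random search evaluates a fixed budget of configurations regardless of how many hyperparameters there are.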

Ensemble Learning

Next we look at techniques for combining multiple classifiers or regressors to produce a model with better predictive performance than any single model.
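
One of the simplest ways to combine classifiers is majority voting. The sketch below assumes three deliberately trivial, hypothetical classifiers (clf_a, clf_b, clf_c) that threshold a single number. It is meant only to show the combining step, not a realistic ensemble.

```python
from collections import Counter


# Three hypothetical classifiers that disagree near their thresholds.
def clf_a(x):
    return "high" if x > 5 else "low"


def clf_b(x):
    return "high" if x > 3 else "low"


def clf_c(x):
    return "high" if x > 7 else "low"


def majority_vote(classifiers, x):
    """Return the label predicted by most of the classifiers for input x."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


ensemble = [clf_a, clf_b, clf_c]
predictions = [majority_vote(ensemble, x) for x in (2, 4, 6, 8)]
```

With an odd number of voters and two classes there are no ties. For the inputs 4 and 6 the individual classifiers disagree, and the ensemble follows the majority.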

Text Mining

Our next topic is text mining, where we look at techniques to discover patterns in unstructured text.
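
A common first step is to turn unstructured text into term counts that a model can work with. The sketch below builds a small bag-of-words matrix with the standard library only. The two example documents and the tokenize helper are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical example documents.
documents = [
    "Data mining finds patterns in data.",
    "Text mining finds patterns in text.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


# One Counter of term frequencies per document.
term_counts = [Counter(tokenize(doc)) for doc in documents]

# Shared, sorted vocabulary across all documents.
vocabulary = sorted(set().union(*term_counts))

# One row per document, one column per vocabulary term.
matrix = [[counts[term] for term in vocabulary] for counts in term_counts]
```

The columns on which the two rows differ ("data" and "text") are the ones that distinguish the documents, while the shared terms carry no distinguishing information.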

eXplainable AI

This is a short topic/lecture on eXplainable AI.

Neural Networks

Deep learning - a brief look into deep neural networks.

  • Neural network basics
  • sklearn's MLP vs PyTorch

Assignments

Details and resources for module assignments.

  • Building a Dashboard — TIMSS 2023 (Ireland)
  • Kaggle style competition — TBA
  • Assignment 3 — TBC