Grab the slides:

Every Monday 5pm UK time

Big Questions

How can we compare which machine learning model is better?


Why do we choose one model over another?

How do we define accurate?


If it's good on training data, is it good on anything else?

We need metrics: measurements that determine how good the model's performance is


Common metrics:



precision and recall

F1 score



Accuracy

The percentage of correct labels


Although it provides a general measure of how good a model is, it is not always a good one

Identify terrorists trying to board flights

Anyone can provide a model with greater than 99% accuracy

By predicting that no passenger is a terrorist

In this case, accuracy is not a good measurement because:


  • The data is imbalanced
  • The stakes of making a mistake are high
  • Identifying the positive cases (the minority) is preferred
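The terrorist example above can be sketched in a few lines of plain Python. The passenger counts are made up for illustration (they are not from the slides); any heavily imbalanced split shows the same effect.

```python
# Hypothetical numbers: 10,000 passengers, 5 of whom are terrorists (label 1).
n_passengers = 10_000
n_terrorists = 5
y_true = [1] * n_terrorists + [0] * (n_passengers - n_terrorists)

# The "model": predict that nobody is a terrorist.
y_pred = [0] * n_passengers

# Accuracy: the percentage of correct labels.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / n_passengers

# How many of the actual terrorists did the model flag?
caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

print(f"accuracy = {accuracy:.2%}")
print(f"terrorists caught = {caught} of {n_terrorists}")
```

Accuracy comes out above 99% even though the model never flags a single positive case.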


Recall can be thought of as a model’s ability to find all the data points of interest in a dataset.

 While recall expresses the ability to find all relevant instances in a dataset, precision expresses the proportion of the data points our model says were relevant that actually were relevant.
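A minimal sketch of both definitions, assuming binary labels where 1 marks the relevant (positive) class. The helper name `precision_recall` and the toy labels are made up for illustration.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels; 1 is the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    # Precision: of everything flagged as relevant, how much really was?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything that really was relevant, how much was found?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy data: 4 relevant points, the model flags 3 points and 2 of them are right.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision = {precision:.3f}, recall = {recall:.3f}")
```

Returning 0.0 when a denominator is zero is a convention chosen here to avoid a division error; other conventions exist.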


F1 score


F1 score combines precision and recall into a single number: their harmonic mean
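The usual way to combine the two is the harmonic mean, F1 = 2 · precision · recall / (precision + recall). A small sketch, with the function name `f1` and the example values chosen for illustration:

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A fairly balanced model.
balanced = f1(2 / 3, 0.5)

# A lopsided model: perfect precision, but it finds only 1% of the positives.
lopsided = f1(1.0, 0.01)
arithmetic_mean = (1.0 + 0.01) / 2

print(f"balanced F1 = {balanced:.3f}")
print(f"lopsided F1 = {lopsided:.3f} (arithmetic mean would be {arithmetic_mean:.3f})")
```

The harmonic mean stays close to the smaller of the two values, so a model cannot get a high F1 by doing well on only one of them.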

Confusion Matrix
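A confusion matrix counts how often each actual class was predicted as each class. The sketch below assumes binary labels and puts actual classes in rows and predicted classes in columns; that layout is one common convention, not something the slides specify.

```python
def confusion_matrix(y_true, y_pred):
    """2x2 matrix of counts: matrix[actual][predicted], labels are 0 or 1."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


# Toy labels, made up for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

cm = confusion_matrix(y_true, y_pred)
(tn, fp), (fn, tp) = cm

print("            pred 0  pred 1")
print(f"actual 0    {tn:6d}  {fp:6d}")
print(f"actual 1    {fn:6d}  {tp:6d}")
```

Accuracy, precision, and recall can all be read off these four counts.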

ROC Curve
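An ROC curve plots the true positive rate against the false positive rate as the decision threshold on the model's score is lowered. The sketch below computes the curve's points and the area under it by hand; the function names and the four example scores are made up for illustration, and it assumes both classes are present.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(t == 1 and s >= threshold for t, s in zip(y_true, scores))
        fp = sum(t == 0 and s >= threshold for t, s in zip(y_true, scores))
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]  # model's estimated probability of class 1

points = roc_points(y_true, scores)
area = auc(points)
print(points)
print(f"AUC = {area}")
```

A model that ranks every positive above every negative reaches an area of 1.0, while random scores give about 0.5.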

Over-fitting vs Under-fitting

Under-fitting: low training score, low testing score

Over-fitting: high training score, low testing score

Good fit: high training score, high testing score
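The three score combinations above can be turned into a rough rule of thumb. In this sketch the function name and both thresholds (`good` and `max_gap`) are hypothetical values picked for illustration; what counts as a "high" score depends on the problem.

```python
def diagnose_fit(train_score, test_score, good=0.8, max_gap=0.1):
    """Rough heuristic; `good` and `max_gap` are illustrative, not standard values."""
    if train_score < good:
        # The model cannot even fit the data it was trained on.
        return "under-fitting"
    if train_score - test_score > max_gap:
        # Good on training data, much worse on unseen data.
        return "over-fitting"
    return "good fit"


print(diagnose_fit(0.55, 0.52))  # low training score, low testing score
print(diagnose_fit(0.99, 0.60))  # high training score, low testing score
print(diagnose_fit(0.92, 0.90))  # high training score, high testing score
```

This is why scores are reported on data the model has not seen during training: the training score alone cannot tell over-fitting apart from a good fit.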

Next week:

Boosting Algorithms

Every Monday 5pm UK time

The Legend of Data - Comparing models

By Cheuk Ting Ho
