Support Vector Machines (SVM)
Kernel Methods
Kernel Ridge Regression
Bayesian Probability Theory
Bayesian Networks
Naive Bayes
Perceptrons
Neural Networks
Principal Component Analysis
Dimensionality Reduction
Learning Theory
RBF Networks
Collaborative Filtering
Regression
GA-QMML: Prediction of Molecular Properties (QM) through Genetic Algorithm (GA) Optimisation and Non-Linear Kernel Ridge Regression (ML)
Ridge Regression
Machine Learning introduction
Types of Learning
Supervised, Semi-supervised, Unsupervised, Reinforcement
ML Applications
Classification, Regression, Clustering, Recommender Systems, Embedding
Theoretical Introduction: Regression
Bias-Variance trade-off, Overfitting, Regularization
“A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.” -- Tom Mitchell, Carnegie Mellon University
Supervised
Unsupervised
Semi-supervised
Reinforcement Learning
From data to discrete classes
Predicting numeric value
Discovering structure in data
Natural document clusters of bio-medical research
Finding what a user might want
Visualising data
Article classification via t-Distributed Stochastic Neighbor Embedding (t-SNE)
Word classification via t-SNE
Picture classification via t-SNE
LinkedIn social graph. Blue: cloud computing, green: big data, dark orange: co-workers, light orange: law school, purple: former employer
"Naive" conditional independance assumption
Supervised Learning:
Goal
Simple models may not fit the data
Complex models may not generalise to new, as yet unseen data
Choice of hypothesis class introduces learning bias
More complex class: less bias, but more variance
Error can be decomposed:
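A standard form of the decomposition (a sketch filled in here; \hat{f} is the learned predictor and \sigma^2 the irreducible noise):

  \mathbb{E}\big[(y - \hat{f}(x))^2\big] = \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2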
Training Set Error:
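A common definition (a sketch; a linear model with weights w over inputs x_i is assumed):

  \mathrm{error}_{\mathrm{train}}(w) = \frac{1}{N_{\mathrm{train}}} \sum_{i=1}^{N_{\mathrm{train}}} \big(y_i - w^\top x_i\big)^2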
Prediction Error:
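The prediction (true) error is the expected loss over the data distribution p, not just the training sample (notation assumed):

  \mathrm{error}_{\mathrm{true}}(w) = \mathbb{E}_{(x,y) \sim p}\big[(y - w^\top x)^2\big]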
Why doesn't training error approximate prediction error?
Training error is a good estimate of prediction error for a single, fixed w; but here w was optimised with respect to the training data, so it was selected precisely because it performs well on this particular set of samples
Test Set Error
Given dataset D, randomly split it into two parts: training data and test data
Use Training Data to optimise w
For the final output w', evaluate error once using:
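i.e. the average squared error over the held-out test points (a sketch; the same linear model is assumed):

  \mathrm{error}_{\mathrm{test}}(w') = \frac{1}{N_{\mathrm{test}}} \sum_{i \in \mathrm{test}} \big(y_i - w'^\top x_i\big)^2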
A learning algorithm overfits the training data if it outputs a solution w' when there exists another solution w'' such that:
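That is (a standard formulation of the two conditions):

  \mathrm{error}_{\mathrm{train}}(w') < \mathrm{error}_{\mathrm{train}}(w'')  (w' is better on the training data)
  \mathrm{error}_{\mathrm{true}}(w') > \mathrm{error}_{\mathrm{true}}(w'')   (but w' is worse in true/prediction error)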
Overfitting typically leads to very large parameter values (weights)
Regularised regression aims to impose a complexity restriction by penalising large weights
Overfit, unregularised solutions look better on the training data, but do worse on unseen test data
Ridge Regression
Lasso Regression
Larger λ: more penalty, smoother function, more bias
Smaller λ: more flexible fit, more variance
(both penalised objectives are written out below)
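For reference, the two penalised objectives sketched out (standard formulations; λ is the regularisation strength):

  \text{Ridge:}\;\; \hat{w} = \arg\min_w \sum_i \big(y_i - w^\top x_i\big)^2 + \lambda \|w\|_2^2
  \text{Lasso:}\;\; \hat{w} = \arg\min_w \sum_i \big(y_i - w^\top x_i\big)^2 + \lambda \|w\|_1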
Randomly divide the data into k equal parts, D_1, ..., D_k
For each i: learn classifier f_i using the data not in D_i
Estimate the error of f_i on the validation set D_i
Report the average of the k error estimates
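A minimal runnable sketch of this k-fold procedure in Python/NumPy, using the ridge objective from above to pick λ by cross-validation (all function names and the example data here are illustrative assumptions, not from the lecture):

import numpy as np

def ridge_fit(X, y, lam):
    # Closed-form ridge solution: w = (X^T X + lam*I)^{-1} X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def kfold_cv_error(X, y, lam, k=5, seed=0):
    # Average validation MSE over k folds
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)  # k roughly equal parts D_1, ..., D_k
    errors = []
    for i in range(k):
        val = folds[i]  # validation set D_i
        train = np.concatenate([folds[j] for j in range(k) if j != i])  # data not in D_i
        w = ridge_fit(X[train], y[train], lam)  # learn on D \ D_i
        errors.append(np.mean((y[val] - X[val] @ w) ** 2))  # estimate error on D_i
    return np.mean(errors)

# Usage: choose the lambda with the lowest cross-validation error.
X = np.random.randn(100, 5)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * np.random.randn(100)
best_lam = min([0.01, 0.1, 1.0, 10.0], key=lambda lam: kfold_cv_error(X, y, lam))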