Data Science 101
Intro
Outline
- First approach
- Course outline
- Linear Algebra
- Next
- Bonus (?)
Data Science
First approach
Data Science = Machine Learning in practice
Data Science = Data cleaning + data transformation + data processing + data engineering + machine learning + data visualisation
[...]
Machine Learning definition
- Arthur Samuel (1959). Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.
- Tom Mitchell (1998). Well-posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
“A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.”
Suppose your email program watches which emails you do or do not mark as spam, and based on that learns how to better filter spam. What is the task T in this setting?
- Classifying emails as spam or not spam. => T
- Watching you label emails as spam or not spam. => E
- The number (or fraction) of emails correctly classified as spam/not spam. => P
Machine learning algorithms:
- Supervised learning
- Unsupervised learning
Others: Reinforcement learning, recommender systems.
Also talk about: Practical advice for applying learning algorithms
Supervised Learning
House price prediction based on size

Supervised learning: Right answers given
Regression: Predict continuous value output (price)
Example:
A house of 65 m² sold for 440k
Supervised Learning
House price prediction

For a new house of 30 m², we would predict a price of 270k
Supervised Learning
House price prediction
With a different model fitted to the same data, we would predict a price of 350k for the same 30 m² house

Supervised Learning
Training set → Learning algorithm → h
Size of the house → h → Estimated price
Hypothesis: h maps the size of a house to its price
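The pipeline above (training set, learning algorithm, hypothesis h) can be sketched in a few lines of NumPy. This is only an illustration: the training data below is made up and is not the data behind the slides, and a least-squares straight line via `np.polyfit` is assumed as the learning algorithm.

```python
import numpy as np

# Made-up training set (NOT the slides' data): size in m², price in k.
sizes = np.array([20.0, 40.0, 60.0, 80.0])
prices = np.array([200.0, 300.0, 400.0, 500.0])

# "Learning algorithm": least-squares fit of a straight line,
# price ≈ slope * size + intercept.
slope, intercept = np.polyfit(sizes, prices, deg=1)

def h(size):
    """Hypothesis: map the size of a house to an estimated price."""
    return slope * size + intercept

# Estimated price for a new, unseen house of 30 m².
print(h(30.0))
```

Changing `deg` to 2 would fit a curve instead of a line, which is one way the same house can end up with a different predicted price.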
Supervised Learning
Breast cancer: Is a tumor malignant (1) or not (0)?
[Figure: malignant (1) or not (0), plotted against tumor size]
Classification: Discrete valued output
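To contrast with regression, here is a minimal sketch of a classifier: the output is one of two discrete labels rather than a continuous value. The function name `classify` and the threshold value are hypothetical and chosen only for illustration; they are not a medical rule and do not come from the slides.

```python
def classify(tumor_size, threshold=3.0):
    """Return 1 (malignant) if the size exceeds the threshold, else 0.

    The threshold is an arbitrary illustrative value; a real classifier
    would learn its decision boundary from labelled examples.
    """
    return 1 if tumor_size > threshold else 0

# The output is always one of the two discrete labels, 0 or 1.
print(classify(1.0), classify(5.0))
```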
Unsupervised Learning
[Figure: data plotted against tumor size, classes 1 and 0]
Supervised:
Labelled history to learn from
Unsupervised:
Unlabelled data
Unsupervised Learning

Learning from the data. Here, clustering data together.
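As a rough sketch of what "clustering data together" can mean, the snippet below groups unlabelled 1-D points into two clusters with a tiny k-means loop. The helper `two_means_1d`, the initialisation at the min and max of the data, and the data itself are assumptions made for this example.

```python
import numpy as np

def two_means_1d(points, n_iter=10):
    """Tiny 1-D k-means with k = 2; centers start at the min and max."""
    points = np.asarray(points, dtype=float)
    centers = np.array([points.min(), points.max()])
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        distances = np.abs(points[:, None] - centers[None, :])
        labels = np.argmin(distances, axis=1)
        # Move each center to the mean of the points assigned to it.
        for k in range(2):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean()
    return labels, centers

# Made-up, unlabelled measurements forming two visible groups.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
labels, centers = two_means_1d(data)
print(labels, centers)
```

No labels are given to the algorithm: the two groups are found from the data alone.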
Unsupervised Learning
In real life

Statistics
Squeeze your data into a box and I will solve the problem perfectly.
Machine Learning
Give me your data and I will do my best.
Course outline
Course outline
- Linear regression with one variable (how to draw a line)
- Linear regression with multiple variables (how to draw a curved line)
- Logistic Regression (how to take a decision based on a line)
- Practical ML (how to draw lines in real life)
- Other models (Let's stop with the lines)
- SVM
- Decision Tree, Ensemble
- Boosting (AdaBoost)
- ...
=> data science competition
Course outline
- Unsupervised learning (how to draw circles)
- Theory
- PCA / SVD
- KNN
- Mixture of Gaussian / EM
- Hierarchical Clustering
- Deep Learning (How to draw lines: The Return)
Course outline
- The rest should be decided together, choosing among:
- More Deep Learning
- Reinforcement
- Recommender
- NLP
- Data science for competition
- Computer Vision
- ...
Linear Algebra
Basics
Linear Algebra
4 x 2 matrix
Dimension of matrix: number of rows x number of columns
Linear Algebra
\( A_{i,j} \) = "\( i \),\( j \) entry" in the \( i^{th} \) row, \( j^{th} \) column
\( A_{1,1} \) = 1402
\( A_{3,1} \) = 1639
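In NumPy, the dimension and entries of a matrix can be inspected as sketched below. Only the two entries quoted above (1402 and 1639) come from the slides; the matrix is assumed to be the 4 x 2 one mentioned earlier, and every other value is a placeholder. Note that NumPy indices start at 0, so \( A_{i,j} \) is `A[i - 1, j - 1]`.

```python
import numpy as np

# Hypothetical 4 x 2 matrix: only A[0, 0] = 1402 and A[2, 0] = 1639
# are taken from the slides, the other entries are placeholders.
A = np.array([
    [1402, 10],
    [20, 30],
    [1639, 40],
    [50, 60],
])

print(A.shape)   # (number of rows, number of columns)
print(A[0, 0])   # A_{1,1}
print(A[2, 0])   # A_{3,1}
```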
Linear Algebra
Vector = \( n \times 1 \) matrix (= list)
\( y \in \mathbb{R}^{4} \)
\( y_{i} \) = \( i^{th} \) element
Linear Algebra
Addition and scalar multiplication
Be careful to have the same dimension
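A short NumPy sketch of both operations, using made-up matrices. Addition works entry by entry and needs matching dimensions; scalar multiplication scales every entry.

```python
import numpy as np

A = np.array([[1, 0], [2, 5], [3, 1]])      # 3 x 2
B = np.array([[4, 0.5], [2, 5], [0, 1]])    # 3 x 2, same dimension as A

print(A + B)   # element-wise sum, still 3 x 2
print(3 * A)   # every entry multiplied by the scalar 3

C = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # 3 x 3
try:
    A + C      # shapes (3, 2) and (3, 3) do not match
except ValueError as err:
    print("cannot add:", err)
```

One caveat: NumPy broadcasting silently accepts some mismatched shapes (for example a 3 x 2 matrix plus a vector of length 2), so checking `.shape` explicitly is a reasonable habit.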
Linear Algebra
Matrix - vector multiplication
\( \mathbb{R}^{3 \times 2} \times \mathbb{R}^{2 \times 1} \rightarrow \mathbb{R}^{3 \times 1} \)
Linear Algebra
Matrix - vector multiplication
\( A \in \mathbb{R}^{m \times n} \)
\( x \in \mathbb{R}^{n \times 1} \)
\( y = A \times x \in \mathbb{R}^{m \times 1} \)
To get \( y_i \), multiply the elements of \( A \)'s \( i^{th} \) row with the corresponding elements of vector \( x \), and add them up: \( y_i = \sum_{j=1}^{n} A_{i,j} \, x_j \).
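The rule above can be written out as two loops and compared with NumPy's built-in `@` operator. The helper name `mat_vec` and the numbers are made up for this sketch; the vector is stored as a 1-D array of length n rather than an explicit n x 1 matrix.

```python
import numpy as np

def mat_vec(A, x):
    """Compute y = A x row by row: y[i] = sum over j of A[i, j] * x[j]."""
    m, n = A.shape
    y = np.zeros(m)
    for i in range(m):
        for j in range(n):
            y[i] += A[i, j] * x[j]
    return y

A = np.array([[1, 3], [4, 0], [2, 1]])   # 3 x 2
x = np.array([1, 5])                     # 2 elements

print(mat_vec(A, x))   # explicit loops
print(A @ x)           # same product computed by NumPy
```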
Linear Algebra
Matrix - Matrix multiplication
Linear Algebra
Matrix - Matrix multiplication
\( A \in \mathbb{R}^{m \times n} \)
\( B \in \mathbb{R}^{n \times o} \)
\( C = A \times B \in \mathbb{R}^{m \times o} \)
The \( i^{th} \) column of matrix \( C \) is obtained by multiplying \( A \) with the \( i^{th} \) column of \( B \) (for \( i = 1,2,...,o \))
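The column-by-column view of the product can be sketched as follows: each column of C is a matrix-vector product of A with one column of B. The helper name `mat_mul_by_columns` and the example matrices are illustrative only.

```python
import numpy as np

def mat_mul_by_columns(A, B):
    """Build C = A B one column at a time: C[:, i] = A @ B[:, i]."""
    m, n = A.shape
    n_b, o = B.shape
    if n != n_b:
        raise ValueError("columns of A must equal rows of B")
    C = np.zeros((m, o))
    for i in range(o):
        C[:, i] = A @ B[:, i]
    return C

A = np.array([[1, 3, 2], [4, 0, 1]])     # 2 x 3
B = np.array([[1, 3], [0, 1], [5, 2]])   # 3 x 2

print(mat_mul_by_columns(A, B))   # 2 x 2 result
print(A @ B)                      # same product computed by NumPy
```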
Linear Algebra
Matrix multiplication: some properties
Given \( A \in \mathbb{R}^{m \times n} \), \( B \in \mathbb{R}^{n \times o} \) and \( C \in \mathbb{R}^{o \times p} \)
In general, \( A \times B \neq B \times A \) (not commutative)
\( (A \times B) \times C = A \times (B \times C) \) (associative)
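Both properties can be checked numerically on small square matrices. The matrices below are made up for this sketch; a numerical check illustrates the properties, it does not prove them.

```python
import numpy as np

A = np.array([[1, 1], [0, 0]])
B = np.array([[0, 0], [2, 0]])
C = np.array([[1, 2], [3, 4]])

# Not commutative: the two products differ for this pair.
print(A @ B)
print(B @ A)
print(np.array_equal(A @ B, B @ A))

# Associative: the grouping of the products does not matter.
print(np.array_equal((A @ B) @ C, A @ (B @ C)))
```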
Linear Algebra
Identity Matrix
Denoted \( I \) (or \( I_{n \times n} \) or \( I_n \))
Examples:
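For any matrix \( A \in \mathbb{R}^{m \times n} \),
\( A \times I_n = I_m \times A = A \) (the two identity matrices have different sizes unless \( A \) is square)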
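A quick NumPy check of the identity property on a non-square matrix, where the identity on the right and the one on the left must have different sizes. The example matrix is made up.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])   # 2 x 3

I2 = np.eye(2)   # 2 x 2 identity, multiplies A from the left
I3 = np.eye(3)   # 3 x 3 identity, multiplies A from the right

print(np.array_equal(A @ I3, A))
print(np.array_equal(I2 @ A, A))
```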
Linear Algebra
Transpose
Example:
Definition:
Let \( A \) be an \( n \times m \) matrix, and let \( B = A^T \).
Then \( B \) is an \( m \times n \) matrix and \( B_{i,j} = A_{j,i} \)
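The definition can be verified entry by entry in NumPy, where the transpose is available as `.T`. The example matrix is made up for this sketch.

```python
import numpy as np

A = np.array([[1, 2, 0], [3, 5, 9]])   # 2 x 3
B = A.T                                # 3 x 2

print(A.shape, B.shape)

# Check B[i, j] == A[j, i] for every entry.
all_match = all(
    B[i, j] == A[j, i]
    for i in range(B.shape[0])
    for j in range(B.shape[1])
)
print(all_match)
```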
The words printed here are concepts. You must go through the experiences.
– Carl Frederick
What you have to do
Computer Science
Practice
Do at least the exercises in 1-Intro-Python.pdf, 2-Numpy.pdf and 5-BDD.pdf
If that goes well, also do 3-Scipy.pdf and 4-Projet-Climat.pdf
Bonus: Some statistical boxes
Bayes' Theorem
For a hypothesis H and an event E
\( P(H \mid E) \) means "Probability of H given E"
Bayes' Theorem
Given H = Vincent killed his wife
Given E = Vincent's fingerprints are on the murder weapon (a knife)
$$ P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)} $$
\( P(E \mid H) \) = Probability that his fingerprints are on the knife if Vincent actually killed his wife
\( P(H) \) = Prior probability that Vincent killed his wife, before looking at the fingerprints (e.g. motive)
\( P(E) \) = Probability that his fingerprints are on the knife, whether or not he is the killer
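A minimal numeric sketch of the formula. The function name `posterior` and all three probabilities are invented purely to show the arithmetic; they are not part of the slides.

```python
def posterior(p_e_given_h, p_h, p_e):
    """Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    if not 0 < p_e <= 1:
        raise ValueError("P(E) must be in (0, 1]")
    return p_e_given_h * p_h / p_e

# Invented numbers, for illustration only.
p_h = 0.01           # prior: P(H)
p_e_given_h = 0.9    # P(E | H)
p_e = 0.1            # P(E)

print(posterior(p_e_given_h, p_h, p_e))
```

With these made-up values the evidence multiplies the prior by \( P(E \mid H) / P(E) = 9 \), which shows how strongly a piece of evidence can shift a small prior.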
Linear Algebra
Inverse
Data Science 101.0
By ycarbonne