Data Science 101

Intro

Outline

  1. First approach
  2. Course outline
  3. Linear Algebra
  4. Next
  5. Bonus (?)

Data Science

First approach

Data Science = Machine Learning in practice

 

Data Science = Data cleaning + data transformation + data processing + data engineering + machine learning + data visualisation

[...]

Machine Learning definition

  • Arthur Samuel (1959). Machine Learning: field of study that gives computers the ability to learn without being explicitly programmed.
  • Tom Mitchell (1998). Well-posed learning problem: a computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

“A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.”

Suppose your email program watches which emails you do or do not mark as spam, and based on that learns how to better filter spam. What is the task T in this setting?

  • Classifying emails as spam or not spam.         => T
  • Watching you label emails as spam or not spam.         => E
  • The number (or fraction) of emails correctly classified as spam/not spam.       => P

Machine learning algorithms:

  • Supervised learning
  • Unsupervised learning

 

Others: Reinforcement learning, recommender systems.

 

We will also cover: practical advice for applying learning algorithms

Supervised Learning

House price prediction based on size

Supervised learning: Right answers given

Regression: Predict continuous value output (price)

Example:

A house of 65 m2 sold for 440k.

Supervised Learning

House price prediction

For a new house of 30 m2, we would predict a price of 270k

Supervised Learning

House price prediction

With a different fitted line, the same 30 m2 house would be predicted at 350k

Supervised Learning

 

[Diagram] Training set → Learning algorithm → hypothesis h

The hypothesis h maps the size of the house to an estimated price.
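As a minimal sketch of this pipeline, using NumPy and made-up training data (chosen so that a 65 m2 house sells for 440k, as above), a straight-line hypothesis can be fitted by least squares:

```python
import numpy as np

# Toy training set (invented numbers for illustration):
# house sizes in m^2 and sale prices in thousands.
sizes = np.array([25.0, 40.0, 50.0, 65.0, 80.0])
prices = np.array([200.0, 290.0, 350.0, 440.0, 530.0])

# Fit the hypothesis h(x) = theta0 + theta1 * x by least squares.
# np.polyfit returns the coefficients highest degree first.
theta1, theta0 = np.polyfit(sizes, prices, deg=1)

def h(size_m2):
    """Hypothesis: maps the size of a house to an estimated price."""
    return theta0 + theta1 * size_m2

print(round(h(65.0)))  # 440, matching the training example
```

Once fitted, `h` can be evaluated on any new house size, which is exactly what the prediction slides do.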

Supervised Learning

Breast cancer: Is a tumor malignant (1) or not (0)?

[Plot: tumor size on the x-axis; label on the y-axis, malignant = 1, benign = 0]

Classification: Discrete-valued output
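A hedged sketch of such a classifier: a small logistic regression trained by gradient descent on invented tumor-size data, predicting 1 when the estimated probability of malignancy exceeds 0.5.

```python
import numpy as np

# Toy data (invented): tumor sizes and labels, 1 = malignant, 0 = benign.
x = np.array([1.0, 1.5, 2.0, 2.5, 4.0, 4.5, 5.0, 6.0])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Logistic regression: p(y=1|x) = sigmoid(w*x + b), fitted by gradient descent.
w, b = 0.0, 0.0
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # predicted probabilities
    w -= 0.1 * np.mean((p - y) * x)         # gradient of the log loss w.r.t. w
    b -= 0.1 * np.mean(p - y)               # gradient w.r.t. b

def predict(size):
    """Discrete-valued output: 1 (malignant) or 0 (benign)."""
    p = 1.0 / (1.0 + np.exp(-(w * size + b)))
    return 1 if p > 0.5 else 0

print(predict(1.2), predict(5.5))  # 0 (benign), 1 (malignant)
```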

Unsupervised Learning

[Same plot, with the malignant/benign labels removed]

Supervised: labelled history to learn from

Unsupervised: unlabelled data

Unsupervised Learning

Learning structure from the data alone. Here, clustering similar data points together.
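As an illustration, a minimal k-means clustering sketch on invented, unlabelled 2-D data (two well-separated blobs):

```python
import numpy as np

rng = np.random.default_rng(0)

# Unlabelled toy data: two blobs in 2-D, centred near (0,0) and (5,5).
X = np.vstack([rng.normal(0, 0.5, (50, 2)),
               rng.normal(5, 0.5, (50, 2))])

# k-means with k=2: alternate point assignment and centroid update.
centroids = X[rng.choice(len(X), 2, replace=False)]
for _ in range(20):
    # assign each point to its nearest centroid
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    # move each centroid to the mean of its assigned points
    centroids = np.array([X[labels == k].mean(axis=0) for k in range(2)])

print(np.sort(centroids[:, 0]))  # two centroids, one near 0 and one near 5
```

No labels are used anywhere: the algorithm discovers the two groups from the data itself.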

Unsupervised Learning

In real life

Statistics

"Squeeze your data into a box, and I will resolve the problem perfectly."

Machine Learning

"Give me your data, and I will do my best."

Course outline

Course outline

 

 

  • Linear regression with one variable (how to draw a line)
  • Linear regression with multiple variables (how to draw a curved line)
  • Logistic regression (how to take a decision based on a line)
  • Practical ML (how to draw lines in real life)
  • Other models (let's stop with the lines)
    • SVM
    • Decision Tree, Ensemble
    • Boosting (AdaBoost)
    • ...

=> data science competition

Course outline

 

 

  • Unsupervised learning (how to draw circles)
    • Theory
    • PCA / SVD
    • KNN
    • Mixture of Gaussians / EM
    • Hierarchical Clustering
  • Deep Learning (How to draw lines: The Return)

Course outline

 

 

  • The rest will be decided together, choosing among:
    • More Deep Learning
    • Reinforcement
    • Recommender
    • NLP
    • Data science for competition
    • Computer Vision
    • ...

Linear Algebra

Basics

Linear Algebra

\begin{bmatrix} 1402 & 901 \\ 1379 & 843 \\ 1639 & 973 \\ 1103 & 789 \end{bmatrix}

4 x 2 matrix

 

 

Dimension of matrix: number of rows x number of columns 

\mathbb{R^{4\times2}}

Linear Algebra

A = \begin{bmatrix} 1402 & 901 \\ 1379 & 843 \\ 1639 & 973 \\ 1103 & 789 \end{bmatrix}

\( A_{i,j} \) = "\( i \),\( j \) entry" in the \( i^{th} \) row, \( j^{th} \) column

\( A_{1,1} \) = 1402

\( A_{3,1} \) = 1639
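The same matrix in NumPy; note that NumPy indexes from 0, so the slides' \( A_{1,1} \) becomes `A[0, 0]`:

```python
import numpy as np

A = np.array([[1402, 901],
              [1379, 843],
              [1639, 973],
              [1103, 789]])

print(A.shape)  # (4, 2): 4 rows x 2 columns

# NumPy is 0-indexed: the slides' A_{1,1} is A[0, 0], A_{3,1} is A[2, 0].
print(A[0, 0])  # 1402
print(A[2, 0])  # 1639
```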

Linear Algebra

Vector = \( n \times 1 \) matrix (= list)

y = \begin{bmatrix} 1402 \\ 1379 \\ 1639 \\ 1103 \end{bmatrix}

\( y \in \mathbb{R^{4}} \)

\( y_{i} \) =  \( i^{th} \) element

Linear Algebra

Addition and scalar multiplication

\begin{bmatrix} 1 & 0 \\ 2 & 5 \\ 3 & 1 \end{bmatrix} + \begin{bmatrix} 4 & 0.5 \\ 2 & 5 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 5 & 0.5 \\ 4 & 10 \\ 3 & 2 \end{bmatrix}
3 \times \begin{bmatrix} 1 & 0 \\ 2 & 5 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 6 & 15 \\ 9 & 3 \end{bmatrix}

Be careful: both matrices must have the same dimensions.
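The same two operations in NumPy, reproducing the examples above:

```python
import numpy as np

M = np.array([[1.0, 0.0], [2.0, 5.0], [3.0, 1.0]])
N = np.array([[4.0, 0.5], [2.0, 5.0], [0.0, 1.0]])

print(M + N)  # elementwise sum: [[5, 0.5], [4, 10], [3, 2]]
print(3 * M)  # every entry scaled: [[3, 0], [6, 15], [9, 3]]

# Adding matrices of different dimensions raises an error
# (unless NumPy can broadcast the shapes).
```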

Linear Algebra

Matrix - vector multiplication

\begin{bmatrix} 1 & 3 \\ 4 & 0 \\ 2 & 1 \end{bmatrix} \times \begin{bmatrix} 1 \\ 5 \end{bmatrix} = \begin{bmatrix} 16 \\ 4 \\ 7 \end{bmatrix}

\( \mathbb{R^{3\times2}} \)                 \( \mathbb{R^{2\times1}} \)              \( \mathbb{R^{3\times1}} \)

Row by row:

1 \times 1 + 3 \times 5 = 16

4 \times 1 + 0 \times 5 = 4

2 \times 1 + 1 \times 5 = 7

Linear Algebra

Matrix - vector multiplication

\begin{bmatrix} . & . & . & . \\ . & . & . & . \\ . & . & . & . \\ . & . & . & . \end{bmatrix} \times \begin{bmatrix} . \\ . \\ . \end{bmatrix} = \begin{bmatrix} . \\ . \\ . \\ . \end{bmatrix}

\( A \in \mathbb{R^{m \times n}} \)

\( x \in \mathbb{R^{n \times 1}} \)

\( y \in \mathbb{R^{m \times 1}} \)

To get \( y_i \), multiply \( A \)'s \( i^{th} \) row with elements of vector \( x \), and add them up.
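The rule above, checked in NumPy against the earlier example:

```python
import numpy as np

A = np.array([[1, 3], [4, 0], [2, 1]])  # 3 x 2
x = np.array([1, 5])                    # 2 x 1 (a vector)

y = A @ x  # each y_i is the dot product of A's i-th row with x
print(y)   # [16  4  7]
```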

Linear Algebra

Matrix - Matrix multiplication

\begin{bmatrix} 1 & 3 & 2 \\ 4 & 0 & 1 \end{bmatrix} \times \begin{bmatrix} 1 & 3 \\ 0 & 1 \\ 5 & 2 \end{bmatrix} = \begin{bmatrix} 11 & 10 \\ 9 & 14 \end{bmatrix}
\begin{bmatrix} 1 & 3 & 2 \\ 4 & 0 & 1 \end{bmatrix} \times \begin{bmatrix} 1 \\ 0 \\ 5 \end{bmatrix} = \begin{bmatrix} 11 \\ 9 \end{bmatrix}
\begin{bmatrix} 1 & 3 & 2 \\ 4 & 0 & 1 \end{bmatrix} \times \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 10 \\ 14 \end{bmatrix}

Linear Algebra

Matrix - Matrix multiplication

\begin{bmatrix} . & . & . & . \\ . & . & . & . \\ . & . & . & . \end{bmatrix} \times \begin{bmatrix} . & . & . \\ . & . & . \\ . & . & . \\ . & . & . \end{bmatrix} = \begin{bmatrix} . & . & . \\ . & . & . \\ . & . & . \end{bmatrix}

\( A \in \mathbb{R^{m \times n}} \)

The \( i^{th} \) column of matrix \( C \) is obtained by multiplying \( A \) with the \( i^{th} \) column of \( B \) (for \( i = 1,2,...,o \) )

\( B \in \mathbb{R^{n \times o}} \)

\( C \in \mathbb{R^{m \times o}} \)
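Checking both the full product and the column-by-column view in NumPy:

```python
import numpy as np

A = np.array([[1, 3, 2], [4, 0, 1]])    # 2 x 3
B = np.array([[1, 3], [0, 1], [5, 2]])  # 3 x 2

C = A @ B
print(C)  # [[11 10] [ 9 14]]

# Column i of C is A times column i of B:
print(A @ B[:, 0])  # [11  9], the first column of C
```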

Linear Algebra

Matrix multiplication: some properties

Given \( A \in \mathbb{R^{m \times n}} \), \( B \in \mathbb{R^{n \times o}} \) and \( C \in \mathbb{R^{o \times p}} \)

 

In general, \( A \times B \neq B \times A \) (not commutative)

 

\( (A \times B) \times C = A \times (B \times C) \) (associative)
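A quick numerical check of both properties, with small matrices chosen purely for illustration:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[1, 1], [0, 1]])

print(np.array_equal(A @ B, B @ A))              # False: order matters
print(np.array_equal((A @ B) @ C, A @ (B @ C)))  # True: grouping does not
```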

Linear Algebra

Identity Matrix

Denoted \( I \) (or \( I_{n \times n} \) or \( I_n \))

Examples:

I_{2 \times 2} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
I_{3 \times 3} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
I_{4 \times 4} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}

For any matrix \( A \), with identity matrices of compatible dimensions,

\( A \times I = I \times A = A \)
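In NumPy; note that a non-square \( A \) needs identity matrices of two different sizes:

```python
import numpy as np

A = np.array([[1, 3, 2], [4, 0, 1]])  # 2 x 3

I2 = np.eye(2, dtype=int)  # matches A's row count on the left
I3 = np.eye(3, dtype=int)  # matches A's column count on the right

print(np.array_equal(I2 @ A, A))  # True
print(np.array_equal(A @ I3, A))  # True
```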

Linear Algebra

Transpose

Example:

A = \begin{bmatrix} 1 & 3 & 2 \\ 4 & 0 & 1 \end{bmatrix}
B = A^T = \begin{bmatrix} 1 & 4 \\ 3 & 0 \\ 2 & 1 \end{bmatrix}

Definition:

Let \( A \) be an \( n \times m \) matrix, and let \( B = A^T \).

Then \( B \) is an \( m \times n \) matrix and \( B_{i,j} = A_{j,i} \)
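A quick NumPy check of the definition, using the example matrix:

```python
import numpy as np

A = np.array([[1, 3, 2], [4, 0, 1]])
B = A.T

print(B.shape)           # (3, 2): rows and columns swap
print(B[2, 0], A[0, 2])  # B_{i,j} = A_{j,i}: both print 2
```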

The words printed here are concepts. You must go through the experiences.

– Carl Frederick

What you have to do

Practice

First steps

Do at least the exercises in 1-Intro-Python.pdf, 2-Numpy.pdf and 5-BDD.pdf

If you are doing well, also do 3-Scipy.pdf and 4-Projet-Climat.pdf

Bonus: Some statistical boxes

Bayes' Theorem

For a hypothesis H and an event E

P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)}

\( P(H \mid E) \) means "Probability of H given E"

Bayes' Theorem

Let H = "Vincent killed his wife"

Let E = "Vincent's fingerprints are on the murder knife"

 

$$ P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)} $$

 

\( P(E \mid H) \) = probability of finding his fingerprints on the murder knife given that Vincent actually killed his wife

\( P(H) \) = prior probability that Vincent killed his wife (e.g. based on motive)

\( P(E) \) = overall probability of his fingerprints being on the knife
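Plugging made-up numbers into the formula (purely illustrative; these probabilities are not from the slides):

```python
# Invented numbers, purely to illustrate Bayes' theorem.
p_h = 0.10          # P(H): prior that Vincent killed his wife
p_e_given_h = 0.95  # P(E|H): fingerprints on the knife if he did it
p_e = 0.20          # P(E): fingerprints on the knife overall

p_h_given_e = p_e_given_h * p_h / p_e  # Bayes' theorem
print(round(p_h_given_e, 3))  # 0.475
```

The evidence raises the probability of guilt from the 10% prior to about 47.5%, but does not prove it.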

Linear Algebra

Inverse

Data Science 101.0

By ycarbonne
