Image captioning
Image generation
Profile customers by creating characteristic groups
Data type
Output
Data processing
Clustering
algorithms
Association rule
Personal data
Data from a product/service use
Global customer tendencies
Classify customers into characteristic groups
...
Profile customers by creating characteristic groups
Data
Results
Contract & calls data from customers of a phone company
K-Means
Clusters of customers
data preparation
Exploration of clusters specificity
Data processing
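The workflow above (data preparation, K-Means, cluster exploration) can be sketched as follows. This is a minimal illustration on synthetic data: the features (monthly calls, contract length, average bill) are invented stand-ins for real contract & call records.

```python
# Hypothetical sketch: customer profiling with K-Means on synthetic data.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Toy data: three latent customer profiles (calls/month, contract months, avg bill)
X = np.vstack([
    rng.normal([20, 12, 30], 5, size=(50, 3)),   # light users
    rng.normal([80, 24, 60], 5, size=(50, 3)),   # heavy, long-contract users
    rng.normal([40, 6, 90], 5, size=(50, 3)),    # short-contract, high-bill users
])

X_scaled = StandardScaler().fit_transform(X)      # data preparation
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_scaled)
print(np.bincount(kmeans.labels_))                # cluster sizes, to explore specificity
```

Exploring each cluster's feature means (e.g. with `X[kmeans.labels_ == k].mean(axis=0)`) is one way to characterize the groups found.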
Predict clients that will unsubscribe
Data type
Output
Classification
algorithms
Personal data
Data from a product/service use
Predict if a new client will churn
...
Data processing
Data
Results
Contract & calls data from customers of a phone company
Random Forest
Performance metrics
data preparation
Data processing
ROC curve
confusion matrix
| churn |
|---|
| 1 |
| 0 |
| 1 |
| 0 |
Predict clients that will unsubscribe
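The churn pipeline above (data preparation, Random Forest, performance metrics) can be sketched like this. The data is synthetic: feature meanings are hypothetical stand-ins for contract & call records.

```python
# Hypothetical sketch: churn prediction with a Random Forest on synthetic data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, roc_auc_score

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 4))   # e.g. tenure, monthly bill, call volume, complaints
# Synthetic churn label driven by the first and last features
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(0, 0.5, n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)
clf = RandomForestClassifier(n_estimators=100, random_state=1).fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)[:, 1]             # churn probability per client
auc = roc_auc_score(y_te, proba)                  # area under the ROC curve
print(confusion_matrix(y_te, clf.predict(X_te)))  # TP/FP/FN/TN counts
print(auc)
```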
Principle:
find an optimal rule to partition the data into homogeneous "clusters"
Classify spontaneous sentiment from textual data
Text data
Output
Classification
algorithms
reviews
social network
messages
Predict customers' opinion polarity
...
Data processing
Pre-trained model
or
NLP dictionaries
Data
Results
tweets from airlines' customers
SVM
Performance metrics
data preparation
Data processing
precision/recall
confusion matrix
| sentiment |
|---|
| 1 |
| 0 |
| -1 |
| 0 |
Classify spontaneous sentiment from textual data
sensitivity/specificity
Support Vector Machine
Principle: maximize the distance between a decision boundary and the different classes of samples
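The text-classification pipeline above (text data, data preparation, SVM) can be sketched as follows. The tiny corpus and labels are invented for illustration; a real task would use a large labeled set such as airline customer tweets.

```python
# Hypothetical sketch: sentiment polarity with an SVM over TF-IDF features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

texts = ["great flight, friendly crew", "lost my luggage, awful service",
         "on time and comfortable", "delayed again, terrible airline",
         "loved the food", "rude staff, never again"]
labels = [1, -1, 1, -1, 1, -1]          # 1 = positive, -1 = negative

# TF-IDF turns each text into a weighted word-count vector;
# the linear SVM then finds a maximum-margin separating hyperplane.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)
print(model.predict(["the crew was friendly"]))
```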
"All models are wrong, but some are useful." (George Box)
You need to choose your evaluation metric according to your task:

Classification:
- accuracy
- F1 score
- precision/recall
- sensitivity/specificity
- area under the ROC curve

Regression:
- mean squared error
- percentage of explained variance

Clustering:
- inter-cluster distance measures
- cluster-homogeneity-based measures
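A minimal sketch of these task-appropriate metrics, using the corresponding scikit-learn functions on toy inputs (all values below are invented for illustration):

```python
# Task-appropriate evaluation metrics on toy data.
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, mean_squared_error,
                             r2_score, silhouette_score)

# Classification
y_true, y_pred = [1, 0, 1, 1, 0], [1, 0, 0, 1, 0]
print(accuracy_score(y_true, y_pred), f1_score(y_true, y_pred))

# Regression (r2_score reports the fraction of explained variance)
y, y_hat = [2.0, 3.5, 5.0], [2.2, 3.3, 4.8]
print(mean_squared_error(y, y_hat), r2_score(y, y_hat))

# Clustering: the silhouette combines cluster homogeneity
# and inter-cluster distance, without needing true labels
X = np.array([[0, 0], [0.1, 0], [5, 5], [5.1, 5]])
print(silhouette_score(X, [0, 0, 1, 1]))
```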
Confusion Matrix ... and associated metrics

|                         | Predicted Positive (P) | Predicted Negative (N) |
|-------------------------|------------------------|------------------------|
| **Actual Positive (P)** | True Positive (TP)     | False Negative (FN)    |
| **Actual Negative (N)** | False Positive (FP)    | True Negative (TN)     |

precision = TP / (TP + FP): fraction of real positive observations among those predicted positive

recall = TP / (TP + FN): fraction of real positive observations among those actually positive
Confusion Matrix ... and associated metrics

F1 score = 2 × (precision × recall) / (precision + recall)

with precision = TP / (TP + FP) and recall = TP / (TP + FN)
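The precision, recall, and F1 formulas can be checked with simple arithmetic; the counts below are hypothetical:

```python
# Worked example of precision, recall, and F1 from confusion-matrix counts.
TP, FP, FN = 5, 3, 7          # hypothetical counts

precision = TP / (TP + FP)    # 5 / 8 = 0.625
recall = TP / (TP + FN)       # 5 / 12
f1 = 2 * precision * recall / (precision + recall)
print(precision, recall, f1)
```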
Calculation for an actual apple (rows = actual condition, columns = predicted condition):

|            | Apple | Orange | Mango |
|------------|-------|--------|-------|
| **Apple**  | 5     | 2      | 5     |
| **Orange** | 2     | 7      | 3     |
| **Mango**  | 1     | 9      | 5     |

TP = 5
FN = 2 + 5
FP = 2 + 1
TN = 7 + 9 + 3 + 5
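The one-vs-rest counts for the apple class can be derived programmatically from any 3×3 confusion matrix; this sketch assumes a matrix (rows = actual, columns = predicted) consistent with the TP, FN, FP, and TN sums above:

```python
# One-vs-rest TP/FN/FP/TN for one class of a multi-class confusion matrix.
import numpy as np

cm = np.array([[5, 2, 5],    # actual apple
               [2, 7, 3],    # actual orange
               [1, 9, 5]])   # actual mango
k = 0                        # index of the "apple" class

TP = cm[k, k]
FN = cm[k].sum() - TP        # apples predicted as something else
FP = cm[:, k].sum() - TP     # non-apples predicted as apple
TN = cm.sum() - TP - FN - FP # all cells not involving actual or predicted apple
print(TP, FN, FP, TN)
```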
Confusion Matrix ... and associated metrics
TN
FP + TN
specificity
=
TP
TP + FN
sensitivity
=
fraction of real negative observations among those really negative
fraction of real positive observations among those really positive
| Positive (P) |
| Positive (P) |
| True Positive (TP) |
| True Negative (TN) |
| False positive (FP) |
| False negative (FN) |
| Negative (N) |
| Negative (P) |
Predicted condition
Actual condiction
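A worked example of sensitivity and specificity, with hypothetical counts:

```python
# Sensitivity and specificity from confusion-matrix counts.
TP, FN, TN, FP = 5, 7, 24, 3   # hypothetical counts

sensitivity = TP / (TP + FN)   # same formula as recall
specificity = TN / (FP + TN)
print(sensitivity, specificity)
```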
true positive rate = TP / P (P = all actual positive observations)

false positive rate = FP / N (N = all actual negative observations)
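A ROC curve is traced by sweeping the decision threshold and computing these two rates at each step; the scores below are invented for illustration:

```python
# Tracing ROC points by sweeping the decision threshold over toy scores.
y_true = [1, 1, 0, 1, 0, 0]              # actual labels: P = 3, N = 3
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # hypothetical model scores

P = sum(y_true)
N = len(y_true) - P
for t in [0.95, 0.75, 0.5, 0.1]:
    pred = [1 if s >= t else 0 for s in scores]
    TP = sum(p and y for p, y in zip(pred, y_true))
    FP = sum(p and not y for p, y in zip(pred, y_true))
    print(f"threshold={t}: TPR={TP / P:.2f}, FPR={FP / N:.2f}")
```

Lowering the threshold moves the operating point from (0, 0) toward (1, 1); the area under that curve is the AUC metric listed earlier.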