Meetup #18 Agenda
Meetup status, stats/figures, trends
Around 1k members, 3-4 meetups per year
Call for new organizers:
contact speakers, find hosting / sponsors
Link to David's slides, CERN et al. sponsorship
Two-part challenge:
1) Precision (ended competition)
https://www.kaggle.com/c/trackml-particle-identification
https://sites.google.com/site/trackmlparticle/results
2) Performance, open until March 2019
https://competitions.codalab.org/competitions/20112
Dedicated site : https://sites.google.com/site/trackmlparticle/
Unsupervised ML challenge, EDA
Size: 46 GB for train_1 (train set); test set also provided
Metric:
Home course: Machine Learning Explainability
3 parts / notebooks
https://www.kaggle.com/dansbecker/permutation-importance
https://www.kaggle.com/dansbecker/partial-plots
https://www.kaggle.com/dansbecker/shap-values
Motivations, Use Cases
https://www.kaggle.com/dansbecker/use-cases-for-model-insights
For history see
https://medium.com/@Zelros/a-brief-history-of-machine-learning-models-explainability-f1c3301be9dc
https://www.kaggle.com/dansbecker/permutation-importance
Take away
a sort of feature importance: shuffle one column (permutation),
then measure the effect on your model's accuracy -> the bigger
the performance decrease, the more important the feature
eli5 python lib, scikit-learn 0.20+
very similar to the drift computation in MLbox, for example
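The shuffle-and-measure idea can be sketched with scikit-learn's permutation_importance (available since 0.22; the Kaggle notebook itself uses eli5's PermutationImportance). The toy dataset below is made up for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Toy data: only feature 0 actually drives the label
rng = np.random.RandomState(0)
X = rng.normal(size=(300, 3))
y = (X[:, 0] > 0).astype(int)

model = RandomForestClassifier(random_state=0).fit(X, y)

# Shuffle each column in turn and record the resulting accuracy drop
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, imp in enumerate(result.importances_mean):
    print(f"feature {i}: {imp:.3f}")
```

Feature 0 comes out with a large importance while the two noise features stay near zero, which is exactly the "performance decrease after shuffling" signal described above.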
https://www.kaggle.com/dansbecker/partial-plots
Take away
partial dependence plots show how
a feature affects predictions;
they act like coefficients in a linear
or logistic regression, but for any model
pdpbox python lib
You can also compute the dependence
of 2 features at once (2D plot)
https://www.kaggle.com/dansbecker/shap-values
Take away
SHAP Values (an acronym from SHapley Additive exPlanations) break down a prediction to show the impact of each feature.
How the features affect the prediction on "Man of the Match":
red/pink = features that increase the prediction, blue = features that decrease it
shap python lib