Kaggle Paris Meetup
Hacking the Otto Group Challenge!
at La Paillasse
5 May 2015
Agenda
- Challenge presentation
- Walk through the forums
- Round-table introductions (tour de table)
- Work by thematic sub group
- Restaurant break (food & drink to order)
- Switch sub-groups or continue working
- Feedback
- Next session proposal(s)
Challenge Presentation
- e-commerce group -> product classification challenge
9 classes (main product categories)
- 93 features (event counts) - sparse data - per-feature max ∈ [14, 352]
- train data: 61,878 products - test data: 144,368 products
- score: multiclass log loss (likelihood)
- >3,000 teams - best score around 0.39; very few teams below 0.40
- visualise the data: https://www.kaggle.com/c/otto-group-product-classification-challenge/forums/t/13585/share-your-scripts-visualizations
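The evaluation metric above, multiclass log loss, rewards well-calibrated probability predictions rather than hard class labels. A minimal sketch of how it is computed, using scikit-learn on toy predictions (the data here is random and purely illustrative):

```python
import numpy as np
from sklearn.metrics import log_loss

# Toy setup: 4 products, true classes among the 9 Otto categories,
# and predicted probabilities over the 9 classes (each row sums to 1).
y_true = [0, 3, 8, 3]
rng = np.random.default_rng(0)
probs = rng.random((4, 9))
probs /= probs.sum(axis=1, keepdims=True)

# Multiclass log loss: -(1/N) * sum_i log(p_i[y_i])
score = log_loss(y_true, probs, labels=list(range(9)))

# Equivalent manual computation of the same formula
manual = -np.mean(np.log(probs[np.arange(4), y_true]))
print(score)
```

Confident wrong answers are punished hard (the log of a near-zero probability for the true class blows up), which is why averaging probabilities across models helps on this metric.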
Walk through the forum (1/2)
- Random Forest 0.54
- https://github.com/ottogroup/kaggle/blob/master/benchmark.py
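The linked benchmark trains a random forest with scikit-learn. A rough sketch of that approach on synthetic stand-in data (93 Poisson count features and 9 classes match the Otto shapes, but the values are random, so the score is not comparable to the leaderboard):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Otto data: 93 count features, 9 classes.
rng = np.random.default_rng(42)
X = rng.poisson(1.0, size=(1000, 93)).astype(float)
y = rng.integers(0, 9, size=1000)

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
clf.fit(X_tr, y_tr)

# Log loss is computed on predicted probabilities, not hard labels.
score = log_loss(y_val, clf.predict_proba(X_val), labels=list(range(9)))
print(score)
```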
- Gradient boosting (XGBoost) 0.508, 0.43
- https://www.kaggle.com/c/otto-group-product-classification-challenge/forums/t/12947/achieve-0-50776-on-the-leaderboard-in-a-minute-with-xgboost
- Neural Network (Lasagne) 0.483 with 2 hidden layers
- 0.443 with 3 hidden layers + 3 dropout layers + bagging (3 runs)
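Lasagne is a Theano-era library, but the bagging part of the recipe (train the same net several times with different seeds and average the predicted probabilities) can be sketched with scikit-learn's MLPClassifier. MLPClassifier has no dropout layers, so L2 regularization (`alpha`) is used here as a rough substitute:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Synthetic stand-in data shaped like Otto: 93 count features, 9 classes.
rng = np.random.default_rng(1)
X = rng.poisson(1.0, size=(1000, 93)).astype(float)
y = rng.integers(0, 9, size=1000)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

# Bagging over 3 runs: different random seeds, averaged probabilities.
probs = np.zeros((len(X_val), 9))
for seed in range(3):
    net = MLPClassifier(hidden_layer_sizes=(64, 64), alpha=1e-3,
                        max_iter=50, random_state=seed)
    net.fit(X_tr, y_tr)
    probs += net.predict_proba(X_val)
probs /= 3

score = log_loss(y_val, probs, labels=list(range(9)))
print(score)
```

Averaging smooths out overconfident individual runs, which is exactly what the log loss metric rewards.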
Non exhaustive reading of performance / results claimed by some competitors
Walk through the forum (2/2)
Naive Bayes, ZeroR?
Ensemble approach:
"get 0.43702 with nolearn/lasagne and 0.43447 using extreme gradient boosting in Graphlab. Blending these 2 methods allowed me to get 0.42815"
KaggleParisMeetup-20150505
By bruno16
Slides for Kaggle Paris Meetup