Sr. Data Scientist at TrueAccord
PhD in Physics
Kaggle top 100
https://www.kaggle.com/anokas/data-exploration-analysis/notebook
Way to get => brute force boundary match using L2 distance => fight for 0.0001 => stacking
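A rough sketch of what brute-force boundary matching could look like: compare the edge pixels of one tile against the opposite edge of every candidate tile and pick the smallest L2 distance (function name, shapes, and the choice of edges are illustrative, not from the talk):

```python
import numpy as np

def boundary_l2_match(tile, candidates):
    """Find the candidate whose left edge best matches `tile`'s right edge.

    tile: (H, W, C) array, candidates: (N, H, W, C) array.
    """
    edge = tile[:, -1, :].astype(np.float32)                 # right-most column of the tile
    cand_edges = candidates[:, :, 0, :].astype(np.float32)   # left-most column of each candidate
    # L2 distance between boundary pixels, one number per candidate.
    d = np.sqrt(((cand_edges - edge) ** 2).sum(axis=(1, 2)))
    best = int(np.argmin(d))
    return best, float(d[best])
```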
Google Drive => Predictions on train / test per fold in hdf5.
Stratified in a loop starting from the rarest labels
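An illustrative sketch of that idea: walk over the labels from rarest to most common and spread the still-unassigned samples of each label across the folds (not the authors' exact code):

```python
import numpy as np

def stratify_rarest_first(labels, n_folds=10, seed=0):
    """Assign samples to folds label by label, from the rarest label to the
    most common one. `labels` is a binary (n_samples, n_labels) matrix."""
    rng = np.random.default_rng(seed)
    folds = np.full(labels.shape[0], -1, dtype=int)
    for j in np.argsort(labels.sum(axis=0)):          # rarest label first
        idx = np.where((labels[:, j] == 1) & (folds == -1))[0]
        rng.shuffle(idx)
        folds[idx] = np.arange(len(idx)) % n_folds    # spread evenly over folds
    rest = np.where(folds == -1)[0]                   # samples with no label assigned yet
    rng.shuffle(rest)
    folds[rest] = np.arange(len(rest)) % n_folds
    return folds
```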
For each model and each fold we generate predictions on val and test.
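A sketch of how those per-fold predictions could be collected and written to HDF5 with h5py; the model interface (`predict`) and the dataset layout are assumptions:

```python
import h5py
import numpy as np

def save_fold_predictions(model_name, fold_models, fold_splits, X, X_test, out_path):
    """Collect out-of-fold validation predictions and averaged per-fold test
    predictions for one model and write them to an HDF5 file."""
    oof, test_preds = None, []
    for model, (train_idx, val_idx) in zip(fold_models, fold_splits):
        p_val = model.predict(X[val_idx])
        if oof is None:
            oof = np.zeros((len(X),) + p_val.shape[1:], dtype=np.float32)
        oof[val_idx] = p_val                      # validation predictions fill the OOF matrix
        test_preds.append(model.predict(X_test))  # one test prediction per fold
    with h5py.File(out_path, "w") as f:
        f.create_dataset(f"{model_name}/train", data=oof)
        f.create_dataset(f"{model_name}/test", data=np.mean(test_preds, axis=0))
```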
numpy + ImgAug + OpenCV
https://github.com/aleju/imgaug
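A typical batch augmentation setup with these libraries; flips and small rotations are a common choice for satellite tiles, but the exact augmenters used in the solution are not stated, so this combination is only illustrative:

```python
import cv2
import numpy as np
import imgaug.augmenters as iaa

seq = iaa.Sequential([
    iaa.Fliplr(0.5),               # horizontal flip with 50% probability
    iaa.Flipud(0.5),               # vertical flip with 50% probability
    iaa.Affine(rotate=(-15, 15)),  # small random rotation
])

img = cv2.imread("train_0.jpg")                # hypothetical file name
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)     # OpenCV reads images as BGR
batch = np.stack([img] * 8)                    # dummy batch for the example
augmented = seq(images=batch)                  # imgaug augments whole batches at once
```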
It is still possible to get 0.93+ on TIFF.
https://www.kaggle.com/bguberfain/tif-to-jpg-by-matching-percentiles
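The linked kernel matches percentiles between the TIFF and JPG versions of an image; a simplified per-band sketch of that idea (not the kernel's exact code):

```python
import numpy as np

def match_percentiles(tif_band, jpg_band, lo=2, hi=98):
    """Linearly rescale one TIFF band so its low/high percentiles line up with
    the corresponding JPG band."""
    t_lo, t_hi = np.percentile(tif_band, [lo, hi])
    j_lo, j_hi = np.percentile(jpg_band, [lo, hi])
    scaled = (tif_band.astype(np.float32) - t_lo) / max(t_hi - t_lo, 1e-6)
    out = scaled * (j_hi - j_lo) + j_lo
    return np.clip(out, 0, 255).astype(np.uint8)
```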
48 networks * 10 folds = 480 networks
[Stacking diagram: ExtraTrees / NN / LR second-level models, combined via weighted average and mean, followed by thresholding]
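A minimal sketch of such a second level, assuming the ExtraTrees and LR stackers are trained on the stacked first-level out-of-fold predictions and blended with a weighted average (the NN stacker and the real blend weights are omitted; one label at a time for simplicity):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression

def second_level(oof_features, y, test_features, weights=(0.5, 0.5)):
    """Fit ExtraTrees and LR on first-level OOF predictions for a single label
    and blend their test predictions with a weighted average."""
    et = ExtraTreesClassifier(n_estimators=300, n_jobs=-1).fit(oof_features, y)
    lr = LogisticRegression(max_iter=1000).fit(oof_features, y)
    p_et = et.predict_proba(test_features)[:, 1]   # probability of the positive class
    p_lr = lr.predict_proba(test_features)[:, 1]
    return weights[0] * p_et + weights[1] * p_lr
```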
Worked the best:
On the Bayes-optimality of F-measure maximizers
https://arxiv.org/abs/1310.4849
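One common way to implement the thresholding step is a greedy per-class threshold search on out-of-fold predictions; a sketch, assuming an F2-style metric (the exact procedure used in the solution is not specified):

```python
import numpy as np
from sklearn.metrics import fbeta_score

def optimize_thresholds(y_true, y_prob, beta=2.0):
    """Greedy per-class threshold search maximizing F-beta on OOF predictions.
    beta=2 and the grid are assumptions for illustration."""
    grid = np.linspace(0.05, 0.95, 19)
    thresholds = np.full(y_true.shape[1], 0.5)
    for c in range(y_true.shape[1]):
        best_t, best_score = thresholds[c], -1.0
        for t in grid:
            trial = thresholds.copy()
            trial[c] = t                            # vary one class threshold at a time
            score = fbeta_score(y_true, y_prob > trial, beta=beta, average="samples")
            if score > best_score:
                best_t, best_score = t, score
        thresholds[c] = best_t
    return thresholds
```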
Q: How many networks do we need to make it a product?
A: One.