Tutorial: Up and down the PyData Stats Stack


Or: Lies, damned lies, and statistics

Peadar Coyle

I work as a Senior Data Scientist at Channel 4.

We're a media company, and we use data for targeted advertising, customer analytics, and recommendation engines.

We're hiring, so have a chat with me if interested.

Who am I?

What else do I do?

  • Open Source Contributor to PyMC3
  • Mathematics and Physics background
  • Fellow of the Royal Statistical Society and member of NumFOCUS
  • Author of a book of interviews with data scientists :)

Stats is everywhere...

What was wrong with this?

More or Less debunked this - the sample was wrong!

Outcomes of this tutorial

  • Have three tools to attack the same problem
  • Some tips and tricks, such as hypothesis testing and feature selection
  • Understand how to interpret and debug three versions of logistic regression: machine learning, frequentist statistics, and Bayesian


Different schools of data 

HT: Vincent D. Warmerdam

Firstly hypothesis testing

  • Or how to do t-tests and all that stuff
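
As a concrete illustration of the t-test mentioned above, here is a minimal sketch of Welch's two-sample t-test using only the Python standard library. The two groups are made-up numbers and the function name welch_t is hypothetical; neither comes from the tutorial. The sketch computes the test statistic and the Welch-Satterthwaite degrees of freedom. In practice, scipy.stats.ttest_ind(a, b, equal_var=False) does the same job and also returns a p-value.

```python
import math
import statistics

# Made-up measurements for two groups (illustrative only).
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.2]
group_b = [4.4, 4.8, 4.1, 5.0, 4.6, 4.3, 4.7]


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's two-sample t-test."""
    # Squared standard error of each group mean (sample variance / n).
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A large |t| relative to the t distribution with df degrees of freedom is evidence against the two groups having the same mean.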

Three schools


  • scikit-learn
  • Statsmodels
  • PyMC3


Bayesian tooling



How likely am I to make more than $50K?


Frequentist Logistic Regression
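
To make the frequentist approach concrete, the sketch below fits a one-predictor logistic regression by maximum likelihood using Newton-Raphson, and derives standard errors from the inverse Fisher information. This is a from-scratch illustration, not the tutorial's code: the data are synthetic, and fit_logit and score_and_information are hypothetical helpers. The Logit class in statsmodels is the usual tool for this kind of fit.

```python
import math

# Synthetic toy data (not from the tutorial): years of education and
# whether income exceeds $50K (1) or not (0).
education_years = [8, 9, 10, 11, 12, 12, 13, 14, 14, 16, 16, 18]
earns_over_50k = [0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score_and_information(b0, b1, x, y):
    """Gradient of the log-likelihood and the 2x2 Fisher information."""
    p = [sigmoid(b0 + b1 * xi) for xi in x]
    w = [pi * (1.0 - pi) for pi in p]
    g0 = sum(yi - pi for yi, pi in zip(y, p))
    g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
    i00 = sum(w)
    i01 = sum(wi * xi for wi, xi in zip(w, x))
    i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
    return (g0, g1), (i00, i01, i11)


def fit_logit(x, y, max_iter=50, tol=1e-10):
    """Fit P(y=1) = sigmoid(b0 + b1 * x) by maximum likelihood.

    Returns ((intercept, slope), (se_intercept, se_slope)).
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (g0, g1), (i00, i01, i11) = score_and_information(b0, b1, x, y)
        det = i00 * i11 - i01 * i01
        # Newton step: inverse information matrix times the gradient.
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    # Standard errors: square roots of the diagonal of the inverse information.
    _, (i00, i01, i11) = score_and_information(b0, b1, x, y)
    det = i00 * i11 - i01 * i01
    return (b0, b1), (math.sqrt(i11 / det), math.sqrt(i00 / det))


(intercept, slope), (se_intercept, se_slope) = fit_logit(
    education_years, earns_over_50k
)
print(f"slope = {slope:.3f} (SE {se_slope:.3f}), z = {slope / se_slope:.2f}")
print(f"P(>50K | 16 years) = {sigmoid(intercept + slope * 16):.2f}")
```

The slope is the change in log-odds per extra year of education. Dividing it by its standard error gives the Wald z statistic that frequentist regression summaries typically report.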


