Pymc-Learn: Practical Probabilistic Machine Learning in Python

Daniel Emaasit

Data Scientist @ Haystax

PyData Washington DC, 2018

November 17, 2018

pymc-learn.org

There is a growing need

for

principled machine learning

by

non-ML specialists

Reasons for increased adoption of Probabilistic Modeling (1/3)

the need for transparent models with calibrated quantities of uncertainty.

Wall Street Journal. (Accessed 2018)

the ever-increasing number of promising results achieved in A.I.

Schneider et al., 2016 (Uber ATG).

Reasons for increased adoption of Probabilistic Modeling (2/3)

TechCrunch, accessed 2018 (Google AutoML).

Wired magazine, accessed 2016. (Google Project Loon)

the emergence of probabilistic programming languages (PPLs).

Reasons for increased adoption of Probabilistic Modeling (3/3)

Gaussian Process in PyMC3

import pymc3 as pm

# Instantiate a model
with pm.Model() as latent_gp_model:
    
    # specify the priors
    length_scale = pm.Gamma("length_scale", alpha = 2, beta = 1)
    signal_variance = pm.HalfCauchy("signal_variance", beta = 5)
    noise_variance = pm.HalfCauchy("noise_variance", beta = 5)
    degrees_of_freedom = pm.Gamma("degrees_of_freedom", alpha = 2, beta = 0.1)
    
    # specify the kernel function
    cov = signal_variance**2 * pm.gp.cov.ExpQuad(1, length_scale)
        
    # specify the mean function
    mean_function = pm.gp.mean.Zero()
    
    # specify the gp
    gp = pm.gp.Latent(mean_func = mean_function, cov_func = cov)
    
    # specify the prior over the latent function
    f = gp.prior("f", X = X) 
    
    # specify the likelihood
    obs = pm.StudentT("obs", mu = f, lam = 1/noise_variance, nu = degrees_of_freedom, observed = y)


# Perform Inference
with latent_gp_model:
    posterior = pm.sample(draws = 100, cores = 2)

# extend the model by adding the GP conditional distribution so as to predict at test data
with latent_gp_model:
    f_pred = gp.conditional("f_pred", X_new)

# sample from the GP conditional posterior
with latent_gp_model:
    posterior_pred = pm.sample_ppc(posterior, vars = [f_pred], samples = 200)

Build a model

Train a model

Prediction
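To make the kernel line in the model above more concrete, here is a minimal sketch of what an exponentiated quadratic covariance computes for one-dimensional inputs. It is plain Python rather than PyMC3 code; the function name `exp_quad_cov` and the sample inputs are assumptions made for illustration, and the scaling by `signal_variance ** 2` simply mirrors the slide.

```python
import math


def exp_quad_cov(xs, signal_variance, length_scale):
    """Hypothetical helper: K[i][j] = s**2 * exp(-(x_i - x_j)**2 / (2 * l**2))."""
    scale = signal_variance ** 2  # same scaling as the slide's kernel line
    return [
        [scale * math.exp(-((xi - xj) ** 2) / (2.0 * length_scale ** 2)) for xj in xs]
        for xi in xs
    ]


# Illustrative inputs (not from the talk): two nearby points and one far away.
xs = [0.0, 0.5, 3.0]
K = exp_quad_cov(xs, signal_variance=2.0, length_scale=1.0)
```

Nearby inputs get a covariance close to the diagonal value, while distant inputs get a covariance close to zero; the length scale controls how quickly that decay happens.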

Scikit-learn

scikit-learn.org

from sklearn.gaussian_process import GaussianProcessRegressor

model = GaussianProcessRegressor()

model.fit(X_train, y_train)

model.predict(X_test)

model.score(X_test, y_test)

# scikit-learn estimators have no save() method; persist with joblib instead
import joblib
joblib.dump(model, 'path/to/saved/model')

Few lines of code

Build + Train + Predict + Score + Save + Load

Pymc-learn

pymc-learn.org

Inspired by scikit-learn. Focus is on non-ML specialists

Pymc-learn

pymc-learn.org

from pmlearn.gaussian_process import GaussianProcessRegressor

# Instantiate a PyMC3 Gaussian process model
model = GaussianProcessRegressor()

# Fit using MCMC or Variational Inference
model.fit(X_train, y_train)

model.predict(X_test)

model.score(X_test, y_test)

model.save('path/to/saved/model')

Mimics Scikit-Learn
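The build, train, predict, score, save, and load workflow can be sketched with a toy estimator. This is not pymc-learn's implementation: the class `BayesianSlopeRegressor`, its parameters, and the data are hypothetical, and the model (a single slope with a Normal prior and known noise) is swapped in because its posterior has a closed form and needs only the standard library.

```python
import json
import math
import os
import tempfile


class BayesianSlopeRegressor:
    """Hypothetical toy estimator: y = w * x + noise, with a Normal prior on w."""

    def __init__(self, prior_std=10.0, noise_std=1.0):
        self.prior_std = prior_std
        self.noise_std = noise_std
        self.w_mean = None
        self.w_var = None

    def fit(self, X, y):
        # Closed-form conjugate update (Normal prior, Normal likelihood).
        precision = 1.0 / self.prior_std ** 2 + sum(x * x for x in X) / self.noise_std ** 2
        self.w_var = 1.0 / precision
        self.w_mean = self.w_var * sum(x * t for x, t in zip(X, y)) / self.noise_std ** 2
        return self

    def predict(self, X, return_std=False):
        means = [self.w_mean * x for x in X]
        if not return_std:
            return means
        # Predictive spread combines uncertainty in w with observation noise.
        stds = [math.sqrt(x * x * self.w_var + self.noise_std ** 2) for x in X]
        return means, stds

    def score(self, X, y):
        # Coefficient of determination (R^2) of the posterior-mean predictions.
        preds = self.predict(X)
        y_bar = sum(y) / len(y)
        ss_res = sum((t - p) ** 2 for t, p in zip(y, preds))
        ss_tot = sum((t - y_bar) ** 2 for t in y)
        return 1.0 - ss_res / ss_tot

    def save(self, path):
        with open(path, "w") as fh:
            json.dump(self.__dict__, fh)

    @classmethod
    def load(cls, path):
        with open(path) as fh:
            state = json.load(fh)
        model = cls(state["prior_std"], state["noise_std"])
        model.w_mean, model.w_var = state["w_mean"], state["w_var"]
        return model


# Illustrative data (not from the talk).
X_train, y_train = [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8]
model = BayesianSlopeRegressor().fit(X_train, y_train)
means, stds = model.predict([5.0], return_std=True)
r2 = model.score(X_train, y_train)

path = os.path.join(tempfile.mkdtemp(), "model.json")
model.save(path)
restored = BayesianSlopeRegressor.load(path)
```

The point of the pattern is that the user only calls `fit`, `predict`, and `score`, while the prior, the inference step, and the predictive uncertainty stay inside the estimator.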

Try it Online

bit.ly/pymc-learn-dc

Thank You!

Slides: bit.ly/pymc-learn

Daniel Emaasit

Data Scientist @HaystaxTech, Ph.D. Candidate @UNLV, Bayesian Machine Learning Researcher, Organizer of Data Science Meetups. User of #PyMC3.
