TPOT: A Tree-based Pipeline Optimization Tool

Trang Lê


postdoctoral researcher @UPenn

amateur runner

About me

PhD Mathematics, Dec 2017, University of Tulsa

Laureate Institute for Brain Research

  • imaging, BrainAGE
  • use differential privacy to reduce
    overfitting in high-d biological data
  • pseudopotential for fractional-d


Computational Genetics Lab

Clean data

Select features









Preprocess features

Construct features

Select classifier

Optimize parameters

Validate model

Raw data


Typical pipeline









Open sourced AutoML tools

  • auto-sklearn (python)
    • Bayesian optimization over a fixed 3-step ML pipeline
  • auto-Weka (java)
    • similar to auto-sklearn, built on top of Weka
  • (java w/ python, scala, R, web GUI)
    • basic data prep w/ grid/random search over ML algorithms
  • devol (python)
    • deep learning architecture search via GP

Randy Olson

  • DEAP

  • Objective:
    • maximize pipeline's CV classification performance
    • minimize pipeline’s complexity 
  • Pareto front with NSGA-II
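A minimal sketch of the two-objective trade-off above, assuming each pipeline is summarized by a hypothetical (CV score, number of operators) pair. It shows only the domination check that defines a Pareto front; NSGA-II itself adds non-dominated sorting into ranks and crowding-distance selection, which are not reproduced here. The scores are made up for illustration.

```python
from typing import List, Tuple

# (cv_score, n_operators): maximize the first, minimize the second.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives and better on at least one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical pipelines, not real TPOT results.
pipelines = [(0.90, 5), (0.88, 2), (0.85, 3), (0.90, 7), (0.80, 1)]
front = pareto_front(pipelines)
print(front)
```

Here (0.85, 3) drops out because (0.88, 2) is both more accurate and simpler, and (0.90, 7) drops out because (0.90, 5) reaches the same score with fewer operators.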


Entire data set

Entire data set


Polynomial features

Combine features

Select k best features

Logistic regression

Multiple copies of the data set can enter the pipeline for analysis

Pipeline operators modify the features

Modified data set flows through the pipeline operators

Final classification is performed on the final feature set
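The example pipeline on this slide can be approximated by hand in scikit-learn. This is a sketch under assumptions, not TPOT's exported code: the synthetic data set, `degree=2`, `k=10`, and the use of `f_classif` as the scoring function are all illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline, make_union
from sklearn.preprocessing import FunctionTransformer, PolynomialFeatures

# Synthetic stand-in for the "entire data set" (assumption for illustration).
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3, random_state=0
)

pipeline = make_pipeline(
    # Two copies of the data set enter the pipeline: one is expanded into
    # polynomial features, the other passes through unchanged; the union
    # concatenates both feature sets column-wise.
    make_union(
        PolynomialFeatures(degree=2, include_bias=False),
        FunctionTransformer(),  # identity transform
    ),
    # Keep the k features with the highest ANOVA F-score.
    SelectKBest(score_func=f_classif, k=10),
    # The final classifier sees only the selected features.
    LogisticRegression(max_iter=1000),
)

pipeline.fit(X, y)
predictions = pipeline.predict(X)
```

With 6 input features, the polynomial branch yields 27 columns and the pass-through branch 6, so the selector chooses 10 out of 33 combined features.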

Genetic programming

GP primitives: dataset selectors, feature selectors & preprocessors, supervised classifiers

Population: sequences of pipeline operators


Mutation and crossover

(a) insertion mutation

(b) deletion mutation

(c) swap mutation

(d) substitution mutation

(e) crossover
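The five variation operators can be sketched on a flat list of operator names. This is a simplification for illustration: TPOT builds on DEAP, and its actual individuals and operators are not reproduced here. All function names and the operator names in the example are hypothetical, and positions are passed explicitly (instead of drawn at random) to keep the example deterministic.

```python
from typing import List, Tuple

Pipeline = List[str]  # a pipeline as an ordered list of operator names


def insert_op(p: Pipeline, pos: int, op: str) -> Pipeline:
    """(a) Insertion mutation: add a new operator at position `pos`."""
    return p[:pos] + [op] + p[pos:]


def delete_op(p: Pipeline, pos: int) -> Pipeline:
    """(b) Deletion mutation: remove the operator at position `pos`."""
    return p[:pos] + p[pos + 1:]


def swap_ops(p: Pipeline, i: int, j: int) -> Pipeline:
    """(c) Swap mutation: exchange the operators at positions `i` and `j`."""
    q = list(p)
    q[i], q[j] = q[j], q[i]
    return q


def substitute_op(p: Pipeline, pos: int, op: str) -> Pipeline:
    """(d) Substitution mutation: replace the operator at `pos` with `op`."""
    q = list(p)
    q[pos] = op
    return q


def crossover(a: Pipeline, b: Pipeline, cut_a: int, cut_b: int) -> Tuple[Pipeline, Pipeline]:
    """(e) One-point crossover: exchange the tails of two parents."""
    return a[:cut_a] + b[cut_b:], b[:cut_b] + a[cut_a:]


parent = ["PolynomialFeatures", "SelectKBest", "LogisticRegression"]
other = ["PCA", "DecisionTree"]

inserted = insert_op(parent, 1, "StandardScaler")
deleted = delete_op(parent, 0)
swapped = swap_ops(parent, 0, 1)
substituted = substitute_op(parent, 2, "RandomForest")
child_a, child_b = crossover(parent, other, 2, 1)
```

Each function returns a new list and leaves the parent untouched, so the same parent can be reused to produce several offspring.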

TPOT configs

  • Default TPOT
  • TPOT light
  • TPOT sparse
  • TPOT-MDR (Multifactor Dimensionality Reduction)
  • Classification
  • Regression

TPOT Template

Mutation restriction

Complexity reformulation

  • Number of pipeline operators
    • Flexibility of each operator
    • Runtime
  • Number of features used in pipeline
  • Number of parameters
  • Degree of overfitting, assessed via stability of the covariance of predictors and rank differences of importance metrics

Integration with neural nets


  • preprocessing
  • scalability
  • computational expense


Thank you!

Jason Moore

Weixuan Fu