ADAR 2020

**Ville Tuulos**

Machine Learning Infrastructure @ Netflix

Infrastructure Stack for Modern Data Science

a business problem: predict churn

a model to predict churn

[Diagram, built up one component per slide: data, model, transforms, results, compute, schedule, action, data audits, model audits, versioning & tracking]

Screenplay Analysis Using NLP

Fraud Detection

Title Portfolio Optimization

Estimate Word-of-Mouth Effects

Incremental Impact of Marketing

Classify Support Tickets

Predict Quality of Network

Content Valuation

Cluster Tweets

Intelligent Infrastructure

Machine Translation

Optimal CDN Caching

Predict Churn

Content Tagging

Optimize Production Schedules

Infrastructure Stack for Modern Data Science

Model Development

Feature Engineering

Model Operations

Versioning

Architecture

Orchestration

Compute

Data

**How much the data scientist cares**

**How much infrastructure is needed**

**Human-Centric** Infrastructure Stack for Modern Data Science

**From Prototype to Production And Back**