ADAR 2020
Ville Tuulos
Machine Learning Infrastructure @ Netflix
Infrastructure Stack for Modern Data Science
with a business problem: predict churn
a model to predict churn
[Diagram, built up step by step: data, transforms, model, results, compute, schedule, action, audits, versioning & tracking]
Screenplay Analysis Using NLP
Fraud Detection
Title Portfolio Optimization
Estimate Word-of-Mouth Effects
Incremental Impact of Marketing
Classify Support Tickets
Predict Quality of Network
Content Valuation
Cluster Tweets
Intelligent Infrastructure
Machine Translation
Optimal CDN Caching
Predict Churn
Content Tagging
Optimize Production Schedules
Infrastructure Stack for Modern Data Science
Model Development
Feature Engineering
Model Operations
Versioning
Architecture
Orchestration
Compute
Data
[Figure: the stack above, annotated with "How much data scientist cares" and "How much infrastructure is needed"]
Human-Centric Infrastructure Stack for Modern Data Science
From Prototype to Production and Back