Continuous Delivery for Machine Learning

Bringing DevOps to the world of AI

Renato Cordeiro Ferreira

2019

Machine Learning Pipeline

The basic process to learn from data

"Continuous Delivery is the ability to get changes of all types -- including new features, configuration changes, bug fixes, and experiments -- into production, or into the hands of users, safely and quickly in a sustainable way."

-- Jez Humble and Dave Farley

Continuous Delivery

3 Axes of Change for ML

"Continuous Delivery for Machine Learning is a software engineering approach in which a cross-functional team produces machine learning applications based on code, data and models in small and safe increments that can be reproduced and reliably released at any time, in short adaptation cycles."

-- Danilo Sato, Arif Wider, Christoph Windheuser

Continuous Delivery for ML

Software engineering approach:
It enables teams to efficiently produce high quality software

Cross-functional team:
Experts with different skill sets and workflows across data engineering, data science, machine learning engineering, development, operations, and other knowledge areas work together collaboratively, emphasising the skills and strengths of each team member

Producing software based on code, data and models:
All artifacts of the ML software production process require different tools and workflows that must be versioned and managed accordingly

Small and safe increments:
The release of the software artifacts is divided into small increments, which allows visibility and control around the levels of variance of its outcomes, adding safety into the process
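
The "code, data and models" point above can be made concrete with a small sketch. It assumes a hypothetical `release_manifest` helper (not part of any CD4ML tool) that fingerprints each of the three artifacts, so a release can be traced back to exactly what produced it and small increments stay visible.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Return a short content hash that changes whenever the bytes change."""
    return hashlib.sha256(payload).hexdigest()[:12]


def release_manifest(code: bytes, data: bytes, model: bytes) -> dict:
    """Tie one release to the exact code, data and model that produced it."""
    return {
        "code": fingerprint(code),
        "data": fingerprint(data),
        "model": fingerprint(model),
    }


# Hypothetical artifacts, inlined as bytes so the sketch runs on its own.
v1 = release_manifest(b"def train(): ...", b"x,y\n1,2\n", b"weights:0.5")
v2 = release_manifest(b"def train(): ...", b"x,y\n1,2\n3,4\n", b"weights:0.7")

# Only the artifacts that actually changed get a new fingerprint.
changed = sorted(key for key in v1 if v1[key] != v2[key])
print(json.dumps(v1, indent=2))
print("changed between releases:", changed)
```

Here the code is untouched between the two releases, so only the data and model fingerprints differ. Real projects would typically delegate this to dedicated versioning tools rather than hand-rolled hashes.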

Reproducible and reliable software release:
While the model outputs can be non-deterministic and hard to reproduce, the process of releasing ML software into production is reliable and reproducible, leveraging automation as much as possible

Software release at any time:
It is important that the ML software can be delivered into production at any time. Even if organizations do not want to deliver software all the time, it should always be in a releasable state. This makes the decision about when to release it a business decision rather than a technical one

Short adaptation cycles:
Short cycles mean development cycles are on the order of days or even hours, not weeks, months or even years. Automation of the process with quality built in is key to achieving this. This creates a feedback loop that allows you to adapt your models by learning from their behavior in production

Intelligent Systems Workflow

Technical Components of CD4ML

  1. Discoverable and Accessible Data
  2. Reproducible Model Training
  3. Model Serving
  4. Testing and Quality in Machine Learning
  5. Experiments Tracking
  6. Model Deployment
  7. Continuous Delivery Orchestration
  8. Model Monitoring and Observability

Discoverable and Accessible Data

Reproducible Model Training

Model Serving

Embedded Model

Model as a Service

Model as Data
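
The three serving patterns named above can be contrasted in a few lines. This is a minimal sketch under a common reading of each pattern (packaged with the application, wrapped behind a service interface, or published as data and loaded at runtime); `LinearModel`, `ModelService` and `load_model` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class LinearModel:
    """Toy model: prediction = weight * x + bias."""
    weight: float
    bias: float

    def predict(self, x: float) -> float:
        return self.weight * x + self.bias


# Embedded Model: the model is built and packaged with the application itself.
EMBEDDED = LinearModel(weight=2.0, bias=1.0)


def handle_request_embedded(x: float) -> float:
    return EMBEDDED.predict(x)


# Model as a Service: callers only see a request/response interface, so the
# model can be deployed and updated independently of its consumers.
class ModelService:
    def __init__(self, model: LinearModel) -> None:
        self._model = model

    def handle(self, request_json: str) -> str:
        payload = json.loads(request_json)
        return json.dumps({"prediction": self._model.predict(payload["x"])})


# Model as Data: the model is published as data and the application
# loads it at runtime instead of bundling it at build time.
def load_model(published: str) -> LinearModel:
    return LinearModel(**json.loads(published))


service = ModelService(LinearModel(weight=2.0, bias=1.0))
runtime_model = load_model('{"weight": 2.0, "bias": 1.0}')

print(handle_request_embedded(3.0))
print(service.handle('{"x": 3.0}'))
print(runtime_model.predict(3.0))
```

All three paths return the same prediction here; what differs is where the model lives and how it can be replaced without redeploying the consuming application.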

Testing and Quality in Machine Learning

Experiments Tracking

Model Deployment

Multiple Models

Shadow Models

Competing Models

Online Models
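
As one illustration of the deployment patterns listed above, the sketch below shows a shadow model. It assumes the usual reading of the pattern: a candidate model receives the same production traffic as the current one, but its predictions are only recorded for comparison and never returned to the caller. The function and model names are hypothetical.

```python
from typing import Callable, List, Tuple

Model = Callable[[float], float]


def serve_with_shadow(
    primary: Model,
    shadow: Model,
    x: float,
    log: List[Tuple[float, float, float]],
) -> float:
    """Answer with the primary model and record what the shadow would have said."""
    answer = primary(x)
    try:
        log.append((x, answer, shadow(x)))
    except Exception:
        # A failing shadow model must never affect the production response.
        pass
    return answer


def current_model(x: float) -> float:
    return 2.0 * x


def candidate_model(x: float) -> float:
    return 2.0 * x + 0.5


comparisons: List[Tuple[float, float, float]] = []
responses = [
    serve_with_shadow(current_model, candidate_model, x, comparisons)
    for x in (1.0, 2.0)
]

print(responses)
print(comparisons)
```

The recorded triples of input, primary answer and shadow answer can then be analysed offline before deciding whether to promote the candidate.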

Continuous Delivery Orchestration

Model Monitoring and Observability

Model inputs

Model interpretability outputs

Model outputs and decisions

User action and rewards

Model fairness
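
Some of the signals listed above can be captured as one structured event per prediction. The sketch below is an assumption about how such a record might look, not a prescribed schema: the `prediction_event` helper and its field names are hypothetical, and interpretability and fairness signals would need additional fields.

```python
import json
import time
from typing import Any, Dict, Optional


def prediction_event(
    model_version: str,
    inputs: Dict[str, Any],
    output: Any,
    user_action: Optional[str] = None,
    timestamp: Optional[float] = None,
) -> str:
    """Serialize one prediction as a JSON line for a monitoring pipeline."""
    event = {
        "timestamp": time.time() if timestamp is None else timestamp,
        "model_version": model_version,
        "inputs": inputs,            # model inputs
        "output": output,            # model outputs and decisions
        "user_action": user_action,  # what the user did with the decision
    }
    return json.dumps(event, sort_keys=True)


line = prediction_event(
    model_version="v1",
    inputs={"age": 42},
    output="approve",
    user_action="accepted",
    timestamp=0.0,
)
print(line)
```

Emitting one self-describing line per prediction keeps the events easy to ship to whatever log aggregation or dashboard tooling a team already uses.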

End-to-End CD4ML

Continuous Delivery for Machine Learning is a new set of practices documented by ThoughtWorks about how to bring DevOps principles to the world of AI.
