There is no difference in value generation

MLOPS &

DEVOPS

felipe f. rocha

INTRODUCTION

WHAT IS THE PROBLEM WITH THE TRADITIONAL ML WORKFLOW?

Machine Learning LifeCycle

SME

DATA ENG.

DATA ENG|SCIENTIST

DATA SCIENTIST

ML ENG.

Machine Learning LifeCycle

Subject Matter Expert (SME)

Data Engineer

Data Scientist

Ops Engineer

  • Business questions, goals, KPIs
  • Evaluate model performance
  • Algorithm selection
  • Build, train, validate model
  • Hyperparameter tuning
  • Data acquisition, data lake formation
  • Data wrangling, cleaning, transformation
  • ETL pipelines
  • Model operations
  • Deployment

But... what is production ready?

Deployment gap

1 - Algorithmia - 2020 State of Enterprise Machine Learning

2 - D. Sculley; Gary Holt; Daniel Golovin; Eugene Davydov; Todd Phillips; Dietmar Ebner; Vinay Chaudhary; Michael Young; Jean-François Crespo; Dan Dennison. “Hidden Technical Debt in Machine Learning Systems”. In: Google Inc (2015)

Hidden Technical Debt in Machine Learning Systems

M = C (1 + i)^t

i = % of technical debt

t = number of commits
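The formula above has the shape of compound interest, so debt grows exponentially with the number of commits. A minimal sketch, assuming M is the accumulated cost and C the initial cost (the slides define only i and t). The function name and the sample numbers are hypothetical, chosen only for illustration:

```python
def accumulated_debt(initial_cost: float, debt_rate: float, commits: int) -> float:
    """Return M = C * (1 + i) ** t.

    initial_cost (C): assumed starting cost; not defined in the slides.
    debt_rate (i): fraction of technical debt added per commit, e.g. 0.02 for 2%.
    commits (t): number of commits.
    """
    if commits < 0:
        raise ValueError("commits must be non-negative")
    return initial_cost * (1.0 + debt_rate) ** commits


if __name__ == "__main__":
    # Hypothetical scenario: 2% of technical debt per commit.
    for t in (0, 50, 100):
        print(t, round(accumulated_debt(1.0, 0.02, t), 2))
```

Even a small rate per commit compounds quickly, which is the point of the analogy.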

https://www.youtube.com/watch?v=pqeJFYwnkjE

What about DevOps?

DevOps is CALMS (Culture, Automation, Lean, Measurement, Sharing)

  • Integration
  • Delivery
  • Portability | reproducibility
  • Scale
  • Reliability
  • Collaboration
  • Security
  • Stream-aligned team
  • Platform team
  • Complicated-subsystem team
  • Enabling team

State of DevOps 2022*

fundamentals

“the extension of the DevOps methodology to include Machine Learning and Data Science assets as first-class citizens within the DevOps ecology”  - MLOps SIG

Principles

What guides it?

Establish

  • Common tools and procedures
  • Common language, ADRs (Architecture Decision Records)

Version Control

  • Same version control
  • Define a common workflow
  • What will be versioned

Performance

  • Distributed computing
  • Choose a system of isolation (containers)

Automation

  • Build Workflows, CI > CT > CD pipelines
  • Prefer pipelines rather than just models in production
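The CI > CT > CD chain can be sketched as a list of stages run in order. This is only an illustration under stated assumptions: the stage names, the shared context dictionary, and the "model" (just the mean of the data) are hypothetical placeholders, not the API of any tool named in this deck.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Stage = Callable[[Context], Context]


def ci_stage(ctx: Context) -> Context:
    # CI: validate inputs before anything is trained.
    if not ctx["data"]:
        raise ValueError("empty training data")
    ctx["ci_passed"] = True
    return ctx


def ct_stage(ctx: Context) -> Context:
    # CT: (re)train the model; the placeholder "model" is the mean of the data.
    data = ctx["data"]
    ctx["model"] = sum(data) / len(data)
    return ctx


def cd_stage(ctx: Context) -> Context:
    # CD: publish the trained artifact; here it is only recorded as deployed.
    ctx["deployed_model"] = ctx["model"]
    return ctx


def run_pipeline(stages: List[Stage], ctx: Context) -> Context:
    # Run every stage in order; a failing stage stops the whole pipeline.
    for stage in stages:
        ctx = stage(ctx)
    return ctx


if __name__ == "__main__":
    result = run_pipeline([ci_stage, ct_stage, cd_stage], {"data": [1.0, 2.0, 3.0]})
    print(result["deployed_model"])
```

Shipping the pipeline rather than a single trained artifact means retraining on new data reruns the same validated steps.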

Monitoring

  • Data (quality), features, distribution, Hardware, Statistics
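One simple way to monitor a feature's distribution is to compare live data against a reference sample. A minimal sketch, assuming a mean-shift check measured in reference standard deviations. The function names and the 0.5 threshold are hypothetical, and a real setup would likely use a more robust statistic:

```python
import statistics
from typing import Sequence


def mean_shift_in_std(reference: Sequence[float], live: Sequence[float]) -> float:
    """Distance between live and reference means, in reference standard deviations."""
    ref_mean = statistics.fmean(reference)
    ref_std = statistics.pstdev(reference)
    if ref_std == 0:
        raise ValueError("reference feature has zero variance")
    return abs(statistics.fmean(live) - ref_mean) / ref_std


def has_drifted(reference: Sequence[float], live: Sequence[float],
                threshold: float = 0.5) -> bool:
    # threshold is an illustrative assumption, not a recommended value.
    return mean_shift_in_std(reference, live) > threshold


if __name__ == "__main__":
    training_sample = [1.0, 2.0, 3.0, 4.0, 5.0]
    production_sample = [6.0, 7.0, 8.0, 9.0, 10.0]
    print(has_drifted(training_sample, production_sample))
```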

Principles

What guides it?

MLOps must be a practice that is:

  • language-agnostic;
  • framework-agnostic;
  • platform-agnostic;
  • infrastructure-agnostic.

 

How to implement this model

MLOps is all about pipelines

MLOps Stack Canvas

Tooling

What can I use to do my job...

Tools

Job Orchestrators

  1. Kedro (pipelines)
  2. Airflow (ETL, trainers)
  3. AWS Step Functions

Platform

  1. Databricks
  2. SageMaker (AWS)
  3. Kubeflow (k8s)
  4. ML Studio (Azure)
  5. Vertex AI (GCP)

Environment isolation

  1. Containers (Docker, Podman, etc.)

CI

  1. Jenkins
  2. Cloud Build
  3. GitHub Actions / GitLab CI

CD

  1. Spinnaker
  2. GitHub Actions / GitLab CI
  3. Azure DevOps
  4. Amazon CodeCatalyst

Model Registry / Metrics tracker

  1. MLflow
  2. Container registries with the model in them

Useful links

https://ml-ops.org/

https://github.com/cdfoundation/sig-mlops

https://pages.navigator.bcg.com/kp/df185690-b5d8-4b3a-bedf-67a92cdec790

 

MLOps

Felipe F. Rocha

rocha.felipe@bcg.com

 

 

THANK YOU!!!

MLOps & DevOps

By Felipe Fonseca Rocha

This deck presents MLOps as an extension of DevOps principles
