PV226: Intro to ML

Machine Learning Tasks

Machine Learning

  • Unsupervised Learning
  • Supervised Learning
  • Reinforcement Learning

Unsupervised Learning

  • Clustering: Targeted Marketing, Recommender Systems, Customer Segmentation
  • Dimensionality Reduction: Meaningful Compression, Structure Discovery, Feature Elicitation, Big Data Visualizations
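To make clustering concrete, here is a minimal k-means sketch for customer segmentation, written in plain Python. The customer data, the `kmeans` function and its parameters are illustrative assumptions, not part of the course material; in a real project you would more likely reach for a library implementation.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on a list of (x, y) tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels

# Hypothetical customers: (orders per year, average basket in EUR).
customers = [(2, 15), (3, 18), (1, 12), (40, 95), (42, 110), (38, 100)]
centroids, labels = kmeans(customers, k=2)
```

No labels are given to the algorithm; the two segments (occasional and frequent buyers) emerge from the data alone, which is what makes this an unsupervised task.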

Supervised Learning

  • Classification: Image Classification, Fraud Detection, Customer Retention, Diagnostics, Natural Language Processing
  • Prediction: Advertising Popularity Prediction, Weather Forecast, Marketing Forecast, Life Expectancy Prediction, Population Growth Prediction
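As a small illustration of the prediction branch, the sketch below fits a straight line to labelled examples with ordinary least squares and uses it to predict an unseen value. The numbers and the names `fit_line` and `predict` are made up for the example.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

# Hypothetical history: advertising spend (thousand EUR) vs. units sold.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units = [12.0, 19.0, 31.0, 42.0, 48.0]

slope, intercept = fit_line(spend, units)

def predict(x):
    """Predict units sold for a new spend value using the fitted line."""
    return slope * x + intercept
```

The known outputs (`units`) are what makes this supervised: the model is fitted against answers that were already observed.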

Reinforcement Learning

  • Optimisation
  • Aircraft Wing Modeling
  • Game AI
  • Real Time Decisions
  • Skill Acquisition
  • Learning Tasks
  • Robot Navigation
  • PCB Layout
  • Catalogue Planning
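A toy version of robot navigation can show how learning from rewards works. The sketch below runs tabular Q-learning in a five-cell corridor where only the last cell gives a reward. The environment, constants and function names are assumptions made for illustration; the agent explores uniformly at random, which Q-learning tolerates because it is an off-policy method.

```python
import random

N_STATES = 5          # corridor cells 0..4; the goal is the last cell
GOAL = N_STATES - 1
ACTIONS = (-1, +1)    # step left, step right
ALPHA, GAMMA = 0.5, 0.9

def step(state, action):
    """Deterministic toy environment: walls clamp the move, reward 1 at the goal."""
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0)

def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            a = rng.randrange(2)  # explore uniformly at random
            nxt, reward = step(state, ACTIONS[a])
            # Q-learning update: bootstrap from the best action in the next state.
            target = reward + GAMMA * max(q[nxt])
            q[state][a] += ALPHA * (target - q[state][a])
            state = nxt
    return q

q_table = train()
# Greedy policy for every non-goal cell.
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:GOAL]]
```

Nobody tells the agent that "right" is correct; it infers this from the delayed reward at the end of the corridor.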

Machine Learning Project

Step 1: Collecting Requirements

Warning: Especially in ML projects, clients often have only a rough idea of what they want, and sometimes no idea at all.

Communication is key. Guide them through what is possible.

Ask about their systems and technologies. How will you get the data?

A quick exploration using AutoML is a good start.

Formalize the requirements and agree on a budget.

Never promise high accuracy!

Communication

This is the most critical part of building the client's confidence in you.

Try to get direct contact with people who understand the business. Avoid intermediaries if possible.

Step 2: Understand the business and domain

Step 3: Start prototyping

Get some data.

Never start with a custom ML model.

Very high accuracy percentages are suspicious. Check for data leakage or class imbalance before trusting them.
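One reason a high percentage can mislead is class imbalance. The sketch below, with made-up fraud labels and hypothetical helper names, shows a "model" that always predicts the majority class: it reaches 99% accuracy while catching no fraud at all.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive):
    """Share of truly positive cases that were predicted as positive."""
    predicted_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in predicted_for_positives) / len(predicted_for_positives)

# Hypothetical labels: 990 legitimate transactions, 10 fraudulent ones.
labels = ["ok"] * 990 + ["fraud"] * 10

# Baseline "model": always predict the most common class.
majority = Counter(labels).most_common(1)[0][0]
baseline = [majority] * len(labels)

print(accuracy(labels, baseline))         # 0.99
print(recall(labels, baseline, "fraud"))  # 0.0
```

Comparing every prototype against such a trivial baseline is a cheap way to see whether a reported percentage means anything.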

Jupyter Notebook is your best friend.

Step 4: Production data sources

Start collecting data.

There are different kinds of storage.

Sometimes you have to query APIs.

You have to aggregate data sources.

Target?

Why use file storage?
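Aggregating data sources usually means joining records from different systems on a shared key. The sketch below joins a CSV export with a JSON API response by customer id. Both inputs, the field names and the helper functions are hypothetical stand-ins for whatever the client's systems provide.

```python
import csv
import io
import json

# Stand-ins for two hypothetical sources: a CSV export and a JSON API response.
CSV_EXPORT = "customer_id,orders\n1,12\n2,3\n3,7\n"
API_RESPONSE = '[{"customerId": 1, "churned": false}, {"customerId": 3, "churned": true}]'

def load_orders(text):
    """Parse the CSV export into {customer_id: orders}."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["customer_id"]): int(row["orders"]) for row in reader}

def load_churn(text):
    """Parse the API response into {customer_id: churned}."""
    return {item["customerId"]: item["churned"] for item in json.loads(text)}

def aggregate(orders, churn):
    """Inner join on customer id: keep only rows that have both a feature and a target."""
    return [
        {"customer_id": cid, "orders": orders[cid], "churned": churn[cid]}
        for cid in sorted(orders.keys() & churn.keys())
    ]

rows = aggregate(load_orders(CSV_EXPORT), load_churn(API_RESPONSE))
```

Customer 2 is dropped because the second source has no target value for it; whether to drop or impute such rows is a decision to make with the client.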

Step 5: Build v1

Prototype Fast

Python is a great tool for prototyping

Explore data.

Understand the visualizations available.

This is also something you can sell to your client. Often it brings them high value.

Collect data

Explore

Train

Evaluate

Deploy

From the perspective of the whole system

[Diagram: system overview. Layers: Data Science, Frontend, FE for BE, Backend. Components shown: Data, Classification, GA, Prediction, Infrastructure, Visualizations, Dashboards, Use Case UI, Backoffice, APIs, Orchestrations, Scaling, Aggregation.]

You have v1. Deploy it for internal testing.

Now the hard part begins.

Step 6: Production system

Avoid Python in production systems

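  1. Multithreading in Python is hard (in standard CPython the GIL keeps CPU-bound threads from running in parallel)
  2. It is slow for CPU-bound work
  3. That makes it costly to run
  4. It is a dynamic language, so many errors surface only at runtime
  5. Naming conventions in your APIs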

Use the cloud as much as possible

You must build asynchronous APIs

or use WebSockets
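The idea behind an asynchronous API is that a slow prediction should not block the request: the caller gets a job id at once and asks for the result later. The sketch below shows that pattern with `asyncio` only. The names `submit`, `status` and `run_model`, and the in-memory job store, are assumptions for illustration and not a real web framework.

```python
import asyncio
import uuid

jobs = {}  # job_id -> asyncio.Task; an in-memory stand-in for a real job store

async def run_model(features):
    """Hypothetical slow inference call."""
    await asyncio.sleep(0.05)
    return sum(features) / len(features)

def submit(features):
    """Start the work and return a job id immediately instead of blocking."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = asyncio.create_task(run_model(features))
    return job_id

def status(job_id):
    """Report whether the job is still running or return its result."""
    task = jobs[job_id]
    if not task.done():
        return {"state": "running"}
    return {"state": "done", "result": task.result()}

async def main():
    job_id = submit([1.0, 2.0, 3.0])
    first = status(job_id)            # checked right after submission
    while not jobs[job_id].done():    # a client would poll or be notified
        await asyncio.sleep(0.01)
    return first, status(job_id)

first, final = asyncio.run(main())
```

With WebSockets the polling loop disappears: the server pushes the result to the client once the task completes.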

Use compression
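Prediction payloads tend to be repetitive, so they compress well. A minimal sketch with the standard `gzip` module follows; the payload shape is a made-up example.

```python
import gzip
import json

# Hypothetical prediction payload with repetitive structure.
payload = [{"customer_id": i, "segment": "regular", "score": 0.5} for i in range(1000)]

raw = json.dumps(payload).encode("utf-8")
compressed = gzip.compress(raw)

# The receiver reverses both steps to get the original data back.
restored = json.loads(gzip.decompress(compressed))

print(len(raw), len(compressed))
```

In practice the web server or API gateway usually applies this transparently when the client sends an `Accept-Encoding: gzip` header.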

Find the right deployment model

Step 7: MLOps

Summary

  1. Collect requirements
  2. Understand the business and domain
  3. Prototype
  4. Build data pipelines
  5. Build v1
  6. Build the production system
  7. MLOps

PV226: Intro to ML

By Lukáš Grolig
