Machine Learning in real life

From old buzzwords to new buzzwords

Nastasia Saby

@saby_nastasia

Blog: https://mlinreallife.github.io/

Konecranes

#ML, #Craft, #Production


A lot of buzzwords that are only the tip of the iceberg.

Current buzzwords:

Big Data
Artificial intelligence
Deep learning
Artificial neural networks

Are they still valid?

Big Data

Large volumes, diverse data

Machine learning can work with small data

Transfer learning reduces the need for big data

Then, is Big data still a valid buzzword?

It depends on the projects

Sometimes very important

Sometimes not so important

When it's important, be prepared

Big data is hard

Artificial intelligence

Replacing a cognitive process with a program

2 kinds of AI: Strong and weak

Strong: when you're able to fully imitate a human

#Science fiction.

Weak or specialised: when you're able to imitate a defined cognitive process

Example of weak AI: predict breakdowns for buses

Then, is Artificial intelligence still a valid buzzword?

Weak artificial intelligence YES

Strong artificial intelligence is overrated and more for literature and cinema.

#semi-overrated

Deep learning

When you have a model with multiple layers

Example

3 hidden layers in an artificial neural network
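To make the "3 hidden layers" example concrete, here is a minimal sketch of a forward pass through such a network in plain Python. The layer sizes, the random weights, the ReLU and sigmoid activations and the input values are all illustrative assumptions, not something taken from the talk, and no training is shown.

```python
import math
import random


def relu(values):
    """Apply the ReLU activation element-wise."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: each neuron computes a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


def init_layer(n_inputs, n_outputs, rng):
    """Small random weights and zero biases (illustrative initialisation only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_outputs)]
    biases = [0.0] * n_outputs
    return weights, biases


def forward(inputs, layers):
    """Run the inputs through every hidden layer (ReLU), then a sigmoid output layer."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    logits = dense(activations, weights, biases)
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


rng = random.Random(0)
# Assumed sizes: 4 inputs, 3 hidden layers of 8 neurons each, 1 output.
layer_sizes = [4, 8, 8, 8, 1]
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

prediction = forward([0.5, -1.2, 3.0, 0.1], layers)
print(prediction)  # a single value between 0 and 1
```

The network has four weight matrices because the output layer sits after the three hidden ones.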

Deep Learning is useful for unstructured data, for example in computer vision and NLP

Sometimes, Deep Learning is even useful for structured data such as tables

You don't always need it

Shallow learning is often enough

Then, is Deep Learning still a valid buzzword?

Yes, but for many projects, you don't need it

No need to be an expert in DL to work in data science

#semi-overrated

Artificial neural networks

A metaphor that refers to the way a brain works

Neurons that are linked together to transform inputs into outputs

Deep learning is not always a neural network

You can have different classifiers and do an aggregation
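As a rough illustration of aggregating different classifiers, the sketch below combines three simple classifiers with a majority vote. The three rule-based classifiers, their thresholds and the field names are hypothetical and only reuse the bus breakdown example from earlier in the talk.

```python
from collections import Counter


def majority_vote(classifiers, sample):
    """Ask every classifier for a label and return the most common answer."""
    votes = [classify(sample) for classify in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical rule-based classifiers; thresholds are made up for illustration.
def by_engine_temperature(bus):
    return "breakdown" if bus["engine_temp_c"] > 110 else "ok"


def by_mileage(bus):
    return "breakdown" if bus["km_since_service"] > 30000 else "ok"


def by_vibration(bus):
    return "breakdown" if bus["vibration_level"] > 0.8 else "ok"


classifiers = [by_engine_temperature, by_mileage, by_vibration]
bus = {"engine_temp_c": 118, "km_since_service": 12000, "vibration_level": 0.9}

# Two of the three classifiers vote "breakdown", so the aggregate says "breakdown".
print(majority_vote(classifiers, bus))
```

In practice the individual classifiers would be trained models rather than hand-written rules; the voting step stays the same.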

Then, are Artificial neural networks still a valid buzzword?

Yes, but in many companies, you don't need it

No need to be an expert to work in data science

#semi-overrated

Current buzzwords:

Big Data
Artificial intelligence
Deep learning
Artificial neural networks

Are they still valid?

They are still valid, but some of them are overrated

They don't represent the reality of the majority of data science projects

When you start in data science, you expect to do 20% deep learning and 80% shallow learning

Truth = 10% machine learning, 90% data cleaning, infrastructure, etc.

ML Code is small.

But ML Code has a big influence on the whole process: monitoring, data collection, etc.

Steps of a project:

1. Ingestion

2. Data cleaning

3. Feature engineering

4. Model

5. Validation

6. Deployment

7. Monitoring

Ingestion

Schema with different sources, different formats
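A minimal sketch of what ingestion can look like when two sources deliver the same kind of data in different formats. The CSV and JSON payloads, the field names and the common schema below are assumptions made up for this example.

```python
import csv
import io
import json

# Hypothetical raw payloads; in a real project they would come from files, APIs or queues.
CSV_SOURCE = (
    "bus_id,recorded_at,engine_temp_c\n"
    "12,2020-03-01T08:00:00,92.5\n"
    "15,2020-03-01T08:05:00,101.0\n"
)
JSON_SOURCE = '[{"busId": 12, "timestamp": "2020-03-01T09:00:00", "engineTemp": 95.1}]'


def from_csv(text):
    """Map CSV rows to the common schema, converting strings to typed values."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "bus_id": int(row["bus_id"]),
            "recorded_at": row["recorded_at"],
            "engine_temp_c": float(row["engine_temp_c"]),
        }


def from_json(text):
    """Map JSON objects, which use different field names, to the same schema."""
    for item in json.loads(text):
        yield {
            "bus_id": int(item["busId"]),
            "recorded_at": item["timestamp"],
            "engine_temp_c": float(item["engineTemp"]),
        }


# After ingestion every record has the same shape, whatever its origin.
records = list(from_csv(CSV_SOURCE)) + list(from_json(JSON_SOURCE))
print(records)
```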

Data cleaning

Filter, Imputation
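The two operations named above can be sketched as follows: first filter out implausible readings, then impute missing values with the mean of the remaining ones. The sample readings, the plausibility range and the choice of mean imputation are assumptions for illustration; other strategies (median, model-based) are equally possible.

```python
from statistics import mean

# Hypothetical sensor readings with one missing value and one sensor error code.
readings = [
    {"bus_id": 12, "engine_temp_c": 92.5},
    {"bus_id": 12, "engine_temp_c": None},
    {"bus_id": 15, "engine_temp_c": 101.0},
    {"bus_id": 15, "engine_temp_c": -999.0},
]


def is_plausible(reading):
    """Keep missing values (handled later) and values inside an assumed physical range."""
    value = reading["engine_temp_c"]
    return value is None or -40.0 <= value <= 150.0


def impute_mean(rows, column):
    """Replace missing values with the mean of the known ones.

    Raises statistics.StatisticsError if no known value is available.
    """
    fill_value = mean(row[column] for row in rows if row[column] is not None)
    return [
        {**row, column: fill_value if row[column] is None else row[column]}
        for row in rows
    ]


filtered = [r for r in readings if is_plausible(r)]  # filter
cleaned = impute_mean(filtered, "engine_temp_c")     # imputation
print(cleaned)
```

Filtering before imputing matters here: otherwise the -999.0 error code would distort the mean used to fill the gap.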

Feature engineering

Extract values from data

Join, enrich, computation
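Join, enrich and computation can be sketched on the bus example like this. The fleet table, the field names and the chosen features (age, maximum and mean temperature) are hypothetical; they only show the kind of transformation involved.

```python
from datetime import date

# Hypothetical cleaned readings and a hypothetical fleet reference table.
readings = [
    {"bus_id": 12, "engine_temp_c": 92.5},
    {"bus_id": 12, "engine_temp_c": 104.5},
    {"bus_id": 15, "engine_temp_c": 101.0},
]
fleet = {
    12: {"model": "A", "in_service_since": date(2012, 5, 1)},
    15: {"model": "B", "in_service_since": date(2018, 9, 1)},
}


def build_features(readings, fleet, today):
    """Produce one feature row per bus from raw readings and fleet metadata."""
    temps_by_bus = {}
    for reading in readings:
        temps_by_bus.setdefault(reading["bus_id"], []).append(reading["engine_temp_c"])

    features = []
    for bus_id, temps in sorted(temps_by_bus.items()):
        bus = fleet[bus_id]  # join on bus_id
        features.append({
            "bus_id": bus_id,
            "model": bus["model"],  # enrich with reference data
            # computation: approximate age in whole years
            "age_years": (today - bus["in_service_since"]).days // 365,
            "max_engine_temp_c": max(temps),
            "mean_engine_temp_c": sum(temps) / len(temps),
        })
    return features


features = build_features(readings, fleet, today=date(2020, 5, 1))
print(features)
```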

Model

Choose the best one

Validation

Offline: Test and train datasets

Online: Real usage, AB Tests, Canary Testing

Which metric: accuracy, precision, a customised one?
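The choice of metric can change the verdict on the same predictions. The sketch below computes accuracy, precision and a customised cost metric on a small set of made-up labels (1 = breakdown, 0 = no breakdown). The cost values, 10 for a missed breakdown and 1 for a false alarm, are assumptions chosen only to show how a business-specific metric can be built.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = breakdown)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Share of all predictions that are correct."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(y_true, y_pred):
    """Share of predicted breakdowns that were real breakdowns."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def business_cost(y_true, y_pred, missed_cost=10, false_alarm_cost=1):
    """Customised metric: weight each kind of error by an assumed cost."""
    _, fp, fn, _ = confusion_counts(y_true, y_pred)
    return fn * missed_cost + fp * false_alarm_cost


# Made-up labels and predictions for eight buses.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]

print(accuracy(y_true, y_pred))       # 0.75
print(precision(y_true, y_pred))      # 2 correct out of 3 predicted breakdowns
print(business_cost(y_true, y_pred))  # 11: one missed breakdown plus one false alarm
```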

Deployment

Automation, API, dashboard with data visualisation, or integration into another product

Monitoring

Classical monitoring + specific monitoring for deep learning

New buzzwords for me:

Feature engineering

Interpretability

Data Drift

Interpretability

The ability to understand the decisions of your model

If you want to differentiate dogs from wolves, be careful that the model is not learning according to the environment of the animal.

Grass = dog

Snow = wolf
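One way to catch this kind of shortcut is permutation importance: shuffle one feature and see how much the accuracy drops. The sketch below is a strongly simplified assumption: a real dog/wolf model works on images, while here each animal is reduced to two hypothetical tabular features ("background" and "snout"), and the "model" is a hand-written rule that only looks at the background.

```python
import random


def toy_model(animal):
    """A deliberately flawed model: it only looks at the background."""
    return "wolf" if animal["background"] == "snow" else "dog"


def accuracy(model, samples):
    """Share of samples for which the model predicts the right label."""
    return sum(model(s) == s["label"] for s in samples) / len(samples)


def permutation_importance(model, samples, feature, repeats=20, seed=0):
    """Average accuracy drop when one feature column is shuffled across samples."""
    rng = random.Random(seed)
    baseline = accuracy(model, samples)
    drops = []
    for _ in range(repeats):
        column = [s[feature] for s in samples]
        rng.shuffle(column)
        shuffled = [{**s, feature: value} for s, value in zip(samples, column)]
        drops.append(baseline - accuracy(model, shuffled))
    return sum(drops) / repeats


# Made-up samples: every dog is on grass and every wolf is on snow.
samples = [
    {"background": "grass", "snout": "short", "label": "dog"},
    {"background": "grass", "snout": "short", "label": "dog"},
    {"background": "grass", "snout": "long", "label": "dog"},
    {"background": "grass", "snout": "short", "label": "dog"},
    {"background": "snow", "snout": "long", "label": "wolf"},
    {"background": "snow", "snout": "long", "label": "wolf"},
    {"background": "snow", "snout": "short", "label": "wolf"},
    {"background": "snow", "snout": "long", "label": "wolf"},
]

for feature in ("background", "snout"):
    print(feature, permutation_importance(toy_model, samples, feature))
```

A large drop for "background" and no drop for "snout" indicates that the model decides from the environment, not from the animal.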

Interpretability, what for?

Ethics

Trust

Marketing

Debug

Interpretability is sometimes not enough

Example of breakdowns of buses:

Investigating a crime that has not happened yet is hard

You need more than a prediction

In this case you need explainability:

the potential root cause of the breakdown for instance

Data Drift:

Data changes all the time, and this can impact the performance of your model

[Figure: cinema and toilet paper sales, before the lockdown vs. during the lockdown]
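A shift like the one between sales before and during the lockdown can be detected by comparing the distribution of a feature in a reference period with its distribution in the current period. The sketch below uses the Population Stability Index (PSI), one common heuristic among several and not necessarily what was used in the talk. The weekly sales numbers, the bin edges and the 0.25 alert threshold (a frequently quoted rule of thumb) are assumptions for illustration.

```python
import math


def psi(reference, current, bin_edges, epsilon=1e-6):
    """Population Stability Index between two samples of one numeric feature.

    Values are bucketed with the given bin edges; empty buckets are floored
    at epsilon so the logarithm stays defined.
    """
    def proportions(values):
        counts = [0] * (len(bin_edges) + 1)
        for value in values:
            index = sum(1 for edge in bin_edges if value >= edge)
            counts[index] += 1
        return [max(count / len(values), epsilon) for count in counts]

    ref = proportions(reference)
    cur = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Made-up weekly sales of one product, before and during a lockdown.
sales_before = [100, 105, 98, 102, 110, 95, 101, 99]
sales_during = [180, 210, 195, 220, 205, 190, 215, 200]
bin_edges = [90, 120, 150, 180]

score = psi(sales_before, sales_during, bin_edges)
print(score)
if score > 0.25:  # assumed alert threshold
    print("Significant drift: the model may need retraining")
```

Comparing a sample with itself gives a PSI of 0, which makes the score easy to read as a distance from the reference period.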

Thank you

Any questions?

