Dennis Collaris

Computer Science & Engineering

Explainability of
Machine Learning models

Graduation project

Machine Learning

Cat!

Cat!

Cat!

Bunny!

Bunny!

Bunny!

Training

  Cat 🐈

  Bunny 🐰

Classification

Decision Tree

Flappy ears?
  Yes: Bunny 🐰
  No: Wiggles nose?
    Yes: Bunny 🐰
    No: Cat 🐈
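The tree on this slide can be read as two nested yes/no questions. A minimal sketch, assuming each animal is described by two boolean features (the function and parameter names are made up for illustration):

```python
def classify_pet(flappy_ears: bool, wiggles_nose: bool) -> str:
    """Walk the two-question tree from the slide and return the leaf label."""
    if flappy_ears:
        return "Bunny"
    if wiggles_nose:
        return "Bunny"
    return "Cat"


# An animal without flappy ears that still wiggles its nose ends up in the
# second "Bunny" leaf.
print(classify_pet(flappy_ears=False, wiggles_nose=True))
```

Each prediction follows exactly one path from the root to a leaf, which is what makes a single small tree easy to read.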

Use case: Fraud detection for sick leave insurance

Difficult problem

  Non-fraud

Insurance policy

Insurance policy

  Fraud

Classification

Duration illness < 14 days?
  Yes: Non-fraud
  No: Premium percentage < 5%?
    Yes: Non-fraud
    No: Fraud

Decision Tree

Random Forest

Random Forest Ensemble

7,582,365 decisions!
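A random forest combines the votes of many decision trees, so the number of individual decisions grows with every tree added. The sketch below is a toy illustration of that idea, not the model from the project: the tuple-based tree encoding and the three trees are assumptions, and only the first tree mirrors the fraud tree shown earlier.

```python
from collections import Counter

# A tree is either a leaf label (str) or a tuple:
# (feature name, threshold, subtree if value < threshold, subtree otherwise).
# The second and third trees are made up for illustration.
TREES = [
    ("duration_illness", 14, "Non-fraud",
     ("premium_percentage", 5, "Non-fraud", "Fraud")),
    ("premium_percentage", 5, "Non-fraud", "Fraud"),
    ("duration_illness", 200, "Non-fraud", "Fraud"),
]


def predict_tree(tree, policy):
    """Follow one path from the root down to a leaf label."""
    while not isinstance(tree, str):
        feature, threshold, below, otherwise = tree
        tree = below if policy[feature] < threshold else otherwise
    return tree


def predict_forest(trees, policy):
    """Return the label that receives the most votes across all trees."""
    votes = Counter(predict_tree(tree, policy) for tree in trees)
    return votes.most_common(1)[0][0]


def count_decisions(tree):
    """Count the internal (question) nodes of a tree."""
    if isinstance(tree, str):
        return 0
    _, _, below, otherwise = tree
    return 1 + count_decisions(below) + count_decisions(otherwise)


policy = {"duration_illness": 250, "premium_percentage": 6}
print(predict_forest(TREES, policy))
print(sum(count_decisions(tree) for tree in TREES))
```

With three tiny trees the forest already holds four decisions; a forest with many deep trees reaches counts like the one on the slide, which is why it can no longer be read by eye.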

Main question:

"How can we analyze the choices the model makes for fraud detection of a specific case, and display them in a comprehensible manner?"

Insurance policy

  Fraud

Explanations

+

Why?

  Fraud

Currently: Black box

Policy

Fraud

Model

Non-fraud

Policy

Fraud

Model

Non-fraud

Because...

Duration illness > 200 days

Premium rate > 5%
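For a single decision tree, a "because" explanation of this kind can be produced by recording the tests passed on the way to the leaf. The sketch below is only an illustration under assumptions: it reuses the thresholds from the fraud decision tree slide (14 days, 5%) rather than the values on this slide, and the function name is hypothetical.

```python
def explain(duration_illness: float, premium_rate: float):
    """Return (label, reasons) for the two-question fraud tree."""
    reasons = []
    if duration_illness < 14:
        reasons.append("Duration illness < 14 days")
        return "Non-fraud", reasons
    reasons.append("Duration illness >= 14 days")
    if premium_rate < 5:
        reasons.append("Premium rate < 5%")
        return "Non-fraud", reasons
    reasons.append("Premium rate >= 5%")
    return "Fraud", reasons


label, reasons = explain(duration_illness=250, premium_rate=6)
print(label, "because", " and ".join(reasons))
```

This works directly only for a single tree; for an ensemble there is no single path to report, which motivates the techniques that follow.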

Goal: White box

Global / Local

Literature

Structural visualization

Model simplification

Feature analysis

Feature importance

Feature interaction

Sensitivity analysis

Feature importance
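The slides do not say which importance measure is used. One common option is permutation importance: shuffle the values of one feature and see how much the model's accuracy drops. The sketch below assumes that technique; the stand-in model, the extra "age" feature, and the data are all made up for illustration.

```python
import random


def model(row):
    # Hypothetical stand-in for the fraud model; it ignores "age" on purpose.
    if row["duration_illness"] >= 14 and row["premium_rate"] >= 5:
        return "Fraud"
    return "Non-fraud"


def accuracy(rows, labels):
    """Fraction of rows for which the model output matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(rows, labels, feature, seed=0):
    """Accuracy drop after shuffling one feature's values across rows."""
    rng = random.Random(seed)
    shuffled = [r[feature] for r in rows]
    rng.shuffle(shuffled)
    permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
    return accuracy(rows, labels) - accuracy(permuted, labels)


# Made-up policies: (duration illness in days, premium rate in %, age).
rows = [
    {"duration_illness": d, "premium_rate": p, "age": a}
    for d, p, a in [(3, 2, 30), (250, 6, 41), (20, 8, 55),
                    (10, 7, 23), (300, 3, 38), (90, 9, 47)]
]
labels = [model(r) for r in rows]

for feature in ("duration_illness", "premium_rate", "age"):
    print(feature, permutation_importance(rows, labels, feature))
```

A feature the model never looks at, such as "age" here, always scores zero, because shuffling it cannot change any prediction.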

Sensitivity analysis

Policy    Duration illness                       Premium rate
ANP128    300, 250, 200, 150, 100, 50, 1 days    5%

[Plot: "Fraud?" (0 to 1) against "Duration illness" (0 to 300 days); outcomes labeled Fraud and Non-fraud]
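The idea on this slide, varying one feature of a single policy while holding the rest fixed and watching the prediction, can be sketched as follows. The scoring function is a hypothetical stand-in for the real model, and the starting duration of 100 days is made up; only the 5% premium rate comes from the slide.

```python
def fraud_score(duration_illness: float, premium_rate: float) -> float:
    # Hypothetical stand-in model returning a score in [0, 1].
    return 1.0 if duration_illness >= 14 and premium_rate >= 5 else 0.0


def sensitivity(policy: dict, feature: str, values):
    """Re-score the policy once per candidate value of a single feature."""
    return [(v, fraud_score(**{**policy, feature: v})) for v in values]


policy = {"duration_illness": 100, "premium_rate": 5}
for value, score in sensitivity(policy, "duration_illness", range(0, 301, 50)):
    print(value, "Fraud" if score >= 0.5 else "Non-fraud")
```

Plotting the score against the varied value gives a curve like the one on the slide, showing where the prediction flips between Non-fraud and Fraud.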

Model simplification
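The slides do not spell out how the model is simplified. One possible reading is a surrogate model: fit a much simpler model to the predictions of the complex one and check how often the two agree. The sketch below assumes that approach, using a single-threshold rule as the surrogate; the black-box function is made up for illustration.

```python
def black_box(duration_illness: float) -> str:
    # Hypothetical stand-in for a complex model (made up for illustration).
    if duration_illness > 200 or 120 < duration_illness < 130:
        return "Fraud"
    return "Non-fraud"


def fit_stump(xs, ys):
    """Pick the threshold t so that 'Fraud if x > t' agrees most often with ys."""
    best_t, best_agree = None, -1
    for t in sorted(set(xs)):
        agree = sum(
            ("Fraud" if x > t else "Non-fraud") == y for x, y in zip(xs, ys)
        )
        if agree > best_agree:
            best_t, best_agree = t, agree
    return best_t, best_agree / len(xs)


xs = list(range(0, 301, 5))          # probe the black box on a grid of durations
ys = [black_box(x) for x in xs]      # labels come from the model, not the truth
threshold, fidelity = fit_stump(xs, ys)
print(f"Fraud if duration illness > {threshold} (fidelity {fidelity:.2f})")
```

The fidelity figure states how faithfully the simple rule mimics the complex model; the small exception around 125 days is lost in the simplification, which is the trade-off this family of techniques makes.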

Dashboards

Questions?
