
Dennis Collaris
Computer Science & Engineering
Explainability of
Machine Learning models
Graduation project
Machine Learning



[Images: three training examples labeled "Cat!" and three labeled "Bunny!"]
Training


→ Cat 🐈
→ Bunny 🐰
Classification
Decision Tree
Flappy ears?
  Yes → Bunny 🐰
  No → Wiggles nose?
    Yes → Bunny 🐰
    No → Cat 🐈
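The tree on this slide can be read as nested if statements. A minimal sketch, assuming the two questions are available as boolean inputs; the function and parameter names are made up for illustration:

```python
def classify_pet(flappy_ears: bool, wiggles_nose: bool) -> str:
    """Walk the slide's decision tree from the root to a leaf."""
    if flappy_ears:        # root question: "Flappy ears?"
        return "Bunny"
    if wiggles_nose:       # second question, asked only on the "No" branch
        return "Bunny"
    return "Cat"


print(classify_pet(flappy_ears=False, wiggles_nose=True))   # Bunny
print(classify_pet(flappy_ears=False, wiggles_nose=False))  # Cat
```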
Use case: Fraud detection
for sick leave insurance

Difficult problem


Insurance policy → Non-fraud
Insurance policy → Fraud
Classification
Decision Tree
Duration illness < 14 days?
  Yes → Non-fraud
  No → Premium percentage < 5%?
    Yes → Non-fraud
    No → Fraud
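The fraud tree can also be stored as data and walked by a generic routine, which is closer to how a learned tree is represented. This is a sketch only: the class names, field names, and dictionary keys are assumptions, while the two thresholds (14 days, 5%) are the ones shown on the slide.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str


@dataclass
class Node:
    feature: str                         # key to look up in the policy
    threshold: float
    if_below: Union["Node", Leaf]        # taken when value < threshold ("Yes")
    if_not_below: Union["Node", Leaf]    # taken otherwise ("No")


# The tree from the slide, written out by hand (not learned from data).
FRAUD_TREE = Node(
    "duration_illness_days", 14,
    Leaf("Non-fraud"),
    Node("premium_percentage", 5, Leaf("Non-fraud"), Leaf("Fraud")),
)


def predict(tree: Union[Node, Leaf], policy: dict) -> str:
    """Follow the branches until a leaf is reached."""
    node = tree
    while isinstance(node, Node):
        below = policy[node.feature] < node.threshold
        node = node.if_below if below else node.if_not_below
    return node.label


print(predict(FRAUD_TREE, {"duration_illness_days": 30, "premium_percentage": 8}))
```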

Random Forest



Random Forest Ensemble
7,582,365 decisions!
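A random forest combines many trees by letting them vote, which is why the number of decisions grows so quickly. The sketch below uses three tiny hand-written trees instead of learned ones; all thresholds except "14 days" and "5%" are invented for illustration, and a real forest would train each tree on a random sample of the data.

```python
from collections import Counter


# Three hypothetical trees, each looking at the policy slightly differently.
def tree_a(policy: dict) -> str:
    risky = policy["duration_illness_days"] >= 14 and policy["premium_percentage"] >= 5
    return "Fraud" if risky else "Non-fraud"


def tree_b(policy: dict) -> str:
    return "Fraud" if policy["duration_illness_days"] >= 200 else "Non-fraud"


def tree_c(policy: dict) -> str:
    return "Fraud" if policy["premium_percentage"] >= 7 else "Non-fraud"


def forest_predict(trees, policy: dict):
    """Majority vote over all trees; also return the vote counts."""
    votes = Counter(tree(policy) for tree in trees)
    winner = votes.most_common(1)[0][0]
    return winner, dict(votes)


label, votes = forest_predict(
    [tree_a, tree_b, tree_c],
    {"duration_illness_days": 250, "premium_percentage": 6},
)
print(label, votes)
```

With an odd number of trees and two classes there is never a tie, which keeps the vote unambiguous.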
Main question:
"How can we analyze the choices the model makes for fraud detection of a specific case, and display them in a comprehensible manner?"

Insurance policy → Fraud
Why?
Insurance policy → Fraud + Explanations
Currently: Black box
Policy → Model → Fraud / Non-fraud
Goal: White box
Policy → Model → Fraud / Non-fraud
Because:
Duration illness > 200 days
Premium rate > 5%
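The "Because" explanation on the white-box slide can be sketched as a model that returns its reasons together with its verdict. This is a toy rule, not the thesis's method: it assumes a policy is flagged only when both conditions from the slide hold, and the dictionary keys are hypothetical.

```python
def explain(policy: dict):
    """Return (label, reasons) for one policy under an assumed two-condition rule."""
    checks = [
        ("Duration illness > 200 days", policy["duration_illness_days"] > 200),
        ("Premium rate > 5%", policy["premium_rate_percent"] > 5),
    ]
    if all(holds for _, holds in checks):
        # Every condition held: report them all as the reasons for "Fraud".
        return "Fraud", [text for text, _ in checks]
    # Otherwise report which conditions failed.
    return "Non-fraud", [f"not ({text})" for text, holds in checks if not holds]


label, reasons = explain({"duration_illness_days": 250, "premium_rate_percent": 6})
print(label, "because", reasons)
```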
Global / Local
Literature
Structural visualization
Model simplification
Feature analysis

Feature importance
Feature interaction
Sensitivity analysis
Feature importance
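The slides do not say which importance measure is used. One common choice is permutation importance: shuffle one feature and measure how much the model's accuracy drops. The sketch below assumes that technique; the black-box rule, the feature names, and the six example policies are all made up.

```python
import random


def black_box(policy: dict) -> str:
    # Hypothetical stand-in for the real model; it ignores policy_age_years.
    risky = policy["duration_illness_days"] > 200 and policy["premium_rate_percent"] > 5
    return "Fraud" if risky else "Non-fraud"


def accuracy(model, rows, labels) -> float:
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, feature, repeats=20, seed=0) -> float:
    """Average accuracy drop after shuffling one feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        values = [r[feature] for r in rows]
        rng.shuffle(values)
        shuffled = [{**r, feature: v} for r, v in zip(rows, values)]
        drops.append(baseline - accuracy(model, shuffled, labels))
    return sum(drops) / repeats


# Invented example data; labels are taken from the stand-in model itself.
rows = [
    {"duration_illness_days": d, "premium_rate_percent": p, "policy_age_years": a}
    for d, p, a in [(10, 3, 1), (50, 8, 4), (120, 6, 2), (210, 7, 9), (250, 4, 3), (300, 9, 6)]
]
labels = [black_box(r) for r in rows]

for name in rows[0]:
    print(name, permutation_importance(black_box, rows, labels, name))
```

A feature the model never reads, such as `policy_age_years` here, always scores exactly zero.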




Sensitivity analysis
Policy | Duration illness | Premium rate |
---|---|---|
ANP128 | days | 5% |
[Plot: "Fraud?" (0 to 1, labeled Non-fraud and Fraud) against "Duration illness" (0 to 300 days)]
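Sensitivity analysis, as on this slide, keeps one policy fixed and sweeps a single feature to see where the prediction flips. A minimal sketch, assuming a hypothetical stand-in model: only the policy ID, the 5% premium rate, and the 0 to 300 day range come from the slide.

```python
def black_box(policy: dict) -> str:
    # Hypothetical stand-in for the real fraud model.
    risky = policy["duration_illness_days"] > 200 and policy["premium_rate_percent"] >= 5
    return "Fraud" if risky else "Non-fraud"


def sweep(model, policy: dict, feature: str, values):
    """Re-run the model with one feature replaced by each value in turn."""
    return {v: model({**policy, feature: v}) for v in values}


# Policy ANP128; its actual illness duration is not given on the slide,
# so the starting value here is a placeholder that the sweep overwrites.
anp128 = {"duration_illness_days": 0, "premium_rate_percent": 5}
result = sweep(black_box, anp128, "duration_illness_days", range(0, 301, 50))

for days, label in result.items():
    print(f"{days:>3} days -> {label}")
```

Plotting `result` gives a step curve like the one on the slide, with the step at the model's decision threshold.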
Model simplification
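The slides do not specify the simplification method. One possible approach is a surrogate: fit a much simpler model to the black box's own predictions and report how often the two agree (fidelity). The sketch below fits a single-threshold rule; the black-box rule, feature names, and data grid are invented for illustration.

```python
def black_box(policy: dict) -> str:
    # Hypothetical stand-in for the complex model being simplified.
    risky = policy["duration_illness_days"] > 200 and policy["premium_rate_percent"] > 5
    return "Fraud" if risky else "Non-fraud"


def fit_stump(rows, targets, features):
    """Find the rule 'feature > threshold -> Fraud' that best mimics the targets."""
    best = None
    for feature in features:
        for threshold in sorted({r[feature] for r in rows}):
            predicted = ["Fraud" if r[feature] > threshold else "Non-fraud" for r in rows]
            fidelity = sum(p == t for p, t in zip(predicted, targets)) / len(rows)
            if best is None or fidelity > best[2]:
                best = (feature, threshold, fidelity)
    return best


# Invented grid of policies; targets are the black box's predictions, not true labels.
rows = [
    {"duration_illness_days": d, "premium_rate_percent": p}
    for d in (50, 150, 250, 300)
    for p in (3, 6, 8)
]
targets = [black_box(r) for r in rows]

feature, threshold, fidelity = fit_stump(
    rows, targets, ["duration_illness_days", "premium_rate_percent"]
)
print(f"{feature} > {threshold} -> Fraud (fidelity {fidelity:.2f})")
```

The simple rule is easier to read than the original model, at the cost of disagreeing with it on some policies.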
Dashboards
Questions?
Thesis presentation (second backup)
By iamdecode