This research received funding from the Flemish Government (AI Research Program)
Elia Van Wolputte & Hendrik Blockeel
KU Leuven and Leuven.AI, Belgium
DISCOVERY SCIENCE 2020
A typical ML-model solves this problem:
Given ,
what's ?
Given ,
what's ?
Disadvantage
This is an idealized scenario:
if you do not know the exact task beforehand, this paradigm breaks down.
An ML-model which is 'robust to missing values' solves this problem:
Given ,
what's ?
Without knowing !
ML in industrial contexts
- sensor data
- sometimes, sensors break
- the ML-pipeline should not break because of a faulty sensor
Autocomplete in webforms:
- user fills in fields...
- ... in random order
- suggestions need to be made based on the fields filled in so far
Iterative Approaches
(e.g. MissForest)
Fix missing values. Retrain. Fix missing values again. Repeat until converged.
MERCS
Train once. The multi-directional model can be used to predict any column.
Naive Imputation
Guess the missing values, e.g. substitute the mean or median
Probabilistic Graphical Models
Use probabilistic inference to infer the most likely values for the missing entries, based on the known values.
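As an illustration of the naive imputation strategy listed above, here is a minimal sketch that fills each missing entry with the mean of the observed values in its column. The function name `impute_mean`, the use of `None` to mark a missing value, and the toy table are assumptions made for this example only.

```python
from statistics import mean


def impute_mean(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        # Only the observed (non-missing) entries contribute to the column mean.
        observed = [row[j] for row in rows if row[j] is not None]
        col_means.append(mean(observed))
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy table with two missing entries.
data = [
    [1.0, 10.0],
    [2.0, None],
    [3.0, 30.0],
    [None, 20.0],
]
filled = impute_mean(data)
```

The guess ignores every other value in the row, which is why this kind of imputation is cheap but can be far off when columns are correlated.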
Naive Imputation
Fast but inaccurate.
Probabilistic Graphical Models
Slow, but accurate.
Iterative Approaches (e.g. MissForest)
Accurate, but you need to retrain for every new instance that comes in.
MERCS
Possible alternative to iterative approaches, without retraining.
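The iterative scheme ("fix missing values, retrain, repeat until converged") can be sketched on a two-column table. This is not MissForest itself: a simple least-squares line stands in for the random forest, and the names `fit_line` and `iterative_impute`, the `None` marker, and the toy data are all assumptions of this sketch.

```python
def fit_line(xs, ys):
    """Least-squares fit y = a * x + b for paired lists xs, ys."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my  # degenerate case: fall back to predicting the mean
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def iterative_impute(rows, max_iter=50, tol=1e-9):
    """Two-column iterative imputation: retrain, re-fill, repeat until stable."""
    filled = [list(row) for row in rows]
    # Initial guess: the column mean of the observed values.
    for j in (0, 1):
        observed = [row[j] for row in rows if row[j] is not None]
        col_mean = sum(observed) / len(observed)
        for row in filled:
            if row[j] is None:
                row[j] = col_mean
    for _ in range(max_iter):
        largest_change = 0.0
        for j in (0, 1):
            other = 1 - j
            # Retrain on rows where column j was originally observed.
            train = [i for i, row in enumerate(rows) if row[j] is not None]
            a, b = fit_line([filled[i][other] for i in train],
                            [filled[i][j] for i in train])
            # Fix the missing values of column j with the retrained model.
            for i, row in enumerate(rows):
                if row[j] is None:
                    new_value = a * filled[i][other] + b
                    largest_change = max(largest_change, abs(new_value - filled[i][j]))
                    filled[i][j] = new_value
        if largest_change < tol:
            break
    return filled


# Hypothetical data where the second column is twice the first.
table = [[1.0, 2.0], [2.0, 4.0], [3.0, None], [4.0, 8.0]]
completed = iterative_impute(table)
```

On this toy table the missing entry moves from the column mean towards roughly 6. The loop also shows the cost noted above: every new incomplete instance means running the retraining loop again.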
Compact representation:
Cf. Van Wolputte et al., MERCS: Multi-directional Ensembles of Regression and Classification Trees, AAAI-18
Uni-directional model
e.g. decision tree
Multi-directional model
e.g. MERCS-model
MERCS should handle any query
Training time
- Learn a MERCS-model
- Queries unknown
Prediction time
- Use a MERCS-model
- Often no "perfect" tree for query!
MISMATCH!
- Overcome mismatch
SOLUTION = BETTER PREDICTION STRATEGIES IN MERCS
- 2 MAIN IDEAS: ATTRIBUTE IMPORTANCE AND CHAINING
[Diagram labels: MERCS MODEL, QUERIES]
PROBLEM: Which trees to use?
ASSUMPTION: Trees with many missing inputs are likely to be mistaken.
IDEA: Attribute Importance can quantify this effect.
TRADEOFF: Many trees vs. Good trees
IDEA: Attribute importance is a way to quantify how appropriate a given tree is
DEFINITION: How much does an attribute matter?
Cf. Louppe et al., Understanding variable importances in forests of randomized trees, NeurIPS 2013
...or the original CART manual
CRITERION: How much do the available attributes matter?
[Figure: worked example for a query q, comparing Baseline (RF) with Most-relevant attribute importance. Labels: q, f, c. Values shown: 1.0, 0.8, 0.2. Sums shown: 0 + 0 + 0 = 0, 0.8 + 0 + 0 = 0.8, 0 + 0 + 0.2 = 0.2.]
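The criterion "how much do the available attributes matter?" can be read as summing, per tree, the importances of the input attributes the query actually provides, which matches sums such as 0.8 + 0 + 0 = 0.8 in the worked example. The sketch below follows that reading. The tree names, the importance values per attribute, the query, and the `threshold` parameter are hypothetical and chosen only to reproduce those sums.

```python
def available_importance(importances, known):
    """Sum the importances of the input attributes that the query provides."""
    return sum(weight for attr, weight in importances.items() if attr in known)


# Hypothetical per-tree attribute importances (input attribute -> importance).
trees = {
    "T1": {"A3": 1.0},
    "T2": {"A1": 0.8, "A3": 0.2},
    "T3": {"A2": 0.2, "A3": 0.8},
}

# Hypothetical query: A1 and A2 are known, A3 is missing.
known = {"A1", "A2"}

scores = {name: available_importance(f, known) for name, f in trees.items()}

# Keep only trees whose available importance reaches the threshold.
threshold = 0.5
selected = [name for name, score in scores.items() if score >= threshold]

# The single most relevant tree for this query.
best = max(scores, key=scores.get)
```

Lowering `threshold` admits more trees into the prediction and raising it keeps only the best-informed ones, which is one way to picture the "many trees vs. good trees" tradeoff.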
PROBLEM: Which trees to use?
ASSUMPTION: Missing inputs are also predictable with MERCS itself
IDEA: Chaining of component trees to answer a given query
TRADEOFF: Bottom-up vs. Top-Down
Cf. Read et al., Classifier chains for multi-label classification, ECMLPKDD 2009
Bottom-Up Chaining
Top-Down Chaining
Query:
Use most relevant models, given \(\{A_1, A_2\}\)
Use most relevant models, given \(\{A_1, A_2, A_4\}\)
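The two steps above (predict with the models that fit \(\{A_1, A_2\}\), then again with \(A_4\) added) suggest a loop that grows the set of known attributes until the target can be predicted. The sketch below is one possible reading of that idea, not the MERCS implementation: `Component`, `chain_predict`, and the toy models are hypothetical, and a component counts as usable only when all of its inputs are known, a simplification of the "most relevant models" selection.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Component:
    """Hypothetical component model: fixed inputs, one target, a predict function."""
    inputs: Tuple[str, ...]
    target: str
    predict: Callable[[Dict[str, float]], float]


def chain_predict(components, known, target, max_steps=10) -> Optional[float]:
    """Chain components forward from the known attributes until target is predicted."""
    values = dict(known)
    for _ in range(max_steps):
        if target in values:
            break
        # Components whose inputs are all available and whose target is still unknown.
        usable = [c for c in components
                  if c.target not in values and all(a in values for a in c.inputs)]
        if not usable:
            break  # no component can make progress from the current attributes
        for c in usable:
            values[c.target] = c.predict(values)
    return values.get(target)


# Toy components: A4 is predicted from A1 and A2, A5 is predicted from A4.
components = [
    Component(("A4",), "A5", lambda v: 2.0 * v["A4"]),
    Component(("A1", "A2"), "A4", lambda v: v["A1"] + v["A2"]),
]

answer = chain_predict(components, {"A1": 1.0, "A2": 2.0}, "A5")
unreachable = chain_predict(components, {"A1": 1.0}, "A5")
```

In the first call A4 is predicted in the first pass and A5 in the second. In the second call no component has all of its inputs, so the function returns `None` instead of guessing.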
[Figure legend: baseline, most-relevant, chaining, PGM]
[Diagram: TRAIN and TEST workflow on TRAINING DATA and TEST DATA (= QUERIES)]
MISSFOREST: TRAIN MODEL, FIX MISSING VALUES, TRAIN MODEL AGAIN, FIX MISSING VALUES AGAIN
MISSFOREST: TRAIN and TEST on TRAINING DATA + SINGLE QUERY
MERCS: TRAIN on TRAINING DATA to obtain the MERCS MODEL, then TEST on QUERIES