Ahcène Boubekki
UCPH, Denmark
Short Resume
Name:
Ahcène Boubekki
Background:
Mathematics
(complex geometry, algebraic geometry...)
Currently:
Post-Doc @ PTB/TU Berlin
PhD:
Analysis of User Behavior
Previously:
Post-Doc @ UCPH/Pioneer Centre for AI
Researcher @ TU Munich
Researcher/PhD @ DIPF/TU Darmstadt (~Frankfurt)
Researcher/PhD @ Leuphana (~Hamburg)
Post-Doc @ UiT Arctic University of Norway
Main Research Topics
1. Representation Learning
2. Explainability/Interpretability
Self-explainable Models
Model Inspection
Learning Representations
Representing for Learning
The mBook Project
Objectives
Evaluate the use of an electronic textbook for history in middle school
Bring new methods to Educational Science
The mBook
The mBook Project
Some statistics
From January 31st to July 11th 2017
2,197 sessions
400 users
195 pupils (537 sessions)
The mBook Project
Classical Approaches
Bi-variate factor analysis
Markov Chains
Contributions
Same Data
Different Models
Different Representations
Content Analysis
Periodic Behaviors
Scrolling Behaviors
Online Behaviors
Is there a correlation between content and motivation?
Do the pupils use the mBook the same way over the week?
Are scrolling patterns correlated with competencies?
What is the influence of the teacher?
Content Analysis
Model: Archetypal Analysis
Repr.: Frequency Vectors
How to use AA instead of factor analysis?
Periodic Behaviors
Model: Mixture of Markov Chains
Repr.: Discrete Sequences
Can we make MMCs temporally aware?
Scrolling Behaviors
Model: Infinite/Bayesian MMC
Repr.: Discrete Sequences
How to do data-driven model selection?
Online Behaviors
Model: Infinite/Bayesian k-means
Repr.: Spatio-temporal Timeseries
Can we study sessions as spatio-temporal trajectories?
Trajectories and Online Behaviors
Is it relevant to study sessions as spatio-temporal trajectories?
* Well-defined ~ invariant to subdivision
distinguishes prefixes
None satisfies all the properties
Let's build one!
Sequence of pages
↓
Path in the page graph
+
Timestamps
+
Metric on the page graph
↓
Spatio-temporal trajectory
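The construction above can be sketched as follows. This is only an illustration under stated assumptions: the page graph is a hypothetical adjacency dictionary, the metric on the graph is taken to be the hop count (number of links on a shortest path), and the names `hop_distances`, `to_trajectory` and `step_lengths` are invented for this example. The measure actually used in the talk is not reproduced here.

```python
from collections import deque

def hop_distances(graph, source):
    """Breadth-first search: number of links from `source` to each reachable page."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        page = queue.popleft()
        for neighbor in graph[page]:
            if neighbor not in distances:
                distances[neighbor] = distances[page] + 1
                queue.append(neighbor)
    return distances

def to_trajectory(pages, timestamps):
    """Pair each visited page with its timestamp: a list of (time, page) points."""
    return list(zip(timestamps, pages))

def step_lengths(trajectory, graph):
    """Graph distance covered between consecutive points of the trajectory."""
    return [hop_distances(graph, a)[b]
            for (_, a), (_, b) in zip(trajectory, trajectory[1:])]

# Hypothetical page graph and session (timestamps in seconds).
graph = {
    "toc": ["ch1", "ch2"],
    "ch1": ["toc", "ch1.1"],
    "ch2": ["toc"],
    "ch1.1": ["ch1"],
}
session = to_trajectory(["toc", "ch1", "ch1.1", "ch2"], [0, 30, 95, 120])
print(step_lengths(session, graph))
```

The timestamps stay attached to every point, so a trajectory measure can compare both where two sessions are in the graph and when.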
Which trajectory measure?
Construction
Our Measure:
Digression about
with shift
without shift
K.V. Olesen, et al. "A Contextually Supported Abnormality Detector for Maritime trajectories." Journal of Marine Science and Engineering (2023)
Trajectories in same cluster tend to start in the same location
What if we allow a temporal shift?
Trajectories and Online Behaviors
Positive Correlation ↔ High avg. ≈ More Freedom
Negative Correlation ↔ Low avg. ≈ Less Freedom
Statistically Significant
Activity Indicator
average distance between one pupil and her classmates
number of pages per minute
number of events per minute
Pupils perform better if they follow the teacher's style/instructions
Teacher A
Teacher B
Deep Clustering as a Unifying Method
Haven't we been doing the same thing?
Neural Networks
Centroids
Neural Networks?
| Model | Objects | Dis/similarity | Clustering |
|---|---|---|---|
| AA | Vectors | Euclidean + kernel | Centroids |
| MMC | Matrices | Probability | Centroids |
| Trajectories | Time series | Temporal similarity | Centroids |
Embedding
Affinity-based Clustering
We cannot make everyone happy!
Problem:
Group super heroes
Objective:
Everyone is happy
Minimize unhappiness
Affinity-based Clustering
Figure: affinity graph with pairwise affinity values between 0.1 and 0.9.
Memory expensive
How many groups?
Where should we cut the graph?
✂
Affinity-based Clustering
Strategy:
Group together those that are clearly similar and treat the rest as noise.
DBSCAN
M. Ester, et al. "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise." KDD (1996)
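A minimal sketch of the density-based strategy, in pure Python. It assumes points are coordinate tuples compared with the Euclidean distance; the parameter names `eps` and `min_pts` follow common usage and the toy data is made up. It is meant to illustrate the idea, not to replace an optimized implementation.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    # Each neighborhood contains the point itself.
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise, may later become a border point
            continue
        cluster += 1  # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is also core: keep expanding
    return labels

# Two dense groups and one isolated point (toy data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
```

The number of groups is not given in advance: it follows from `eps` and `min_pts`, and whatever is not dense enough is left as noise.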
Affinity-based Clustering
Strategy:
Group together those that are clearly similar.
Merge two groups if one member of the first is similar enough to one member of the second.
Stop when no pair is similar enough anymore, or when 3 clusters are formed.
Agglomerative Clustering
Single-linkage
A different merging strategy gives a different linkage,
but it still queries the N×N affinity matrix.
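The single-linkage strategy can be sketched directly on an affinity matrix. This is an illustrative, unoptimized version: the function name `single_linkage`, the two stopping parameters and the toy matrix are assumptions made for this example.

```python
def single_linkage(affinity, n_clusters=1, min_affinity=0.0):
    """Greedily merge the two clusters that share the most similar pair of members.

    Stops when `n_clusters` remain or when the best link falls below `min_affinity`.
    """
    clusters = [[i] for i in range(len(affinity))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: affinity of the most similar cross-cluster pair.
                link = max(affinity[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or link > best[0]:
                    best = (link, a, b)
        link, a, b = best
        if link < min_affinity:
            break  # no pair is "similar enough" anymore
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Toy symmetric affinity matrix for four objects.
affinity = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
print(single_linkage(affinity, n_clusters=2))
print(single_linkage(affinity, min_affinity=0.85))
```

Replacing `max` by `min` or by an average would give complete or average linkage. In every case the full matrix is read at each merge, which is the memory and computation cost raised above.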
Affinity-based Clustering
Remarks:
Which similarity measure?
Euclidean distance is easy
We do not always cluster vectors
Cost of the affinity matrix
Compute over mini-batches
Might repeat computations
Objects don't move!
The decision borders move
Let's make the objects move!
Affinity-based Clustering
Euclidean distance is easy
Compute over mini-batches
Let's make the objects move!
color
shape
What are we actually doing?
We learn a similarity measure
We learn a Kernel!
Feature map
Euclidean
Unknown
How do we guide the learning?
Affinity-based Deep Clustering
Let $X$ be a dataset that we want to cluster using a feature map $\phi$.
We want that in the embedding space:
- similar objects are close to each other,
- dissimilar ones are far from each other.
For each datapoint $x \in X$, we have a set of positive examples $P(x)$
and a set of negative ones $N(x)$.
Triplet Loss
How do we get these sets?
Euclidean norm not good in practice
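One common form of the triplet loss, sketched for a single (anchor, positive, negative) triplet of already embedded points. The `margin` value and the function name are assumptions, and the Euclidean distance is used only for simplicity, which is the choice flagged above as weak in practice.

```python
from math import dist

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the gap between anchor-positive and anchor-negative distances.

    The loss is zero once the negative is at least `margin` farther than the positive.
    """
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)

# Positive close, negative far: nothing left to learn from this triplet.
print(triplet_loss((0, 0), (0, 1), (3, 4)))
# Positive far, negative close: large penalty.
print(triplet_loss((0, 0), (3, 4), (0, 1)))
```

In training, the points would be outputs of the feature map and the loss would be averaged over many triplets drawn from the positive and negative sets.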
InfoNCE
Contrastive Learning
Augmentations are pulled closer
Other instances are pushed away
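A minimal sketch of an InfoNCE-style loss for one anchor, assuming cosine similarity between embeddings and a temperature of 0.1. Both choices, as well as the function names and toy vectors, are assumptions for illustration.

```python
from math import exp, log, sqrt

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Negative log-probability of picking the positive among positive + negatives."""
    pos = exp(cosine(anchor, positive) / temperature)
    neg = sum(exp(cosine(anchor, n) / temperature) for n in negatives)
    return -log(pos / (pos + neg))

anchor = (1.0, 0.0)
# The augmentation of the anchor is nearby; other instances point elsewhere.
aligned = info_nce(anchor, (0.9, 0.1), [(0.0, 1.0), (-1.0, 0.0)])
# Swapped roles: the "positive" is far away and a negative is nearby.
misaligned = info_nce(anchor, (0.0, 1.0), [(0.9, 0.1), (-1.0, 0.0)])
print(aligned < misaligned)
```

Minimizing this quantity pulls the augmentation closer to the anchor and pushes the other instances away, without requiring hand-built sets of positives and negatives.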
Centroid-based Clustering
Strategy:
Choose three representatives.
Group by similarity.
Update the representatives.
Continue until convergence.
k-Medoids
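The alternating strategy above can be sketched as a simple k-medoids loop. Assumptions: points are coordinate tuples compared with the Euclidean distance, the initial representatives are given as indices, and the function names are invented for this example.

```python
from math import dist

def assign(points, medoids):
    """Group by similarity: index of the nearest representative for each point."""
    return [min(range(len(medoids)), key=lambda k: dist(p, points[medoids[k]]))
            for p in points]

def k_medoids(points, medoids, max_iter=100):
    """Alternate assignment and representative update until nothing changes."""
    medoids = list(medoids)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        updated = []
        for k, old in enumerate(medoids):
            members = [i for i, label in enumerate(labels) if label == k]
            if not members:
                updated.append(old)  # keep the representative of an empty group
                continue
            # New representative: the member closest, in total, to its group.
            updated.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members),
            ))
        if updated == medoids:
            break  # convergence
        medoids = updated
    return medoids, assign(points, medoids)

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(k_medoids(points, medoids=[0, 1]))
```

If the update step returned the mean of the members instead of one of them, the same loop would become k-means.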
If the representatives are not necessarily instances: k-means
Can we learn k-means with a neural network?
Motivation
What did you see?
Where did you look?
Where were the eyes?
How many eyes were there?
How to see what a model sees?
RISE
XRAI
GradCAM
LRP
IG
Dingo or Lion
These do not directly answer the question:
what does the model see?
Deep Dream?
What makes it more a tiger than a tiger?
Too slow, impossible to train, not really useful.
Saliency Maps?
What is important for the prediction?
Inconsistent, difficult to read, objective unclear.
Prototypes/Concepts?
How is the neighborhood in the embedding?
Inspection of the embedding, "biased" justification.
Counterfactual?
What should I change to change class?
Tricky to compute, but nice!
How to see what a model sees?
Standard Image Classifier
Encoder
convolutions, pooling, non-linearity, skip-connections, attention, etc.
Classifier
Single linear layer... possibly followed by a softmax
clustering
k=10
k=5
How to see what a model sees?
Seems like déjà vu?
One object at a time
Limited to 3 directions
For K=3, PCA and k-means are similar
k-means provides some hierarchy!
What can we explain?
Explain explanations
IG
LRP
GradCAM
RISE
XRAI
color gradient ~ rank
What can we explain?
Connect Concepts and Semantics
What can we explain?
ECG Explanation
What can we explain?
Annotation Masks
Model trained for binary classification.
All lesions recovered
Can we use NAVE for medical annotations?
You need a well-performing model!
What can we explain?
Inspect Shortcuts Saturation
What else can we do?
Unlearning?
>>> Test set
>>> IMPUTATION: SWAP
>>> Accuracies
All LBL==0 LBL==1
Distribution 176 26.7 73.3
--------------------------------------
Nothing : 26.7 100.0 0.0
Watermark : 100.0 100.0 100.0
Wmk and Imput: 26.7 100.0 0.0
--------------------------------------
What else can we do?
In distribution counterfactuals
Future Research
Theory
Applications
How to use NAVE for Fairness and Trustworthiness?