Under the Hood of Black-Box Models
Ahcène Boubekki
UCPH
Pioneer Centre
Example

What did you see?
Where did you look?
Where were the eyes?
How many eyes were there?

How would you explain this image using 5 concepts?
How would you decompose this image?
© Jagdeep Rajput
How to see what a model sees?
Lightweight? Interpretable?
How to see what a model sees?


RISE

XRAI

GradCAM

LRP

IG





Dingo or Lion
Deep Dream?
What makes it more a tiger than a tiger?
Too slow, impossible to train, not really useful.
Saliency Maps?
What is important for the prediction?
Inconsistent, difficult to read, objective unclear.
Prototypes/Concepts?
How is the neighborhood in the embedding?
Inspection of the embedding, "biased" justification.
Counterfactual?
What should I change to change class?
Tricky to compute, but nice!
These do not directly answer: what does the model see?
Tambako The Jaguar
Charles James Sharp
AI generated@FreePik
© Disney
How to see what a model sees?
Standard Image Classifier

Encoder
convolutions, pooling, non-linearity, skip-connections, attention, etc.
Classifier
Single linear layer, possibly followed by a softmax
k-means
k=5








k=20
Same Color
Same Representation
Same Cluster
Same Meaning

k=10
k-means
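The clustering step shown on these slides can be sketched as follows: each spatial position of the encoder output is treated as a feature vector, the vectors are grouped with k-means, and the cluster index of each position is what gets drawn as a color. This is a minimal pure-Python sketch under assumptions: the toy `feature_map`, the function name `kmeans_segment`, and the deterministic farthest-point initialization are illustrative choices, not the implementation used in the talk.

```python
import math


def kmeans_segment(feature_map, k, n_iter=20):
    """Cluster the feature vectors of an H x W feature map with plain k-means.

    feature_map: list of rows, each row a list of D-dimensional vectors.
    Returns an H x W grid of cluster indices (a segmentation map).
    """
    width = len(feature_map[0])
    height = len(feature_map)
    points = [vec for row in feature_map for vec in row]

    # Deterministic farthest-point initialization (an illustrative assumption).
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(farthest))

    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center for every feature vector.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]

    return [labels[r * width:(r + 1) * width] for r in range(height)]


# Toy 2 x 4 "feature map" with 2-D features: left and right halves differ.
feature_map = [
    [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]],
    [[0.1, 0.1], [0.0, 0.0], [5.0, 5.0], [5.2, 5.1]],
]
segmentation = kmeans_segment(feature_map, k=2)
```

Positions that share a cluster index share a nearby representation, which is the sense of "Same Color, Same Representation, Same Cluster" on the slide. A real pipeline would run this on the encoder's actual feature map with an array library.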
How to see what a model sees?
Seems like a déjà-vu?





One object at a time
Limited to 3 directions
For K=3, PCA and k-means are similar
K-means provides some hierarchy!
Siméoni, Oriane, et al. "Dinov3." arXiv preprint arXiv:2508.10104 (2025).
Deng, Jia, et al. "Imagenet: A large-scale hierarchical image database." 2009 IEEE CVPR, 2009.
What can we do with it?
Connect Explanations and Semantics

Replace class-wise explanations



Wah, Catherine, et al. "The caltech-ucsd birds-200-2011 dataset." (2011).
Concept Extraction
Concept Explanation: Manually?






Edge Effect
Larger Than Expected

Concept Explanation: NAVE + CAM
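As a hedged illustration of how clusters and CAM could be combined: with a single linear classifier, the class activation map at a position is the dot product between that position's feature vector and the weights of one class. Averaging the map over each cluster then gives one relevance value per cluster. The per-cluster averaging, the function names, and the toy numbers below are assumptions made for illustration; the slide does not state how the two are combined.

```python
def class_activation_map(features, class_weights):
    """CAM score per position: feature vector dotted with one class's weights."""
    return [sum(w * x for w, x in zip(class_weights, f)) for f in features]


def cluster_relevance(cam, labels):
    """Mean CAM score of each cluster (an assumed way to combine the two)."""
    totals, counts = {}, {}
    for score, lab in zip(cam, labels):
        totals[lab] = totals.get(lab, 0.0) + score
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: totals[lab] / counts[lab] for lab in totals}


# Toy data: 4 positions with 2-D features and their cluster indices.
features = [[2.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
labels = [0, 0, 1, 1]

cam = class_activation_map(features, class_weights=[1.0, -1.0])
relevance = cluster_relevance(cam, labels)
```

In this toy case cluster 0 supports the class and cluster 1 counts against it, so the explanation is read per concept rather than per pixel.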

Object Localization
NAVE always improves MaxBox, but not every thresholded IoU.
It always improves the most difficult threshold, IoU@70%.
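For readers unfamiliar with the thresholded IoU criterion: a predicted box counts as a correct localization when its intersection-over-union with the ground-truth box reaches a threshold such as 70%. The sketch below assumes boxes given as `(x_min, y_min, x_max, y_max)` with exclusive upper bounds; the function names and example boxes are illustrative, and the MaxBox protocol itself is not reproduced here.

```python
def box_iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def mask_to_box(mask):
    """Tightest box around the truthy cells of a 2-D mask, or None if empty."""
    cells = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not cells:
        return None
    xs, ys = zip(*cells)
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def localized(pred_box, true_box, threshold=0.7):
    """True when the prediction meets the IoU threshold."""
    return box_iou(pred_box, true_box) >= threshold


# Toy mask, e.g. the positions of one selected cluster.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
pred_box = mask_to_box(mask)
true_box = (1, 1, 3, 3)
iou = box_iou(pred_box, true_box)
```

Here the prediction overshoots the target by one column, so it passes a 50% threshold but fails the 70% one, which shows why the stricter threshold is the harder test.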

[Results table: one +, ++ or - mark per setting; row and column labels not recoverable.]
Annotation Masks






Model trained for binary classification.
All lesions recovered
Can we use NAVE for medical annotations?
You need a well-performing model!



Nguyen, H. Q., et al., VinDr-CXR: An open dataset of chest X-rays with radiologist's annotations. Scientific Data, 9(1), 429, (2022).
Shortcut Saturation Inspection



Wang, X., et al., ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. CVPR, (2017).
What else?
In-Distribution Intervention
How to hide an object?
Swap clusters
Enlarge another cluster
Swap between images

Class scores (original → after intervention):
Fish: 6.8, Bike: -0.7 → Fish: 4.1, Bike: 4.3
Bike: 5.3, Dog: 3.8 → Bike: 5.9, Dog: 6.9
Bike: 5.3, Dog: 3.8 → Bike: 1.6, Dog: 5.9
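One way to read such an intervention, sketched under assumptions: the feature vectors of one cluster are overwritten with the mean vector of another cluster from the same feature map, and the class scores are recomputed. The sketch assumes the classifier is average pooling followed by a single linear layer; `replace_cluster`, the toy features, and the weights are hypothetical and not taken from the talk.

```python
def class_scores(features, weights, bias):
    """Average-pool the feature vectors, then apply one linear layer."""
    n = len(features)
    pooled = [sum(col) / n for col in zip(*features)]
    return [sum(w * x for w, x in zip(row, pooled)) + b for row, b in zip(weights, bias)]


def replace_cluster(features, labels, source, target):
    """Overwrite every vector in cluster `source` with the mean of cluster `target`."""
    members = [f for f, lab in zip(features, labels) if lab == target]
    mean = [sum(col) / len(members) for col in zip(*members)]
    return [list(mean) if lab == source else list(f) for f, lab in zip(features, labels)]


# Toy data: cluster 0 plays the "object", cluster 1 the "background".
features = [[2.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
labels = [0, 0, 1, 1]
weights = [[1.0, 0.0],   # class 0 responds to the first feature
           [0.0, 1.0]]   # class 1 responds to the second feature
bias = [0.0, 0.0]

before = class_scores(features, weights, bias)
edited = replace_cluster(features, labels, source=0, target=1)
after = class_scores(edited, weights, bias)
```

Because the replacement vectors come from the embedding itself, the edited feature map stays within values the classifier has already encountered, which is presumably the sense of "in-distribution" here.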
In-Distribution Intervention



Input
Explanation without watermark
Explanation with watermark

Original
Repaired


In-Distribution Intervention
Pneumonia? Class scores (No | Yes), first and second panel per intervention:
ZERO: -29.2 | 28.5, then 4.6 | -4.6
SCALING: -29.2 | 28.5, then 4.6 | -4.6
RANDOM: -29.2 | 28.5, then 4.6 | -4.6
SWAP: -29.2 | 28.5, then 3.2 | -3.3
Confusion matrices (Truth vs. Pred., classes Yes/No):
Test set: No watermark. Accuracy: 26.7%
Test set: Watermarked. Accuracy: 100%
Test set: Watermark + Swap. Accuracy: 26.7%
Train set: Watermarked. Accuracy: 100%
Unlearning?
Shortcut Unused!
Really? Just Vision?
Time series
ECG Explanation: Heart Condition









Wagner, P., et al., PTB-XL, a large publicly available electrocardiography dataset. Scientific Data, (2020).
ECG Explanation: Age
Text?
Not Yet
Summary
Do we capture semantics?
Yes
Can we see what the model sees?
Yes
What can we do with it?
- Concept extraction
- Object localization
- Model inspection
- Annotation masks from labels
- Shortcut saturation
What else?
- Feature attribution?
- Causality?
- Concept injection?
- Any other idea?