Micro-Learning Task
Uncertainty decomposition in Bayesian Neural Networks
Pedagogy of Higher Education Course (Summer, 2020)
Presented by:
Pavel Temirchev
Instructor:
Magnus Gustafsson
Teaching Assistants:
Dina Bek
Aysylu Askarova
Yash Madhwal
Why have I chosen this topic?
- I took an online MSc degree course on Uncertainty Quantification.
- Its part on Uncertainty Decomposition was quite unclear.
- The author used formal definitions to explain the phenomenon:
\text{Total Uncertainty}(x) = \mathcal{H} \big[ \mathbb{E}_{p(\mathcal{W}|\mathcal{D})}\, p(y|x, \mathcal{W}) \big]
\text{Aleatoric Uncertainty}(x) = \mathbb{E}_{p(\mathcal{W}|\mathcal{D})}\, \mathcal{H} \big[ \,p(y|x, \mathcal{W}) \big]
\text{Epistemic Uncertainty}(x) = \text{Total Uncertainty}(x) - \text{Aleatoric Uncertainty}(x)
- Aleatoric Uncertainty is due to the irreducible randomness of the modeled process.
- Epistemic Uncertainty is caused by the lack of data during training.
This phenomenon can be explained more clearly and in a more engaging manner!
Plan of the lecture:
- Reminder: Neural Networks
- Bayesian Neural Networks: how do they work?
- Uncertainty decomposition: epistemic and aleatoric uncertainty
- How to use this knowledge in practice?
Reminder: Neural Networks
Neural Network:
\text{oracle}(x|\mathcal{W})
Input: x \text{ (e.g. image)}
Output: \text{probability distribution on } y
[Figure: predicted probability distribution over the classes DOG, CAT, FROG, GIRAFFE, FISH]
What are the most probable and the second most probable class labels in this example?
NN's parameters - "weights"
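A minimal sketch of what "oracle" means here, assuming a toy model that is just a linear map followed by a softmax. The feature size, the random weights and the name `CLASSES` are illustrative assumptions, not part of the lecture.

```python
import numpy as np

CLASSES = ["DOG", "CAT", "FROG", "GIRAFFE", "FISH"]

def oracle(x, W):
    """Toy oracle: linear scores followed by a softmax over the classes."""
    logits = W @ x
    logits = logits - logits.max()  # shift for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=4)                  # stand-in for image features (assumption)
W = rng.normal(size=(len(CLASSES), 4))  # the "weights" (random, untrained)

p = oracle(x, W)
ranking = np.argsort(p)[::-1]           # class indices, most probable first
print("most probable:", CLASSES[ranking[0]])
print("second most probable:", CLASSES[ranking[1]])
```

Sorting the output probabilities answers the question on the slide: the first two entries of `ranking` are the most probable and the second most probable class labels.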
Bayesian Neural Networks (aka Bayesian Ensemble)
By definition:
p(y|x) = \mathbb{E}_{p(\mathcal{W}|\mathcal{D})}\, \text{oracle}(x|\mathcal{W}) \approx \frac{1}{N} \sum_{i=1}^N \,\text{oracle}(x|\mathcal{W}_i)
An average over an (infinite) number of oracles!
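The finite-sample approximation above can be sketched in a few lines, assuming we already have the predictions of N = 3 oracles for one input. The probability values are made up for illustration.

```python
import numpy as np

# Hypothetical predictions of N = 3 oracles over (DOG, CAT, FROG).
# Each row is one oracle's distribution and sums to 1.
oracle_probs = np.array([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
])

# Ensemble prediction: the plain average of the oracle predictions.
ensemble_probs = oracle_probs.mean(axis=0)
print(ensemble_probs)
```

The average of valid probability distributions is again a valid probability distribution, so the ensemble prediction can be read exactly like a single oracle's output.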
NN - single oracle:
BNN - ensemble of oracles:
Bayesian Ensembling
[Figure: the input x is passed to three oracles, \text{oracle}(x|\mathcal{W}_1), \text{oracle}(x|\mathcal{W}_2), \text{oracle}(x|\mathcal{W}_3). Each oracle's prediction is a distribution over DOG, CAT, FROG; together they give the ensemble prediction over DOG, CAT, FROG.]
NOTE: How to measure the uncertainty of a prediction?
- low uncertainty
- high uncertainty
Please, help me!
Is the prediction uncertainty high or low?
[Figure: a predicted distribution over DOG, CAT, FROG]
And for this one?
[Figure: another predicted distribution over DOG, CAT, FROG]
Entropy \(\mathcal{H}\) is a measure of uncertainty (assume you can compute it):
\mathcal{H} \big(\text{oracle}(x|\mathcal{W})\big) = \text{high} \rightarrow \texttt{Uncertainty is high}
\mathcal{H} \big(\text{oracle}(x|\mathcal{W})\big) = \text{low} \rightarrow \texttt{Uncertainty is low}
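For those who want to compute it after all, here is a minimal sketch of the Shannon entropy of a categorical prediction, in nats. The two example distributions and the small constant `eps` (added to avoid log(0)) are assumptions for illustration.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a categorical distribution p."""
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log(p + eps)))

confident = [0.98, 0.01, 0.01]  # almost all mass on one class
unsure = [1 / 3, 1 / 3, 1 / 3]  # mass spread evenly over three classes

print("confident prediction:", entropy(confident))
print("unsure prediction:", entropy(unsure))
```

A peaked distribution gets a low entropy, and the uniform distribution over three classes gets the largest possible value, log(3).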
Certainty about uncertainty
[Figure: the input x is passed to three oracles, \text{oracle}(x|\mathcal{W}_1), \text{oracle}(x|\mathcal{W}_2), \text{oracle}(x|\mathcal{W}_3). Each oracle outputs a distribution over DOG, CAT, FROG, and so does the ensemble prediction. Labels in the figure: high, high.]
Averaged oracle's uncertainty:
\text{Aleatoric Uncertainty}(x) = \frac{1}{N} \sum_{i=1}^N \, \mathcal{H} \big[ \,\text{oracle}(x|\mathcal{W}_i) \big]
Ensemble uncertainty:
\text{Total Uncertainty}(x) = \mathcal{H} \big[ \, \frac{1}{N} \sum_{i=1}^N\,\text{oracle}(x|\mathcal{W}_i) \big]
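The two formulas above differ only in the order of averaging and taking the entropy. A minimal sketch, assuming three oracles that all agree the input is ambiguous (the numbers are invented for illustration):

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) along the last axis."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + eps), axis=-1)

def aleatoric_uncertainty(oracle_probs):
    """Average of the individual oracles' entropies."""
    return float(entropy(oracle_probs).mean())

def total_uncertainty(oracle_probs):
    """Entropy of the averaged (ensemble) prediction."""
    return float(entropy(oracle_probs.mean(axis=0)))

# Hypothetical case: every oracle is unsure about (DOG, CAT, FROG).
oracle_probs = np.array([
    [0.4, 0.3, 0.3],
    [0.3, 0.4, 0.3],
    [0.3, 0.3, 0.4],
])

aleatoric = aleatoric_uncertainty(oracle_probs)
total = total_uncertainty(oracle_probs)
print("aleatoric:", aleatoric)
print("total:", total)
```

In this case both values are high and close to each other: the oracles are certain that the answer is uncertain.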
Uncertainty about uncertainty
[Figure: the input x is passed to three oracles, \text{oracle}(x|\mathcal{W}_1), \text{oracle}(x|\mathcal{W}_2), \text{oracle}(x|\mathcal{W}_3). Each oracle outputs a distribution over DOG, CAT, FROG, and so does the ensemble prediction.]
\text{Total Uncertainty}(x) \;\; ? \;\; \text{Aleatoric Uncertainty}(x)
\text{Total Uncertainty}(x) > \text{Aleatoric Uncertainty}(x)
\text{Epistemic Uncertainty}(x) = \text{Total Uncertainty}(x) - \text{Aleatoric Uncertainty}(x)
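A minimal sketch of this decomposition, assuming two made-up sets of oracle predictions: one where the oracles agree that the input is ambiguous, and one where each oracle is confident but they disagree. Because entropy is concave, the total uncertainty is never smaller than the aleatoric one, so the difference is non-negative.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) along the last axis."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + eps), axis=-1)

def decompose(oracle_probs):
    """Return (total, aleatoric, epistemic) for one input."""
    total = float(entropy(oracle_probs.mean(axis=0)))
    aleatoric = float(entropy(oracle_probs).mean())
    return total, aleatoric, total - aleatoric

# Hypothetical: oracles agree that the input is ambiguous.
agree = np.array([
    [0.4, 0.3, 0.3],
    [0.3, 0.4, 0.3],
    [0.3, 0.3, 0.4],
])

# Hypothetical: each oracle is confident, but in a different class.
disagree = np.array([
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
])

_, _, epistemic_agree = decompose(agree)
_, _, epistemic_disagree = decompose(disagree)
print("epistemic, oracles agree:", epistemic_agree)
print("epistemic, oracles disagree:", epistemic_disagree)
```

Both ensembles produce the same uniform ensemble prediction, so their total uncertainty is identical. Only the decomposition reveals that the second case is dominated by disagreement between oracles.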
How to use Aleatoric Uncertainty?
- NN oracles are forced to minimize their aleatoric uncertainty during training.
- If aleatoric uncertainty is high, a certain answer is hardly possible (the process is stochastic).
- Humans are not trained like neural nets. Humans commonly have high aleatoric uncertainty just because.
How to use Epistemic Uncertainty?
- Commonly, high epistemic uncertainty is caused by a completely new input x.
- The oracles have not seen anything similar before, so they make different predictions.
- You can add epistemically uncertain inputs to the training dataset in order to improve your ensemble!
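The last point can be sketched as a simple selection rule: score every unlabelled input by its epistemic uncertainty and pick the highest-scoring ones for labelling. The function name `select_for_labelling`, the pool of three inputs and all probability values are hypothetical; this is one possible way to do it, not a prescribed procedure.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) along the last axis."""
    return -np.sum(p * np.log(p + eps), axis=-1)

def epistemic_uncertainty(pool_probs):
    """pool_probs has shape (n_oracles, n_inputs, n_classes)."""
    total = entropy(pool_probs.mean(axis=0))      # shape (n_inputs,)
    aleatoric = entropy(pool_probs).mean(axis=0)  # shape (n_inputs,)
    return total - aleatoric

def select_for_labelling(pool_probs, k):
    """Indices of the k inputs with the highest epistemic uncertainty."""
    scores = epistemic_uncertainty(pool_probs)
    return np.argsort(scores)[::-1][:k]

# Hypothetical pool of 3 inputs scored by 3 oracles over (DOG, CAT, FROG):
# input 0: oracles agree and are confident,
# input 1: oracles agree that the input is ambiguous,
# input 2: oracles are confident but disagree.
pool_probs = np.array([
    [[0.90, 0.05, 0.05], [0.34, 0.33, 0.33], [0.90, 0.05, 0.05]],  # oracle 1
    [[0.90, 0.05, 0.05], [0.33, 0.34, 0.33], [0.05, 0.90, 0.05]],  # oracle 2
    [[0.90, 0.05, 0.05], [0.33, 0.33, 0.34], [0.05, 0.05, 0.90]],  # oracle 3
])

picked = select_for_labelling(pool_probs, k=1)
print("input to label next:", picked)
```

Note that input 1 is not selected even though its total uncertainty is high: the oracles already agree on it, so more training data is unlikely to change the answer.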
Uncertainty Decomposition: Example #1
Did birds evolve from dinosaurs?
Uncertainty Decomposition: Example #2
How tall was Napoleon?
Uncertainty Decomposition: Example #3
What will the weather be today at 10:00?
Uncertainty Decomposition: Comparison
What is the difference between the BIRDS experiment and the (WEATHER, NAPOLEON) experiments?
What can you say about it?
Please, help me with your feedback: https://forms.gle/ri455uJWzqyeAWAZ7
Thank you for your attention!
And learn more about birds and dinosaurs!