Why Should I Trust You?
Image credit: Interpretable ML Book
Introduction
Image credit: LIME paper video
Areas of Interpretability
Algorithm Transparency
Scope of interpretation
Local Interpretation
Global Interpretation
Model restriction
Blackbox models
Restricted models
Related Work
L
ocal
I
ntereptable
M
odel-Agnostic
E
xplanations
How does LIME work?
Objective Function Formulazition
\text{explanation(x)} = \text{argmin}_{g \in \mathcal{G}} L(f, g, \pi_x) + \Omega(g)
\Omega(g) \to \text{Complexity of model } g
\mathcal{G} \to \text{set of all possible models}
\pi_x \to \text{a neighbourhood of } x
L \to \text{ Loss function for approximation}
Some LIME Samples
Proposed Method
Interpretability Via Model Extraction
x \in \mathcal{X}
f
y \in \mathcal{Y}
x \in \mathcal{X}
T
T(x) \approx y
Algorithm Properties
Global
Blackbox
T as a Decision Tree
But Decision Tree Overfits!
Solution: Use Active Learning
The more samples used
The more generalized the tree
Model Extraction Algorithm
1. Input Distribution
\text{General Gaussian Mixture Model}
+
X_{train} \text{ dataset used for training } f
EM
\mathcal{P} \text{ distribution}
2. Exact Decision Tree
\text{Define } T^* \text{ such that } \\ \forall x; T^*(x) = f(x)
Gain(f,C_N) = \\ 1 - \sum_{y \sim \mathcal{Y}}{Pr_{x \sim P}[f(x) = y \, | \, C_N]}
3. Estimated DT
\text{Approximate } Gain(f, C_N) \text{ using n } i.i.d \text{ samples}
\text{Simplify } C_N = (s_1 \le x_1 \le t_1) \cdots (s_d \le x_d \le t_d)
pdf_{\mathcal{P} \, | \, C_N}(x) \varpropto \sum_{j=1}^{k}{\phi_j \prod_{i=1}^{d}{\mathcal{P}_{\mathcal{N}(\mu_{ji}, \sigma_{ji})|(s_i \le x \le t_i)}(x_i)}}
3. Estimated DT Cont.
\text{Using } pdf_{\mathcal{P} \, | \, C_N}(x) \\ \text{ we can calculate probability of each component of } \mathcal{P}
x \sim \mathcal{P} \, | \, C_N \\ (j \sim Categorical(\tilde{\phi})) \wedge (x \sim \mathcal{N}(\mu_{ji}, \sigma_{ji}) \, | \, (s_i \le x_i \le t_i))
4. Theoretical Gaurentee
Pr_{x \sim \mathcal{P}}{[T(x) \ne T^*(x)]} \le \epsilon \text{ has probability at least } 1-\delta
\text{As } n \to \infty \text{, } T \to T^* \text{. In other words, for all } \epsilon , \delta:
Sample Visualizations
Conclusion
Comparision with CART
Implementation
Reference
Bastani, O., Kim, C., Bastani, H. Interpreting Blackbox Models via Model Extraction. 2017
Breiman, L., Friedman, J., Stone, C. J., and Olshen, R. A. Classification and regression trees. CRC press, 1984.
Ribeiro, M.T. Singh, S. Guestrin C. "Why Should I Trust You?": Explaining the Predictions of Any Classifier, 2016
Thanks!
Here's the potato
Made with Slides.com