Real World Problem

When applying a classifier, we sometimes want to know not only whether a sample belongs to a class, but also how to change that outcome (i.e., how to modify the sample).

First step

Problem Definition

Random Forest

A random forest is $C(\textbf{X}) = \sum_{t=1}^{T} \lambda_t f_t(\textbf{X})$, where $f_t(\cdot)$ is a tree in the forest and $\lambda_t$ is its weight.
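The weighted sum above can be sketched in a few lines of Python. This is only an illustration: `tree_1`, `tree_2`, the thresholds, and the weights are hypothetical stand-ins for trained trees, not values from the slides.

```python
from typing import Callable, Sequence

Tree = Callable[[Sequence[float]], float]


# Hypothetical stand-ins for trained trees: each maps a feature vector to a score.
def tree_1(x: Sequence[float]) -> float:
    return 1.0 if x[0] <= 2.5 else 0.0


def tree_2(x: Sequence[float]) -> float:
    return 0.8 if x[1] <= 1.0 else 0.2


def ensemble_predict(trees: Sequence[Tree], weights: Sequence[float],
                     x: Sequence[float]) -> float:
    """Weighted sum of the tree outputs: sum_t lambda_t * f_t(x)."""
    return sum(w * f(x) for w, f in zip(weights, trees))


# Equal weights (an assumption); the sample goes left in both toy trees.
score = ensemble_predict([tree_1, tree_2], [0.5, 0.5], [1.4, 0.2])
print(score)
```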

Terminology

Let $leaves(t)$ be the set of leaves or terminal nodes of tree t.

Let $splits(t)$ denote the set of splits of tree t (non-terminal nodes).

Let $left(s)$ be the set of leaves accessible from the left branch of split s, and $right(s)$ the set of leaves accessible from its right branch.

Let $V(s) \in \{1, \dots, n\}$ denote the variable that participates in split s,

and let $C(s)$ denote the set of values of variable $V(s)$ that participate in the split query of s.
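To make the terminology concrete, here is a minimal sketch of one tree as a Python data structure. The class names `Leaf` and `Split`, the field names, and the toy tree are assumptions made for illustration; only the functions `leaves`, `splits`, `left`, and `right` mirror the definitions above.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class Leaf:
    name: str


@dataclass
class Split:
    name: str
    var: int          # V(s): index of the variable queried at this split
    values: Set[int]  # C(s): values of that variable that send the sample left
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def leaves(node: Node) -> List[str]:
    """All terminal nodes reachable from `node`."""
    if isinstance(node, Leaf):
        return [node.name]
    return leaves(node.left) + leaves(node.right)


def splits(node: Node) -> List[Split]:
    """All non-terminal nodes reachable from `node` (including itself)."""
    if isinstance(node, Leaf):
        return []
    return [node] + splits(node.left) + splits(node.right)


def left(s: Split) -> List[str]:
    return leaves(s.left)


def right(s: Split) -> List[str]:
    return leaves(s.right)


# Toy tree: s1 is the root, s2 is its left child, l3 is its right child.
s2 = Split("s2", var=2, values={1, 2}, left=Leaf("l1"), right=Leaf("l2"))
s1 = Split("s1", var=1, values={1}, left=s2, right=Leaf("l3"))
```

For this toy tree, `left(s1)` contains the two leaves under `s2`, and `right(s1)` contains only `l3`.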

The observation falls in exactly one leaf of each tree t, where $y_{t,\ell}$ indicates whether it falls in leaf $\ell$ of tree t:

$$\sum_{\ell \in \textbf{leaves}(t)} y_{t,\ell} = 1, \quad \forall t \in \{1, \dots, T\}$$

If the split query of s is true, the observation can only reach the leaves in $left(s)$; otherwise it can only reach the leaves in $right(s)$:

$$\sum_{\ell \in \textbf{left}(s)} y_{t,\ell} \leq \sum_{j \in \textbf{C}(s)} x_{\textbf{V}(s), j}, \quad \forall t \in \{1, \dots, T\},\ s \in \textbf{splits}(t)$$

$$\sum_{\ell \in \textbf{right}(s)} y_{t,\ell} \leq 1 - \sum_{j \in \textbf{C}(s)} x_{\textbf{V}(s), j}, \quad \forall t \in \{1, \dots, T\},\ s \in \textbf{splits}(t)$$
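A small check can show how the leaf and split constraints together pin down y once x is fixed. The sketch below uses an assumed nested-dict tree layout and a toy assignment of x; neither comes from the slides.

```python
def leaf_names(node):
    """Leaves reachable from `node`."""
    if "leaf" in node:
        return [node["leaf"]]
    return leaf_names(node["left"]) + leaf_names(node["right"])


def split_nodes(node):
    """Splits reachable from `node` (including itself)."""
    if "leaf" in node:
        return []
    return [node] + split_nodes(node["left"]) + split_nodes(node["right"])


def query(split, x):
    """Right-hand side of the left-branch constraint: sum over j in C(s) of x[V(s), j]."""
    return sum(x[(split["var"], j)] for j in split["C"])


def constraints_hold(tree, x, y):
    """Check the 'exactly one leaf' constraint and both split constraints for one tree."""
    if sum(y[l] for l in leaf_names(tree)) != 1:
        return False
    for s in split_nodes(tree):
        q = query(s, x)
        if sum(y[l] for l in leaf_names(s["left"])) > q:
            return False
        if sum(y[l] for l in leaf_names(s["right"])) > 1 - q:
            return False
    return True


# Toy tree (hypothetical): the root queries variable 0, its left child queries variable 1.
tree = {
    "var": 0, "C": {0},
    "left": {"var": 1, "C": {0, 1},
             "left": {"leaf": "l1"}, "right": {"leaf": "l2"}},
    "right": {"leaf": "l3"},
}
# x[(i, j)] is the binary variable x_{i,j}; this assignment sends the sample left, then right.
x = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 0, (1, 2): 1}

good_y = {"l1": 0, "l2": 1, "l3": 0}
bad_y = {"l1": 0, "l2": 0, "l3": 1}
print(constraints_hold(tree, x, good_y), constraints_hold(tree, x, bad_y))
```

Only the assignment that places the observation in the leaf it is actually routed to satisfies all constraints.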

Some additional constraints on x, where $\mathcal{C}$ and $\mathcal{N}$ are the sets of categorical and numeric variables:

$$\sum_{j=1}^{K_i} x_{i,j} = 1, \quad \forall i \in \mathcal{C}$$

$$x_{i,j} \leq x_{i,j+1}, \quad \forall i \in \mathcal{N},\ j \in \{1, \dots, K_i - 1\}$$

$$x_{i,j} \in \{0, 1\}, \quad \forall i \in \{1, \dots, n\},\ j \in \{1, \dots, K_i\}$$
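One way to read these constraints is as an encoding of the sample. The sketch below assumes a one-hot encoding for categorical variables and, for numeric variables, $x_{i,j} = 1$ exactly when the value is at most the j-th smallest split point. The slides do not state the encoding, so this is an assumption; it is chosen because it satisfies the constraints above. The function names are hypothetical.

```python
from typing import List, Sequence


def encode_categorical(level: int, num_levels: int) -> List[int]:
    """One-hot vector: exactly one entry is 1, so the entries sum to 1."""
    return [1 if j == level else 0 for j in range(num_levels)]


def encode_numeric(value: float, split_points: Sequence[float]) -> List[int]:
    """x_j = 1 iff value <= j-th smallest split point, which gives x_j <= x_{j+1}."""
    return [1 if value <= a else 0 for a in sorted(split_points)]


cat = encode_categorical(2, 4)
num = encode_numeric(2.0, [1.5, 2.5, 4.0])
print(cat, num)
```

Under this reading, a numeric split "value <= a" corresponds to a set $C(s)$ containing a single index.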

All in one

$$
\begin{aligned}
\max_{\textbf{x},\textbf{y}} \quad & \sum_{t=1}^{T}\sum_{\ell \in \textbf{leaves}(t)} \lambda_t \cdot p_{t,\ell} \cdot y_{t,\ell} \\
\text{subject to} \quad & \sum_{\ell \in \textbf{leaves}(t)} y_{t,\ell} = 1, && \forall t \in \{1, \dots, T\} \\
& \sum_{\ell \in \textbf{left}(s)} y_{t,\ell} \leq \sum_{j \in \textbf{C}(s)} x_{\textbf{V}(s), j}, && \forall t \in \{1, \dots, T\},\ s \in \textbf{splits}(t) \\
& \sum_{\ell \in \textbf{right}(s)} y_{t,\ell} \leq 1 - \sum_{j \in \textbf{C}(s)} x_{\textbf{V}(s), j}, && \forall t \in \{1, \dots, T\},\ s \in \textbf{splits}(t) \\
& \sum_{j=1}^{K_i} x_{i,j} = 1, && \forall i \in \mathcal{C} \\
& x_{i,j} \leq x_{i,j+1}, && \forall i \in \mathcal{N},\ j \in \{1, \dots, K_i - 1\} \\
& x_{i,j} \in \{0, 1\}, && \forall i \in \{1, \dots, n\},\ j \in \{1, \dots, K_i\} \\
& y_{t,\ell} \geq 0, && \forall t \in \{1, \dots, T\},\ \ell \in \textbf{leaves}(t)
\end{aligned}
$$
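For a tiny forest, the optimum of this problem can be found by enumerating every feasible x and routing it through each tree. The sketch below does that as a sanity check; it is a brute-force stand-in, not a mixed-integer solver. The two-tree forest, its weights, and the nested-dict layout are made up for illustration.

```python
from itertools import product


def one_hot_vectors(k):
    """Feasible x_i for a categorical variable with k levels."""
    return [tuple(1 if j == a else 0 for j in range(k)) for a in range(k)]


def monotone_vectors(k):
    """Feasible x_i for a numeric variable: 0/1 vectors with x_j <= x_{j+1}."""
    return [tuple(0 if j < a else 1 for j in range(k)) for a in range(k + 1)]


def predict(node, x):
    """Route x through one tree and return the leaf prediction p."""
    while "p" not in node:
        go_left = sum(x[node["var"]][j] for j in node["C"]) >= 1
        node = node["left"] if go_left else node["right"]
    return node["p"]


def brute_force(forest, weights, domains):
    """Maximise sum_t lambda_t * p over all feasible x by enumeration."""
    best_value, best_x = float("-inf"), None
    for x in product(*domains):
        value = sum(w * predict(tree, x) for w, tree in zip(weights, forest))
        if value > best_value:
            best_value, best_x = value, x
    return best_value, best_x


# Hypothetical forest: variable 0 is categorical (2 levels),
# variable 1 is numeric (2 split points).
tree_a = {"var": 0, "C": {0},
          "left": {"p": 0.9},
          "right": {"var": 1, "C": {0}, "left": {"p": 0.1}, "right": {"p": 0.6}}}
tree_b = {"var": 1, "C": {1}, "left": {"p": 0.2}, "right": {"p": 0.8}}

domains = [one_hot_vectors(2), monotone_vectors(2)]
best_value, best_x = brute_force([tree_a, tree_b], [0.5, 0.5], domains)
print(best_value, best_x)
```

Enumeration grows exponentially with the number of variables, which is why a real forest needs the optimization formulation.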

Enforcing the split constraints for every split means traversing the whole forest, $O(k \cdot 2^n)$! Restricting them to a subset of splits $\bar{\Omega}$ gives:

$$\sum_{\ell \in \textbf{left}(s)} y_{t,\ell} \leq \sum_{j \in \textbf{C}(s)} x_{\textbf{V}(s), j}, \quad \forall (t,s) \in \bar{\Omega}$$

$$\sum_{\ell \in \textbf{right}(s)} y_{t,\ell} \leq 1 - \sum_{j \in \textbf{C}(s)} x_{\textbf{V}(s), j}, \quad \forall (t,s) \in \bar{\Omega}$$

Theorem

$$\delta_{t,s} = \max \left\{ \max_{\ell \in \textbf{left}(s)} p_{t,\ell} - \min_{\ell \in \textbf{left}(s)} p_{t,\ell},\ \max_{\ell \in \textbf{right}(s)} p_{t,\ell} - \min_{\ell \in \textbf{right}(s)} p_{t,\ell} \right\}$$

$$\Delta_t = \max_{s \in \textbf{splits}(t,d)} \delta_{t,s}$$

$$Z^*_{MIO,d} - \sum_{t=1}^{T} \lambda_t \Delta_t \leq Z_d \leq Z^*_{MIO} \leq Z^*_{MIO,d}$$
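The quantities in the bound can be computed directly from the leaf predictions. The sketch below assumes that $splits(t, d)$ means the splits at depth exactly d, with the root at depth 1. The slides do not define it, so treat that reading, the nested-dict tree, and the function names as assumptions.

```python
def leaf_values(node):
    """Predictions p of all leaves below `node`."""
    if "p" in node:
        return [node["p"]]
    return leaf_values(node["left"]) + leaf_values(node["right"])


def delta(split):
    """delta_{t,s}: the larger of the prediction ranges in the left and right subtrees."""
    l, r = leaf_values(split["left"]), leaf_values(split["right"])
    return max(max(l) - min(l), max(r) - min(r))


def splits_at_depth(node, d, depth=1):
    """Splits at depth exactly d (root = depth 1, an assumed convention)."""
    if "p" in node:
        return []
    if depth == d:
        return [node]
    return (splits_at_depth(node["left"], d, depth + 1)
            + splits_at_depth(node["right"], d, depth + 1))


def big_delta(tree, d):
    """Delta_t: the largest delta_{t,s} over the splits at depth d."""
    return max((delta(s) for s in splits_at_depth(tree, d)), default=0.0)


# Hypothetical depth-2 tree with four leaves.
tree = {"left": {"left": {"p": 0.1}, "right": {"p": 0.4}},
        "right": {"left": {"p": 0.5}, "right": {"p": 0.9}}}

gap_d1 = big_delta(tree, 1)  # truncating at the root leaves a gap
gap_d2 = big_delta(tree, 2)  # every split has single-leaf subtrees, so no gap
print(gap_d1, gap_d2)
```

In this toy tree, the gap term $\lambda_t \Delta_t$ vanishes once d reaches the full depth, so the lower and upper bounds in the theorem coincide.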

Optimization of Tree Ensembles

Real World Problem

Examples

First step

Problem Definition

How?

Focus on single model

Random Forest

Example from Iris dataset

How to formulate the problem?

Terminology

Objective function

Constraints

Intermediate variable

The observation falls in exactly one leaf of each tree t

If the observation falls into one subtree of a split, it cannot fall into the other subtree

Some additional constraints on x

the indicator y must be in {0, 1}

All in one

Approximation

It's quite time-consuming to solve the original problem

Traverse the whole forest, O(k*2^n)!

Idea: what if we do not search down to the deepest level of each tree?

First define

Proposition

Theorem

Experiments