2. Learning

Given: (x_i, y_i)
Find: \mathbf{w}

Example (x_i, y_i): Fruit/A flies/N like/V a/D banana/N

Learning

[Figure: tag lattice over "Fruit flies like a banana", with candidate tags N, V, D, A, P at each word and the gold path A N V D N]

\text{1. \textbf{Search} for the optimal }\hat{y}\text{ with parameters }\mathbf{w}
\text{2. \textbf{Update} the parameters }\mathbf{w}\text{ based on }\hat{y}\text{ and }y
\text{3. Return to 1 until satisfied}

Structured Perceptron

\text{1. For }i = 1,2,\cdots,n
\quad\text{a.}\;\;\hat{y_i} = \arg\max_y \mathbf{w}\cdot\mathbf{f}(x_i,y)
\quad\text{b.}\;\;\mathbf{w}=\mathbf{w}-\big(\mathbf{f}(x_i,\hat{y_i})-\mathbf{f}(x_i,y_i)\big)
\text{2. Return to 1 until satisfied}

Collins, M. (2002). Discriminative training methods for HMMs: Theory and experiments with perceptron algorithms.
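As a rough sketch of the perceptron loop, the code below assumes a hypothetical feature map that counts (word, tag) and (tag, next tag) pairs, and it finds the arg max by brute-force enumeration (Viterbi would normally replace this). The names `features`, `decode` and `train` are illustrative only.

```python
from collections import Counter
from itertools import product

TAGS = ["N", "V", "D", "A", "P"]

def features(x, y):
    """Hypothetical f(x, y): counts of (word, tag) and (tag, next tag) pairs."""
    f = Counter()
    for word, tag in zip(x, y):
        f[("emit", word, tag)] += 1
    for prev, nxt in zip(y, y[1:]):
        f[("trans", prev, nxt)] += 1
    return f

def score(w, x, y):
    """w . f(x, y)"""
    return sum(w.get(k, 0.0) * v for k, v in features(x, y).items())

def decode(w, x):
    """Step a: arg max over all taggings (brute force, for illustration)."""
    return max(product(TAGS, repeat=len(x)), key=lambda y: score(w, x, y))

def train(data, epochs=5):
    w = Counter()
    for _ in range(epochs):              # step 2: repeat (fixed epoch count here)
        for x, y in data:                # step 1: for i = 1..n
            y_hat = decode(w, x)
            if y_hat != tuple(y):        # the update is zero when y_hat == y
                # step b: w = w - (f(x, y_hat) - f(x, y))
                w.subtract(features(x, y_hat))
                w.update(features(x, y))
    return w

data = [("Fruit flies like a banana".split(), ["A", "N", "V", "D", "N"])]
w = train(data)
print(decode(w, data[0][0]))  # ('A', 'N', 'V', 'D', 'N')
```

With a single training sentence the weights stop changing after the first update, because the gold tagging then outscores every other tagging.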

(Sub-)Gradient Descent

Structured perceptron:
\text{1. For }i = 1,2,\cdots,n
\quad\text{a.}\;\;\hat{y_i} = \arg\max_y \mathbf{w}\cdot\mathbf{f}(x_i,y)
\quad\text{b.}\;\;\mathbf{w}=\mathbf{w}-\big(\mathbf{f}(x_i,\hat{y_i})-\mathbf{f}(x_i,y_i)\big)
\text{2. Return to 1 until satisfied}

Stochastic (sub-)gradient descent with step size \alpha:
\text{1. Sample }i\text{ from }\{1,2,\cdots,n\}
\quad\text{a.}\;\;\hat{y_i} = \arg\max_y \mathbf{w}\cdot\mathbf{f}(x_i,y)
\quad\text{b.}\;\;\mathbf{w}=\mathbf{w}-\alpha(\mathbf{f}(x_i,\hat{y_i})-\mathbf{f}(x_i,y_i))
\text{2. Return to 1 until satisfied}

Loss Function

\min_{\mathbf{w}}\sum_i\Big(\max_y{\big(\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}-\mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)\Big)

Labeled Graph (L): the lattice restricted to the gold tags y_i, scored by \mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)
Unlabeled Graph (U): the full lattice over all candidate tags, scored by \max_y{\big(\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}
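To make the loss concrete, here is a small sketch that evaluates it for one example by enumerating every tagging. It assumes a hypothetical feature map with only (word, tag) indicator features, and the weights are made up for illustration.

```python
from itertools import product

TAGS = ["N", "V", "D", "A", "P"]

def score(w, x, y):
    """w . f(x, y) with hypothetical (word, tag) indicator features."""
    return sum(w.get((word, tag), 0.0) for word, tag in zip(x, y))

def perceptron_loss(w, x, y_gold):
    """max_y w.f(x, y) - w.f(x, y_gold), by brute-force enumeration."""
    best = max(score(w, x, y) for y in product(TAGS, repeat=len(x)))
    return best - score(w, x, y_gold)

x, y_gold = ["Fruit", "flies"], ("A", "N")
w_bad = {("Fruit", "N"): 1.0, ("flies", "V"): 2.0}   # prefers the wrong tags
w_good = {("Fruit", "A"): 1.0, ("flies", "N"): 2.0}  # prefers the gold tags

print(perceptron_loss(w_bad, x, y_gold))   # 3.0
print(perceptron_loss(w_good, x, y_gold))  # 0.0
```

The loss is never negative, and it is zero exactly when the gold tagging attains the maximum score.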

Learning: Structured Perceptron

[Figure: one update from \mathbf{w}^{(k)} to \mathbf{w}^{(k+1)}, computed from the labeled graph L and a \max over the unlabeled graph U]

A Probabilistic View

\max_{\mathbf{w}}\sum_{i}\log p(y_{i}|x_{i})

With a log-linear model for p(y_i|x_i), this becomes:

\max_{\mathbf{w}}\sum_{i}\log\Big(\exp\big(\mathbf{w}\cdot\mathbf{f}({x_{i}},{y_{i}})\big)/\sum_{y'}\exp\big(\mathbf{w}\cdot\mathbf{f}({x_{i}},y')\big)\Big)

Conditional Random Field

\min_{\mathbf{w}}\sum_i\Big(\log{\sum_{y}\exp{\big(\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}}-\mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)\Big)

Lafferty, J., McCallum, A., Pereira, F. C. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data.
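The CRF loss for one example is the negative log-probability of the gold tagging. The sketch below computes it by enumeration and checks that the implied probabilities sum to one. The (word, tag) indicator features and the weights are assumptions made for illustration.

```python
import math
from itertools import product

TAGS = ["N", "V", "D", "A", "P"]

def score(w, x, y):
    """w . f(x, y) with hypothetical (word, tag) indicator features."""
    return sum(w.get((word, tag), 0.0) for word, tag in zip(x, y))

def log_partition(w, x):
    """log sum_y exp(w . f(x, y)), shifted by the max for numerical stability."""
    scores = [score(w, x, y) for y in product(TAGS, repeat=len(x))]
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

def crf_loss(w, x, y_gold):
    """-log p(y_gold | x) = log-partition minus gold score."""
    return log_partition(w, x) - score(w, x, y_gold)

x, y_gold = ["Fruit", "flies"], ("A", "N")
w = {("Fruit", "A"): 1.0, ("flies", "N"): 2.0}

loss = crf_loss(w, x, y_gold)
log_z = log_partition(w, x)
total_prob = sum(math.exp(score(w, x, y) - log_z)
                 for y in product(TAGS, repeat=len(x)))
print(loss, total_prob)
```

Unlike the perceptron loss, this loss stays strictly positive even when the gold tagging has the highest score, because some probability mass always remains on other taggings.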

A Comparison

Structured perceptron:
\min_{\mathbf{w}}\sum_i\Big(\max_y{\big(\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}-\mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)\Big)

CRF:
\min_{\mathbf{w}}\sum_i\Big(\log{\sum_{y}\exp{\big(\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}}-\mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)\Big)

The two losses differ only in \max versus \log\sum\exp:

\max(a,b) \quad\text{vs.}\quad \log(\exp(a)+\exp(b))

\mathcal{Y}=\{1,3,4,8,18\}
\max_{y\in\mathcal{Y}}(y)=18
\log\sum_{y\in\mathcal{Y}}\exp(y)=18.0000465777\ldots
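The numeric example can be checked in a few lines. This sketch uses only the Python standard library; the variable names are illustrative.

```python
import math

Y = [1, 3, 4, 8, 18]

hard_max = max(Y)
soft_max = math.log(sum(math.exp(y) for y in Y))

print(hard_max)  # 18
print(soft_max)  # slightly above 18, dominated by the largest element

# log-sum-exp always lies between max(Y) and max(Y) + log(len(Y)).
lower, upper = hard_max, hard_max + math.log(len(Y))
```

The gap between the two shrinks as the largest element pulls away from the rest, which is why \log\sum\exp behaves like a smooth stand-in for \max.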

"Soft" Max

The CRF term acts as a "soft" \max over all taggings y:

\log{\sum_{y}\exp{\big(\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}}

Viterbi

\max_y\mathbf{w}\cdot\mathbf{f}(x,y) = \max_y\sum_j \mathbf{w}\cdot\mathbf{f}(x, [y^{j},y^{j+1}])

[Figure: tag lattice over "Fruit flies like a banana" with candidate tags N, V, D, A, P at each word]
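Because the score decomposes over adjacent tag pairs, the \max can be computed by dynamic programming. The following is a minimal sketch that assumes the local score splits into hypothetical emission scores `emit[(word, tag)]` and transition scores `trans[(tag, next_tag)]`. All numbers are made up, and a brute-force search is included only as a check.

```python
from itertools import product

TAGS = ["N", "V", "D", "A", "P"]

def viterbi(x, emit, trans):
    """Return (best score, best tagging); missing score entries count as 0."""
    best = {t: emit.get((x[0], t), 0.0) for t in TAGS}
    back = []  # back[j][t] = best previous tag when word j+1 is tagged t
    for word in x[1:]:
        new_best, pointers = {}, {}
        for t in TAGS:
            prev = max(TAGS, key=lambda p: best[p] + trans.get((p, t), 0.0))
            new_best[t] = (best[prev] + trans.get((prev, t), 0.0)
                           + emit.get((word, t), 0.0))
            pointers[t] = prev
        best = new_best
        back.append(pointers)
    last = max(TAGS, key=lambda t: best[t])
    path = [last]
    for pointers in reversed(back):  # follow back-pointers right to left
        path.append(pointers[path[-1]])
    return best[last], path[::-1]

def brute_force(x, emit, trans):
    """Best score over all taggings, for checking only."""
    def total(y):
        return (sum(emit.get((w, t), 0.0) for w, t in zip(x, y))
                + sum(trans.get(pair, 0.0) for pair in zip(y, y[1:])))
    return max(total(y) for y in product(TAGS, repeat=len(x)))

x = "Fruit flies like a banana".split()
emit = {("Fruit", "A"): 1.0, ("Fruit", "N"): 1.0, ("flies", "N"): 1.0,
        ("flies", "V"): 1.0, ("like", "V"): 1.0, ("like", "P"): 1.0,
        ("a", "D"): 2.0, ("banana", "N"): 2.0}
trans = {("A", "N"): 1.0, ("N", "V"): 1.0, ("V", "D"): 1.0,
         ("D", "N"): 1.0, ("N", "N"): 0.5}

best_score, best_path = viterbi(x, emit, trans)
print(best_score, best_path)  # 11.0 ['A', 'N', 'V', 'D', 'N']
```

The dynamic program visits each pair of adjacent tags once per word, so its cost grows linearly in sentence length, while the brute-force check grows exponentially.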

Forward

\log\sum_y\exp\big(\mathbf{w}\cdot\mathbf{f}(x,y)\big) = \log\sum_y\exp\Big(\sum_j \mathbf{w}\cdot\mathbf{f}(x, [y^{j},y^{j+1}])\Big)

[Figure: forward pass over the tag lattice (tags N, V, D, A, P), extended one word at a time. Values shown at each step, in slide order:
step 1: 3.6, -1.1, -9.0, 2.7, 0.5
step 2: 4.4, 5.1, 1.5, 3.7, 0.4
step 3: -0.6, 2.2, 9.1, 1.7, 3.1
step 4: -1.2, 0.6, 2.8, 0.3, 0.1
step 5: 1.4, 2.6, -0.8, 1.2, 0.3
final value: 8.9]
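The forward pass is Viterbi with \max replaced by \log\sum\exp. The sketch below assumes the same hypothetical split into emission and transition scores (made-up numbers, not the values from the slides) and compares the result against brute-force enumeration.

```python
import math
from itertools import product

TAGS = ["N", "V", "D", "A", "P"]

def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward(x, emit, trans):
    """log sum_y exp(score(x, y)) by dynamic programming in log space."""
    alpha = {t: emit.get((x[0], t), 0.0) for t in TAGS}
    for word in x[1:]:
        prev = alpha
        alpha = {
            t: emit.get((word, t), 0.0)
            + logsumexp([prev[p] + trans.get((p, t), 0.0) for p in TAGS])
            for t in TAGS
        }
    return logsumexp(list(alpha.values()))

def brute_force(x, emit, trans):
    """Same quantity by summing over every tagging, for checking only."""
    totals = []
    for y in product(TAGS, repeat=len(x)):
        s = sum(emit.get((w, t), 0.0) for w, t in zip(x, y))
        s += sum(trans.get(pair, 0.0) for pair in zip(y, y[1:]))
        totals.append(s)
    return logsumexp(totals)

x = "Fruit flies like a banana".split()
emit = {("Fruit", "A"): 1.0, ("flies", "N"): 1.0, ("like", "V"): 1.0,
        ("a", "D"): 2.0, ("banana", "N"): 2.0}
trans = {("A", "N"): 1.0, ("N", "V"): 1.0, ("V", "D"): 1.0, ("D", "N"): 1.0}

log_z = forward(x, emit, trans)
print(log_z, brute_force(x, emit, trans))
```

Working in log space with the max subtracted inside `logsumexp` avoids overflow when scores are large.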

Gradient

\min_{\mathbf{w}}\sum_i\Big(\log{\sum_{y}\exp{\big(\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}}-\mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)\Big)

Backward

\nabla_{\mathbf{w}}\log{\sum_{y'}\exp{\big(\mathbf{w}\cdot\mathbf{f}({x},{y}')\big)}} = \mathbb{E}_{p(y'|x)}[\mathbf{f}(x,y')]

[Figure: backward pass over the tag lattice (tags N, V, D, A, P), extended one word at a time. Values shown at each step, in slide order:
step 1: 2.1, 0.5, -1.8, 0.2, 0.2
step 2: -1.2, 0.6, 2.8, 0.3, 0.1
step 3: -0.2, 2.6, 1.8, 1.3, 0.1
step 4: 3.2, 1.6, 0.8, 0.4, 0.2
step 5: 2.2, 1.6, 2.8, 3.3, 1.1
final value: 8.9]

Forward-Backward

Combining the forward and backward values gives the marginal distribution over tags at each word.

Marginal Inference

Each vector below is the tag marginal for one word, with entries ordered N, V, D, A, P as in the lattice rows:

\left[ \begin{array}{c} 0.1\\ {\color{red}{0.4}}\\ 0.0\\ 0.2\\ 0.3 \end{array} \right]
\left[ \begin{array}{c} 0.2\\ 0.2\\ 0.1\\ {\color{red}{0.4}}\\ 0.1 \end{array} \right]
\left[ \begin{array}{c} {\color{red}{0.7}}\\ 0.1\\ 0.1\\ 0.1\\ 0.0 \end{array} \right]
\left[ \begin{array}{c} 0.1\\ 0.1\\ {\color{red}{0.4}}\\ 0.2\\ 0.2 \end{array} \right]
\left[ \begin{array}{c} {\color{red}{0.5}}\\ 0.0\\ 0.1\\ 0.2\\ 0.2 \end{array} \right]

Marginal Decoding

Pick the tag with the highest marginal at each word (marked in red above).
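A compact sketch of forward-backward and marginal decoding follows. It assumes the same hypothetical emission and transition scores as a stand-in for \mathbf{w}\cdot\mathbf{f} (the numbers are made up and unrelated to the vectors on the slides) and includes a brute-force marginal purely as a check.

```python
import math
from itertools import product

TAGS = ["N", "V", "D", "A", "P"]

def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def marginals(x, emit, trans):
    """Per-word tag marginals p(y^j = t | x) via forward-backward in log space."""
    n = len(x)

    def e(j, t):
        return emit.get((x[j], t), 0.0)

    def a(p, t):
        return trans.get((p, t), 0.0)

    alpha = [{t: e(0, t) for t in TAGS}]
    for j in range(1, n):
        prev = alpha[-1]
        alpha.append({t: e(j, t) + logsumexp([prev[p] + a(p, t) for p in TAGS])
                      for t in TAGS})
    beta = [{t: 0.0 for t in TAGS}]
    for j in range(n - 2, -1, -1):
        nxt = beta[0]
        beta.insert(0, {t: logsumexp([a(t, q) + e(j + 1, q) + nxt[q] for q in TAGS])
                        for t in TAGS})
    log_z = logsumexp(list(alpha[-1].values()))
    return [{t: math.exp(alpha[j][t] + beta[j][t] - log_z) for t in TAGS}
            for j in range(n)]

def brute_force_marginal(x, emit, trans, j, tag):
    """Same marginal by summing over every tagging, for checking only."""
    def total(y):
        return (sum(emit.get((w, t), 0.0) for w, t in zip(x, y))
                + sum(trans.get(pair, 0.0) for pair in zip(y, y[1:])))
    all_y = list(product(TAGS, repeat=len(x)))
    log_z = logsumexp([total(y) for y in all_y])
    return sum(math.exp(total(y) - log_z) for y in all_y if y[j] == tag)

x = "Fruit flies like a banana".split()
emit = {("Fruit", "A"): 1.0, ("flies", "N"): 1.0, ("like", "V"): 1.0,
        ("a", "D"): 2.0, ("banana", "N"): 2.0}
trans = {("A", "N"): 1.0, ("N", "V"): 1.0, ("V", "D"): 1.0, ("D", "N"): 1.0}

m = marginals(x, emit, trans)
decoded = [max(col, key=col.get) for col in m]  # marginal decoding
print(decoded)
```

Marginal decoding picks each tag independently, so the resulting sequence need not equal the Viterbi path, although the two often agree.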

Learning: CRF

[Figure: one update from \mathbf{w}^{(k)} to \mathbf{w}^{(k+1)}, computed from the labeled graph L and a \log\sum\exp over the unlabeled graph U]

Margin

Structured Perceptron

\min_{\mathbf{w}}\sum_i\Big(\max_y{\big(\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}-\mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)\Big)

Max-Margin

Add a cost \Delta(y_i,y) inside the \max of the structured perceptron loss:

\min_{\mathbf{w}}\sum_i\Big(\max_y{\big(\color{brown}{\Delta(y_i,y)}+\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}-\mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)\Big)

Example costs:
\color{brown}{\Delta(\textbf{D},\textbf{P})=1, \Delta(\textbf{N},\textbf{N})=0, \Delta(\textbf{N},\textbf{V})=10}

Decode with Oracle

\max_y{\big(\color{brown}{\Delta(y_i,y)}+\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}

The inner \max_y is itself a decoding problem: the cost of each candidate tag against the gold ("oracle") tag is added to its score before searching.
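The effect of adding the cost can be seen on a two-word example. This sketch reuses the three example costs and assumes, purely for illustration, that every unlisted mismatch costs 1, that \Delta sums over positions, and that features are (word, tag) indicators. The search is brute force.

```python
from itertools import product

TAGS = ["N", "V", "D", "A", "P"]
COST = {("D", "P"): 1.0, ("N", "V"): 10.0}  # (gold tag, predicted tag)

def delta(y_gold, y):
    """Per-position cost, summed; unlisted mismatches cost 1 (an assumption)."""
    return sum(COST.get((g, t), 1.0) for g, t in zip(y_gold, y) if g != t)

def score(w, x, y):
    """w . f(x, y) with hypothetical (word, tag) indicator features."""
    return sum(w.get((word, tag), 0.0) for word, tag in zip(x, y))

def decode(w, x):
    return max(product(TAGS, repeat=len(x)), key=lambda y: score(w, x, y))

def cost_augmented_decode(w, x, y_gold):
    return max(product(TAGS, repeat=len(x)),
               key=lambda y: delta(y_gold, y) + score(w, x, y))

def max_margin_loss(w, x, y_gold):
    y_hat = cost_augmented_decode(w, x, y_gold)
    return delta(y_gold, y_hat) + score(w, x, y_hat) - score(w, x, y_gold)

x, y_gold = ["flies", "a"], ("N", "D")
w = {("flies", "N"): 3.0, ("a", "D"): 3.0}

print(decode(w, x))                         # ('N', 'D')
print(cost_augmented_decode(w, x, y_gold))  # ('V', 'D')
print(max_margin_loss(w, x, y_gold))        # 7.0
```

Plain decoding already returns the gold tags here, so the perceptron would stop updating. The max-margin loss is still positive because the gold tagging does not yet beat the costly N-to-V confusion by its required margin of 10.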

Structural SVM

\min_{\mathbf{w}}\sum_i\Big(\max_y{\big({\Delta(y_i,y)}+\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}-\mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)\Big)

Tsochantaridis, I., Joachims, T., Hofmann, T., & Altun, Y. (2005). Large margin methods for structured and interdependent output variables.

"Soft"Max-Margin

CRF:
\min_{\mathbf{w}}\sum_i\Big(\log\sum_y\exp{\big(\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}-\mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)\Big)

"Soft"Max-Margin CRF:
\min_{\mathbf{w}}\sum_i\Big(\log\sum_y\exp{\big(\color{brown}{\Delta(y_i,y)}+\mathbf{w}\cdot\mathbf{f}({x}_i,{y})\big)}-\mathbf{w}\cdot\mathbf{f}({x}_i,{y}_i)\Big)

Gimpel, K., & Smith, N. A. (2010). Softmax-margin CRFs: Training log-linear models with cost functions. In NAACL-HLT.
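The four losses seen so far can be compared side by side on one example. The sketch below assumes a Hamming-distance cost and (word, tag) indicator features, both chosen only for illustration, and evaluates everything by enumeration.

```python
import math
from itertools import product

TAGS = ["N", "V", "D", "A", "P"]

def score(w, x, y):
    """w . f(x, y) with hypothetical (word, tag) indicator features."""
    return sum(w.get((word, tag), 0.0) for word, tag in zip(x, y))

def delta(y_gold, y):
    """Assumed cost: number of mismatched tags (Hamming distance)."""
    return float(sum(g != t for g, t in zip(y_gold, y)))

def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def losses(w, x, y_gold):
    ys = list(product(TAGS, repeat=len(x)))
    gold = score(w, x, y_gold)
    plain = [score(w, x, y) for y in ys]
    augmented = [delta(y_gold, y) + score(w, x, y) for y in ys]
    return {
        "perceptron": max(plain) - gold,
        "max_margin": max(augmented) - gold,
        "crf": logsumexp(plain) - gold,
        "softmax_margin": logsumexp(augmented) - gold,
    }

x, y_gold = ["Fruit", "flies"], ("A", "N")
w = {("Fruit", "A"): 0.5, ("flies", "N"): 0.5}

result = losses(w, x, y_gold)
for name, value in result.items():
    print(name, round(value, 4))
```

Since the cost is non-negative and \log\sum\exp is never below \max, the softmax-margin loss upper-bounds both the CRF loss and the max-margin loss, and each of those upper-bounds the perceptron loss.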

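Learning: Structured Perceptron, Structural SVM

[Figure: one update from \mathbf{w}^{(k)} to \mathbf{w}^{(k+1)}, computed from the labeled graph L and a \max over the unlabeled graph U, with cost \Delta(y,y')]

Learning: CRF, "Soft"Max-Margin CRF

[Figure: one update from \mathbf{w}^{(k)} to \mathbf{w}^{(k+1)}, computed from the labeled graph L and a \log\sum\exp over the unlabeled graph U, with cost \Delta(y,y')]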

So Far

Decoding

  • Finding the optimal y for a given x and given parameters w

Learning

  • Finding the optimal w from training examples

Next...

Variants of structured prediction models