Sequence Processing Tasks
Recurrent Neural Networks
[Figure: an RNN unrolled over five time steps. Each input \( x_i \) enters through \( U \), the states \( s_0, s_1, \dots, s_5 \) are chained through \( W \), and each state \( s_i \) produces an output \( \hat{y}_i \) through \( V \).]
\( s_i = RNN (s_{i-1},x_i)\)
Input: Find a cheap Chinese restaurant
Output tags: VB DT JJ JJ NN
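[Figure: a single RNN step. The input \( x_i \) enters through \( U \) and the previous state \( s_{i-1} \) through \( W \) to give the state \( s_i \), which produces the output \( \hat{y}_i \) through \( V \).]
\( V \in \mathbb{R}^{36 \times d}\)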
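The recurrence \( s_i = RNN(s_{i-1}, x_i) \) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the slides do not give the form of the RNN function, so a tanh state update and a softmax output are assumed here, biases are omitted, and the inputs are random vectors standing in for word representations. The sizes `d` and `e` and all helper names are hypothetical; only the 36 rows of \( V \) come from the slide.

```python
import math
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    top = max(z)
    exps = [math.exp(score - top) for score in z]
    total = sum(exps)
    return [e_k / total for e_k in exps]

def rnn_step(U, W, V, s_prev, x):
    """One step: s_i = tanh(U x_i + W s_{i-1}), y_hat_i = softmax(V s_i).
    The tanh and softmax choices are assumptions, not taken from the slides."""
    pre = [a + b for a, b in zip(matvec(U, x), matvec(W, s_prev))]
    s = [math.tanh(p) for p in pre]
    return s, softmax(matvec(V, s))

def rnn_tag(U, W, V, s0, xs):
    """Run the RNN over a sequence and collect one output per input."""
    s, outputs = s0, []
    for x in xs:
        s, y_hat = rnn_step(U, W, V, s, x)
        outputs.append(y_hat)
    return outputs

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d, e, n_tags = 4, 3, 36          # d and e are hypothetical sizes
U = random_matrix(d, e, rng)     # input -> state
W = random_matrix(d, d, rng)     # state -> state
V = random_matrix(n_tags, d, rng)  # state -> output, shape 36 x d
s0 = [0.0] * d
xs = [[rng.uniform(-1, 1) for _ in range(e)] for _ in range(5)]  # five stand-in inputs
outputs = rnn_tag(U, W, V, s0, xs)
```

Each entry of `outputs` is a distribution over 36 classes, one per input position. The same \( U \), \( W \), and \( V \) are reused at every step, which is why the unrolled diagram repeats those labels.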
HMM vs RNN
HMMs are simpler than RNNs
HMMs have fewer parameters and hence require less training data
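To make the parameter comparison concrete, here is a rough count under assumed sizes. Everything numeric below is hypothetical except the 36 output classes: the vocabulary size, embedding size, and state size are picked only for illustration, the RNN is assumed to use a learned embedding table with no biases, and the counts are table or matrix entries rather than free parameters after normalization. With other sizes the comparison can come out differently.

```python
def hmm_param_count(n_tags, vocab_size):
    """Entries in the HMM tables: transitions, emissions, initial distribution."""
    transitions = n_tags * n_tags
    emissions = n_tags * vocab_size
    initial = n_tags
    return transitions + emissions + initial

def rnn_param_count(n_tags, vocab_size, embed_dim, state_dim):
    """Entries in the RNN matrices (no biases), plus an assumed embedding table."""
    embeddings = vocab_size * embed_dim
    u = state_dim * embed_dim      # U: input -> state
    w = state_dim * state_dim      # W: state -> state
    v = n_tags * state_dim         # V: state -> output
    return embeddings + u + w + v

# Hypothetical sizes: 10,000-word vocabulary, 100-dim embeddings and states.
hmm_total = hmm_param_count(n_tags=36, vocab_size=10_000)
rnn_total = rnn_param_count(n_tags=36, vocab_size=10_000, embed_dim=100, state_dim=100)
print(hmm_total, rnn_total)
```

Under these assumptions most of the RNN's parameters sit in the embedding table, while the HMM's are dominated by the emission table.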
\( P(y_t | y_{t-1}, \dots, y_2, y_1) = P(y_t | y_{t-1}) \)
\( P(w_6 | w_5, w_4, \dots, w_1) \)
[Figure: an RNN unrolled over five time steps. The inputs \( x_1, \dots, x_5 \) (the words "Find me a cheap Chinese") enter through \( U \), the states \( s_0, s_1, \dots, s_5 \) are chained through \( W \), and only the final state \( s_5 \) produces an output \( \hat{y}_5 \) through \( V \).]
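[Figure: an HMM tagging chain. The hidden tags VB, DT, JJ are linked in sequence, and each tag emits its word: Find, a, cheap.]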
\(P(x) = \sum_y P(x,y) \)
\( = \sum_y P(x|y)P(y)\)
\( = \sum_y \prod_tP(x_t|y_t)\prod_tP(y_t|y_{t-1})\)
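The sum over all tag sequences can be checked on a toy example. The sketch below uses made-up probabilities for the three tags and three words from the slide (none of these numbers come from the source), and treats \( P(y_1 | y_0) \) as a start distribution. It computes \( P(x) \) twice: by enumerating every tag sequence exactly as the formula is written, and by the forward recursion, a standard dynamic-programming shortcut that the slides do not mention but which gives the same value without enumerating sequences.

```python
from itertools import product

tags = ["VB", "DT", "JJ"]
words = ["Find", "a", "cheap"]

# Hypothetical probabilities, chosen only for illustration.
start = {"VB": 0.6, "DT": 0.3, "JJ": 0.1}              # P(y_1 | start)
trans = {                                               # P(y_t | y_{t-1})
    "VB": {"VB": 0.1, "DT": 0.7, "JJ": 0.2},
    "DT": {"VB": 0.1, "DT": 0.1, "JJ": 0.8},
    "JJ": {"VB": 0.2, "DT": 0.2, "JJ": 0.6},
}
emit = {                                                # P(x_t | y_t)
    "VB": {"Find": 0.8, "a": 0.1, "cheap": 0.1},
    "DT": {"Find": 0.1, "a": 0.8, "cheap": 0.1},
    "JJ": {"Find": 0.1, "a": 0.1, "cheap": 0.8},
}

def joint(xs, ys):
    """P(x, y) = prod_t P(x_t | y_t) * prod_t P(y_t | y_{t-1})."""
    p = start[ys[0]] * emit[ys[0]][xs[0]]
    for t in range(1, len(xs)):
        p *= trans[ys[t - 1]][ys[t]] * emit[ys[t]][xs[t]]
    return p

def marginal_brute_force(xs):
    """P(x) as a literal sum over every possible tag sequence."""
    return sum(joint(xs, ys) for ys in product(tags, repeat=len(xs)))

def marginal_forward(xs):
    """P(x) via the forward recursion over per-tag partial sums."""
    alpha = {y: start[y] * emit[y][xs[0]] for y in tags}
    for x in xs[1:]:
        alpha = {
            y: emit[y][x] * sum(alpha[y_prev] * trans[y_prev][y] for y_prev in tags)
            for y in tags
        }
    return sum(alpha.values())

p_brute = marginal_brute_force(words)
p_forward = marginal_forward(words)
print(p_brute, p_forward)
```

The brute-force version visits \( 3^3 = 27 \) sequences here and grows exponentially with sentence length, while the forward recursion does a fixed amount of work per word.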
[Figure: an RNN unrolled over five time steps. The inputs \( x_1, \dots, x_5 \) enter through \( U \), the states \( s_0, s_1, \dots, s_5 \) are chained through \( W \), and a single output \( \hat{y}_5 \) is read from the final state \( s_5 \) through \( V \).]
\( P(\hat{y}_5 | x_5, x_4, \dots, x_1) \)
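Unrolling the recurrence \( s_i = RNN(s_{i-1}, x_i) \) shows why the output is conditioned on every input. This is a sketch that assumes the output is read from the last state through some function \( g \) applied to \( V s_5 \), for example a softmax; the name \( g \) is introduced here and does not appear in the slides.
\( s_5 = RNN(RNN(RNN(RNN(RNN(s_0, x_1), x_2), x_3), x_4), x_5) \)
\( \hat{y}_5 = g(V s_5) \)
So \( \hat{y}_5 \) is a function of \( s_0 \) and all of \( x_1, \dots, x_5 \), whereas in the HMM factorization each \( y_t \) depends directly only on \( y_{t-1} \).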