Slot Induction
陳家陞 / 羅凱齡
- Goal
- Reference
- Model
Goal
Pre-previous
Slots defined by domain experts, requiring human labor.
Previous
Use SEMAFOR to extract possible slot candidates, then re-rank them.
Now (The Goal)
A modular unsupervised slot induction model that reuses the re-ranking method from the previous work.
Reference
MUSE
- An unsupervised approach to learn the sense embedding given raw text.
- Separates sense selection and embedding learning into two modules, linked by a probability that acts as the reward.
GloVe
- A well-known unsupervised approach that learns interpretable word embeddings from raw text.
- Based on word-pair co-occurrence counts, with a weighting function that limits the influence of extreme (very high and very low) counts.
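That weighting function can be sketched as follows (a minimal sketch; the defaults x_max = 100 and alpha = 0.75 follow the values reported in the GloVe paper):

```python
# GloVe's co-occurrence weighting f(x): grows as (x / x_max)^alpha,
# then caps at 1 so very frequent pairs cannot dominate the loss,
# while rare pairs get down-weighted.
def glove_weight(x, x_max=100.0, alpha=0.75):
    return (x / x_max) ** alpha if x < x_max else 1.0
```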
Model
Modules
- Module Selection
- Module Embedding
Module Selection
An RL agent trained to pick the correct slot for each word; the reward comes from Module Embedding.
Module Embedding
Slot Embedding
The module learns the slot embeddings iteratively, contributing to more precise selection.
Module Embedding
Desired properties
- Slot embeddings should be close (in vector space) to the embeddings of their related words.
- Relations between slots should mirror the relations between the corresponding words.
- Slot embeddings should not be dominated by very-high-frequency words.
Module Embedding
Reward
Suppose W_b is in W_a's context, and the selection module gives
\mathcal{M}_{SEL}(W_i) = S_i, \quad i \in \{a, b\}
Then
Reward = \mathrm{CosSim}(W_a - W_b, S_a - S_b)
(not necessarily CosSim)
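A minimal numpy sketch of this reward under the CosSim choice (function and variable names are illustrative, not from the slides):

```python
import numpy as np

def cos_sim(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def reward(W_a, W_b, S_a, S_b):
    """Reward for the selection module: how well the direction between
    the selected slots matches the direction between the words."""
    return cos_sim(W_a - W_b, S_a - S_b)
```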
Module Embedding
Update slot embedding -- Naïve
For W_a and W_b in context:
S_a = S_a + \alpha (W_a - S_a), \quad 0 < \alpha \leq 1
S_b = S_b + \alpha (W_b - S_b), \quad 0 < \alpha \leq 1
Module Embedding
Update slot embedding -- Naïve (Cont'd)
(Negative sampling) For W_a and W_b not in context:
S_a = S_a - \alpha (W_a - S_a), \quad 0 < \alpha \leq 1
S_b = S_b - \alpha (W_b - S_b), \quad 0 < \alpha \leq 1
Experiments needed here
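The two update rules above (move the slot toward a co-occurring word, away from a negative-sampled one) can be sketched as follows; `naive_update` is a hypothetical helper name, not from the slides:

```python
import numpy as np

def naive_update(S, W, alpha, positive=True):
    """Move slot embedding S toward word embedding W (positive pair)
    or away from it (negative sample). Requires 0 < alpha <= 1."""
    assert 0.0 < alpha <= 1.0
    sign = 1.0 if positive else -1.0
    return S + sign * alpha * (W - S)
```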
Module Embedding
Update slot embedding -- GloVe-like
GloVe optimizes:
J = f(X_{ij}) (w_i^T \tilde{w}_j - \log X_{ij})^2
We optimize:
J = f(X_{ij}) (s_i^T \tilde{w}_j - \log X_{ij})^2
Fix all word embeddings; only move the slot embeddings.
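One gradient step on this objective, holding the word embedding fixed, can be sketched as follows (a minimal sketch; the function name and hyperparameter defaults are illustrative, with f taken as GloVe's standard weighting):

```python
import numpy as np

def slot_glove_step(s_i, w_j_tilde, X_ij, lr=0.05, x_max=100.0, a=0.75):
    """One gradient step on J = f(X_ij) * (s_i^T w~_j - log X_ij)^2,
    updating only the slot embedding s_i; w~_j stays fixed."""
    f = min((X_ij / x_max) ** a, 1.0)           # GloVe weighting
    err = float(np.dot(s_i, w_j_tilde)) - np.log(X_ij)
    grad = 2.0 * f * err * w_j_tilde            # dJ/ds_i
    return s_i - lr * grad
```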
Module Embedding
Update slot embedding -- GloVe-like
What does that mean?
「泰式」(Thai-style) and 「餐廳」(restaurant) have high correlation.
-> 「類型」(type) and 「餐廳」(restaurant) should also have high correlation.
w_i = 「泰式」, s_i = 「類型」