"The quick brown fox..."
"the quick brown fox..."
["the", "quick", "brown", ...]
[3, 6732, 1199, ...]
???
✨
negative
raw input
cleaning
tokenization
word representation
model
output
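
roughly, in code (a minimal sketch of the stages above; the `clean` and `tokenize` helpers here are hypothetical placeholders, not from any particular library):

```python
# minimal sketch of the pipeline: cleaning -> tokenization
def clean(text: str) -> str:
    # cleaning: lowercase and strip punctuation
    return "".join(c for c in text.lower() if c.isalnum() or c.isspace())

def tokenize(text: str) -> list[str]:
    # tokenization: naive whitespace split
    return text.split()

raw = "The quick brown fox..."
tokens = tokenize(clean(raw))   # ["the", "quick", "brown", "fox"]
# word representation, model, and output come next
```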
"the quick brown fox..." → ["the", "quick", "brown", "fox",...]
"the" → [1, 0, 0, 0, 0, ...]
"quick" → [0, 1, 0, 0, 0, ...]
this is the approach used in Naive Bayes
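
a minimal one-hot sketch over a toy vocabulary (assuming numpy; the vocabulary here is just the example tokens, not a real corpus):

```python
import numpy as np

vocab = ["the", "quick", "brown", "fox"]           # toy vocabulary
index = {word: i for i, word in enumerate(vocab)}  # word -> position

def one_hot(word: str) -> np.ndarray:
    # a vector of zeros with a single 1 at the word's index
    vec = np.zeros(len(vocab))
    vec[index[word]] = 1.0
    return vec

one_hot("the")    # [1., 0., 0., 0.]
one_hot("quick")  # [0., 1., 0., 0.]
```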
what are we actually asking our model to learn?
this feels like we're asking our model to do quite a lot
"the quick brown fox..." → ["the", "quick", "brown", "fox",...]
"the" → [-0.13, 1.67, 3.96, -2.22, -0.01, ...]
"quick" → [3.23, 1.89, -2.66, 0.12, -3.01, ...]
intuition: close vectors represent similar words
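
the "close vectors ≈ similar words" intuition, sketched with cosine similarity (the vectors below are made-up illustrative numbers, not from a trained model):

```python
import numpy as np

# illustrative embeddings (made-up values, not trained)
embeddings = {
    "quick": np.array([3.23, 1.89, -2.66, 0.12, -3.01]),
    "fast":  np.array([3.10, 1.75, -2.50, 0.30, -2.90]),
    "fox":   np.array([-0.40, 2.10, 0.85, -1.70, 0.55]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # cosine similarity: near 1.0 means the vectors point the same way
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cosine(embeddings["quick"], embeddings["fast"])  # close to 1 -> similar words
cosine(embeddings["quick"], embeddings["fox"])   # much smaller -> less similar
```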
two common approaches to training
GloVe: factorizes a co-occurrence matrix, good fast results at both small and huge scales
w2v: shallow NN trained to predict a masked/context word, very fast to train, nice Python support (sketch below)
both result in something close to a Hilbert space
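
a quick sketch of the w2v route (assuming gensim 4.x; the toy corpus below is just for illustration, real training needs far more text):

```python
from gensim.models import Word2Vec

# toy corpus: a list of tokenized sentences
sentences = [
    ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"],
    ["the", "quick", "red", "fox", "runs", "past", "the", "sleepy", "dog"],
]

# skip-gram word2vec with small vectors; parameters chosen for the toy example
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)

model.wv["quick"]                    # a 50-dimensional dense vector
model.wv.similarity("quick", "fox")  # cosine similarity between two words
model.wv.most_similar("fox")         # nearest neighbours in the vector space
```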
(The only scary slide)