Recap from last week
Without any Exclusion List
With Exclusion List (Categories - DMOZ)
With Exclusion List (~1000 Homonyms)
With Exclusion List (~100 000 Most Common Words)
With Exclusion List (~10 000 Most Common Words)
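The slides do not say how each exclusion list was applied. As a minimal sketch, assuming a list is used to drop candidate terms before further processing, the filtering step might look like the following. The function name `filter_terms` and the sample words are hypothetical and not taken from the project.

```python
def filter_terms(terms, exclusion_list):
    """Return the terms that do not appear in the exclusion list.

    Matching is case-insensitive (an assumption); the original order
    and casing of the surviving terms are preserved.
    """
    excluded = {word.lower() for word in exclusion_list}
    return [term for term in terms if term.lower() not in excluded]


# Made-up example data, for illustration only.
candidates = ["Bank", "python", "the", "Jaguar"]
common_words = ["the", "bank"]
kept = filter_terms(candidates, common_words)
```

Swapping in a different list (DMOZ categories, homonyms, or the most common words) only changes the `exclusion_list` argument, which makes it easy to compare the variants listed above.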
Current Status:
(much larger data set: 5 categories, 1000 seeds each)
For the next step: Neural Networks
Each vector corresponds to a seed in the seed file
...
...
Solution:
Create multiple trees instead of a single tree
Get all of Adult's children
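One possible reading of "multiple trees" is one tree per top-level category, with a lookup that collects everything below a given root such as Adult. The sketch below assumes categories are available as slash-separated paths (as in DMOZ-style hierarchies). The helper names `build_forest` and `all_children` and the sample paths are hypothetical.

```python
from collections import defaultdict


def build_forest(paths):
    """Build a forest from slash-separated category paths.

    Nodes are identified by their full path, so identically named
    subcategories under different roots stay separate.
    Returns (children, roots): a parent -> set-of-children mapping
    and the set of top-level categories.
    """
    children = defaultdict(set)
    roots = set()
    for path in paths:
        parts = path.split("/")
        roots.add(parts[0])
        for i in range(1, len(parts)):
            parent = "/".join(parts[:i])
            child = "/".join(parts[: i + 1])
            children[parent].add(child)
    return children, roots


def all_children(children, root):
    """Return every descendant of `root` using an iterative depth-first walk."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        for child in sorted(children.get(node, ())):
            found.append(child)
            stack.append(child)
    return found


# Made-up category paths, for illustration only.
paths = ["Adult/Arts", "Adult/Arts/Photography", "Adult/Games", "Sports/Soccer"]
children, roots = build_forest(paths)
adult_descendants = all_children(children, "Adult")
```

Because each root owns its own subtree, collecting all of Adult's children never touches the other categories.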