More robust data from crawler
DMOZ common word exclusion
Next steps
Song titles
Restaurants
Movies
Cities
TV Shows
Point: Categories spelled in non-English characters
Not included in list of common words
[removed from TopN list]
After removing categories with non-English characters: 568219
After removing repeated categories: 156160
...
...
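The two filtering steps above could be sketched as follows. This is only an illustration, not the crawler's actual code: the function name `clean_categories` and the sample list are hypothetical, and "non-English characters" is assumed here to mean any non-ASCII character, which may be stricter or looser than the rule actually used.

```python
def clean_categories(categories):
    """Return (english_only, unique) lists of category names.

    Assumption: a category counts as "non-English" if it contains
    any non-ASCII character.
    """
    # Step 1: drop categories spelled with non-ASCII characters.
    english_only = [c for c in categories if c.isascii()]
    # Step 2: drop repeated categories, keeping first-seen order.
    unique = list(dict.fromkeys(english_only))
    return english_only, unique


# Hypothetical sample input, not real crawler data.
sample = ["Restaurants", "Movies", "Рестораны", "Movies", "Cities", "映画"]
english_only, unique = clean_categories(sample)
print(len(english_only), len(unique))
```

Comparing the lengths of the two returned lists gives counts analogous to the "after removing" figures above. Note that `str.isascii` requires Python 3.7 or later.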
Restaurant TopN file:
[Given N = 10]