Hadoop on Windows

Publication tone (author confidence) detection

Detect whether the authors describe their research results as outstanding, average, or “good enough”.

Initial Dataset

Process with Pig

Process with MapReduce
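
To illustrate the shape of this step, here is a minimal sketch that simulates map, shuffle, and reduce in plain Python, with no Hadoop involved. The slides do not show the actual job, so the tone word list, the sample records, and every function name below are hypothetical.

```python
from collections import defaultdict

# Hypothetical tone vocabulary; the real word list is not given in the slides.
TONE_WORDS = {"outstanding", "novel", "average", "comparable", "sufficient", "acceptable"}


def map_phase(doc_id, text):
    """Emit (word, 1) for every tone word found in one document."""
    for token in text.lower().split():
        word = token.strip(".,;:!?()\"'")
        if word in TONE_WORDS:
            yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


def run_job(documents):
    """Run the simulated job over a dict of {doc_id: text}."""
    pairs = [pair for doc_id, text in documents.items() for pair in map_phase(doc_id, text)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = {
        "paper-1": "We report outstanding accuracy with a novel method.",
        "paper-2": "Results are comparable to the baseline and acceptable.",
    }
    print(run_job(sample))
```

On a real cluster the mapper and reducer would run as separate tasks and the framework would handle the grouping; the `shuffle` helper only stands in for that.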

Naive sentiments 
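
One way a naive tone detector could work is by counting lexicon hits per category and picking the category with the most matches. The sketch below assumes that approach. The three labels come from the slides, but the word lists and the `classify_tone` function are hypothetical and not taken from the original project.

```python
# Hypothetical lexicons, one per tone label; the original word lists are not shown.
LEXICON = {
    "outstanding": {"outstanding", "excellent", "remarkable", "significantly"},
    "average": {"average", "comparable", "moderate", "similar"},
    "good-enough": {"sufficient", "acceptable", "adequate", "reasonable"},
}


def classify_tone(text):
    """Return the label whose lexicon matches the most tokens, or 'unknown' if none match."""
    tokens = [token.strip(".,;:!?()\"'") for token in text.lower().split()]
    scores = {
        label: sum(token in words for token in tokens)
        for label, words in LEXICON.items()
    }
    best = max(scores, key=scores.get)  # ties resolve to the first label in LEXICON
    return best if scores[best] > 0 else "unknown"


if __name__ == "__main__":
    print(classify_tone("Our method achieves excellent and remarkable results."))
    print(classify_tone("The accuracy is acceptable for this task."))
```

A pure word count ignores negation and context ("not outstanding" still counts as a hit), which is why the approach is called naive.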

Results of processing 

Results of processing (color-coded view)

Pain points

over 9000 ...

  • compile Hadoop
  • install Ant
  • install Cygwin
  • compile Pig
  • Hadoop config files
  • Hadoop YARN?
  • no JobHistory on Windows
  • add JAR dependencies to Hadoop
  • Hadoop 1 vs Hadoop 2: hdfs vs dfs
  • no more hard disk space
  • etc.

Hadoop

By Stefan Hagiu