Research Software Sentiment Analyser

Raquel Alegre

Carl Wilson

Sinan Shi

David Perez-Suarez

Gary MacIndoe

Olivier Philippe



Problem we are trying to address:

  • Measure research software credit, both negative and positive.


  • Potential users of research software
  • Authors of research software




  • To make qualitative choices based on software credit
  • To find out information about their user community

Twitter Client 

Sentiment Analysis


Data Analysis and Visualization



raquelalegre;2015-03-20;I really like this sof

oliph;2015-03-23;Downloaded this library th

shisinan;2015-03-25;Check out the latest fro


raquelalegre;[...], 103

oliph;[...], 45

shisinan;[...], 97


Twitter Client


  • Raquel Alegre
  • Olivier Philippe
  1. Given a search term, searches Twitter worldwide for tweets about it, up to a limit of 1000.
  2. Processes the results into a dictionary of tweets.
  3. Saves results to a CSV file for the Sentiment Analyser.
  • Test Driven Development


  • Python
  • Tweepy
  • JSON
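
The hand-off from the Twitter Client to the Sentiment Analyser can be sketched roughly as follows. This is a minimal illustration, not the team's code: it does not call Tweepy at all, the `sample` list is a hypothetical stand-in for search results, and the function names and the `user;date;text` column layout are assumptions based on the sample rows shown earlier.

```python
import csv
import io

MAX_TWEETS = 1000  # the cap mentioned in step 1


def tweets_to_rows(raw_tweets, limit=MAX_TWEETS):
    """Turn raw search results into a list of flat tweet dictionaries."""
    rows = []
    for tweet in raw_tweets[:limit]:
        rows.append({
            "user": tweet["user"],
            "date": tweet["date"],
            # Collapse newlines so each tweet stays on one CSV row.
            "text": " ".join(tweet["text"].split()),
        })
    return rows


def write_tweets_csv(rows, handle):
    """Write one semicolon-separated row per tweet: user;date;text."""
    writer = csv.writer(handle, delimiter=";")
    for row in rows:
        writer.writerow([row["user"], row["date"], row["text"]])


# Hypothetical stand-in for what a Twitter search might return.
sample = [
    {"user": "alice", "date": "2015-03-20", "text": "Nice library,\nworks well"},
    {"user": "bob", "date": "2015-03-23", "text": "Installation failed for me"},
]
buffer = io.StringIO()
write_tweets_csv(tweets_to_rows(sample), buffer)
print(buffer.getvalue())
```

Writing to any file-like object (here an in-memory buffer) keeps the conversion easy to unit test, which fits the test-driven approach listed above.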

Sentiment Analysis


  • Gary MacIndoe
  • Carl Wilson


  • Java 8
  • Stanford NLP
  • OpenCSV
  • Maven
  • Travis-CI
  1. Read input CSV file.
  2. Remove @usernames, #hashtags and URLs from tweet content using the twitter-text library.
  3. Score tweet content using the Stanford NLP library.
  4. Write output CSV file with score appended to each row.
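
The four steps above can be illustrated with a short Python sketch (the actual component was written in Java). Two things here are swapped in and should not be read as the team's implementation: the regular expressions are a simplified substitute for the twitter-text library, and `placeholder_score` is a hypothetical cue-word counter standing in for the Stanford NLP model. The `user;date;text` column layout is also an assumption.

```python
import csv
import io
import re

# Simplified patterns; the twitter-text library handles many more edge cases.
URL_RE = re.compile(r"https?://\S+")
MENTION_OR_HASHTAG_RE = re.compile(r"[@#]\w+")


def clean_tweet(text):
    """Strip URLs, @usernames and #hashtags, then tidy whitespace."""
    text = URL_RE.sub("", text)
    text = MENTION_OR_HASHTAG_RE.sub("", text)
    return " ".join(text.split())


def placeholder_score(text):
    """Hypothetical stand-in for the real sentiment model: counts cue words."""
    words = text.lower().split()
    positive = sum(w in {"like", "love", "great"} for w in words)
    negative = sum(w in {"hate", "broken", "bad"} for w in words)
    return positive - negative


def score_csv(source, target, scorer=placeholder_score):
    """Copy user;date;text rows to target with a score column appended."""
    reader = csv.reader(source, delimiter=";")
    writer = csv.writer(target, delimiter=";")
    for row in reader:
        if not row:
            continue  # skip blank lines
        writer.writerow(row + [scorer(clean_tweet(row[2]))])


source = io.StringIO(
    "alice;2015-03-20;I really like this @tool #rse http://example.org\n"
)
target = io.StringIO()
score_csv(source, target)
print(target.getvalue())
```

Passing the scorer as a parameter keeps the cleaning and CSV handling independent of whichever sentiment model is plugged in.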



Data Analysis and Visualization


  • David Perez-Suarez
  • Sinan Shi


  • Python
  • Flask
  • Pandas
  • Bokeh
  1. Grab user input from website.
  2. Call Twitter Client to search for tweets about user input.
  3. Call Sentiment Analyser to classify tweets.
  4. Make and visualize plot and GitHub badge.
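
As a rough idea of the kind of aggregation that could feed the plot in step 4, the sketch below averages scores per day. It uses only the Python standard library rather than Pandas or Bokeh, and the input layout (`user;date;text;score`), the function name and the sample rows are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_score_per_day(handle):
    """Return {date: mean score} from user;date;text;score rows."""
    scores = defaultdict(list)
    for row in csv.reader(handle, delimiter=";"):
        if not row:
            continue  # skip blank lines
        date, score = row[1], float(row[-1])
        scores[date].append(score)
    # Sort by date so the result is ready to plot as a time series.
    return {date: mean(values) for date, values in sorted(scores.items())}


# Hypothetical scored output from the Sentiment Analyser.
scored = io.StringIO(
    "alice;2015-03-20;Works well;1\n"
    "bob;2015-03-20;Could not install it;-1\n"
    "carol;2015-03-23;Great docs;2\n"
)
daily = mean_score_per_day(scored)
print(daily)
```

The resulting date-to-score mapping could then be handed to a plotting library or condensed into a single value for a badge.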


Future development:

  • Data mining from other sources
    • getpapers, StackOverflow, ...
  • Comparisons between research software tools
  • Live metrics
  • Geolocation
  • Improvement of the Sentiment Analysis:
    • Add languages other than English
    • Testing
    • Bias
      • tweets about one's own software tools
      • retweets - exponential impact

Research Software Sentiment Analyser - Hack Day

Presentation by the winning team of the SSI Collaborations Workshop '16 about work carried out during the Hack Day

