Research Software Sentiment Analyser

Raquel Alegre

Carl Wilson

Sinan Shi

David Perez-Suarez

Gary MacIndoe

Olivier Philippe

Problem we are trying to address:

  • Measure research software credit, both negative and positive.

Audience:

  • Potential users of research software
  • Authors of research software

Benefits:

  • To make qualitative choices based on software credit
  • To find out information about their user community

Pipeline: Twitter Client → Sentiment Analysis → Data Analysis and Visualization

Tweets.csv (written by the Twitter Client):

author;date;text;geo;id
raquelalegre;2015-03-20;I really like this sof
oliph;2015-03-23;Downloaded this library th
shisinan;2015-03-25;Check out the latest fro

After Sentiment Analysis (score appended to each row):

author;date;text;geo;id;sentiment_data
raquelalegre;[...];103
oliph;[...];45
shisinan;[...];97

Demo!

Twitter Client

Authors:

  • Raquel Alegre
  • Olivier Philippe

Technologies/Methodologies:

  • Python
  • Tweepy
  • JSON
  • Test Driven Development
  1. Given a search term, searches Twitter worldwide for tweets about it, limited to 1000 tweets.
  2. Processes the results into a dictionary of tweets.
  3. Saves the results to a CSV file for the Sentiment Analyser.
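The client's last step, saving tweets to a CSV file, can be sketched as follows. This is a minimal illustration, not the project's actual code: the column order is taken from the Tweets.csv header on the earlier slide, while the function name `write_tweets` and the sample tweet are hypothetical. No Twitter search is performed here.

```python
import csv
import io

# Column order taken from the Tweets.csv header shown in the slides.
FIELDS = ["author", "date", "text", "geo", "id"]


def write_tweets(tweets, out):
    """Write an iterable of tweet dicts to `out` as semicolon-delimited CSV."""
    writer = csv.DictWriter(
        out,
        fieldnames=FIELDS,
        delimiter=";",
        extrasaction="ignore",  # silently drop keys that are not CSV columns
        lineterminator="\n",
    )
    writer.writeheader()
    for tweet in tweets:
        writer.writerow(tweet)


# Hypothetical tweet; a real run would build these from the search results.
sample = [
    {"author": "user_a", "date": "2015-03-20",
     "text": "Nice library; works well", "geo": "", "id": "1"},
]

buffer = io.StringIO()
write_tweets(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Because tweet text may itself contain a semicolon, the `csv` module quotes such fields, so the reader on the other side should use a real CSV parser and not a plain string split.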

Sentiment Analysis

Authors:

  • Gary MacIndoe
  • Carl Wilson

Technologies:

  • Java 8
  • Stanford NLP
  • OpenCSV
  • Maven
  • Travis-CI
  1. Read input CSV file.
  2. Remove @usernames, #hashtags and URLs from the tweet content using the twitter-text library.
  3. Score the tweet content using the Stanford NLP library.
  4. Write output CSV file with score appended to each row.
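The cleaning step (step 2) can be illustrated with a small sketch. The project uses Java and the twitter-text library; the Python version below swaps in plain regular expressions, which is a simplification and will miss edge cases that twitter-text handles. The function name `clean_tweet` and the example tweet are hypothetical.

```python
import re

# Simplified patterns, assumed for illustration only.
# URLs are removed first so that '#' or '@' inside a URL is not matched separately.
URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"@\w+")
HASHTAG = re.compile(r"#\w+")


def clean_tweet(text):
    """Remove URLs, @usernames and #hashtags, then collapse whitespace."""
    for pattern in (URL, MENTION, HASHTAG):
        text = pattern.sub(" ", text)
    return " ".join(text.split())


cleaned = clean_tweet(
    "@someone check out #mytool at http://example.org it is great"
)
print(cleaned)
```

The cleaned text is what would then be passed to the sentiment scorer, so that handles, tags and links do not influence the score.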

Website

Authors:

  • David Perez-Suarez
  • Sinan Shi

Technologies:

  • Python
  • Flask
  • Pandas
  • Bokeh
  1. Get the user's search term from the website.
  2. Call the Twitter Client to search for tweets about that term.
  3. Call the Sentiment Analyser to classify the tweets.
  4. Generate and display the plot and a GitHub badge.
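As a rough sketch of the data preparation behind step 4, the scored CSV could be reduced to one value per day before plotting. Averaging by date is an assumption about what the plot shows, not something the slides state. The project uses Pandas and Bokeh; the version below uses only the standard library to stay self-contained, and the function name `mean_score_by_date` and the sample rows (including their scores) are made up.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def mean_score_by_date(csv_file):
    """Return {date: mean sentiment score} from a semicolon-delimited CSV."""
    scores = defaultdict(list)
    for row in csv.DictReader(csv_file, delimiter=";"):
        scores[row["date"]].append(float(row["sentiment_data"]))
    # Sort by date so the result is ready to plot as a time series.
    return {date: mean(values) for date, values in sorted(scores.items())}


# Hypothetical rows in the Sentiment Analyser's output format.
sample = io.StringIO(
    "author;date;text;geo;id;sentiment_data\n"
    "user_a;2015-03-20;first;;1;100\n"
    "user_b;2015-03-20;second;;2;50\n"
    "user_c;2015-03-21;third;;3;80\n"
)

daily = mean_score_by_date(sample)
print(daily)
```

The resulting mapping can be handed to any plotting library as x (dates) and y (mean scores).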

Future development:

  • Data mining from other sources
    • getpapers, StackOverflow, ...
  • Comparisons between research software tools
  • Live metrics
  • Geolocation
  • Improvement of the Sentiment Analysis:
    • Add languages other than English
    • Testing
    • Bias
      • tweets about one's software tools
      • retweets - exponential impact
