Research Software Sentiment Analyser
Raquel Alegre
Carl Wilson
Sinan Shi
David Perez-Suarez
Gary MacIndoe
Olivier Philippe
![](https://s3.amazonaws.com/media-p.slid.es/uploads/403772/images/2378536/Screen_Shot_2016-03-22_at_16.31.03.png)
Problem we are trying to address:
- Measure research software credit, both negative and positive.
Audience:
- Potential users of research software
- Authors of research software
Benefits:
- Users can make informed choices based on software credit
- Authors can find out information about their user community
Twitter Client
Sentiment Analysis
Data Analysis and Visualization
![](https://s3.amazonaws.com/media-p.slid.es/uploads/403772/images/2380414/Twitter_logo_blue.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/403772/images/2378536/Screen_Shot_2016-03-22_at_16.31.03.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/403772/images/2378444/Sentiment_Chart.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/403772/images/2380431/mswb_0101.png)
Tweets.csv
author;date;text;geo;id
raquelalegre;2015-03-20;I really like this sof
oliph;2015-03-23;Downloaded this library th
shisinan;2015-03-25;Check out the latest fro
author;date;text;geo;id;sentiment_data
raquelalegre;[...];103
oliph;[...];45
shisinan;[...];97
![](https://s3.amazonaws.com/media-p.slid.es/uploads/403772/images/2378405/Screen_Shot_2016-03-22_at_15.47.34.png)
Demo!
Twitter Client
Authors:
- Raquel Alegre
- Olivier Philippe
- Given a search term, searches worldwide for tweets about it, limited to 1000 results.
- Processes the results into a dictionary of tweets.
- Saves the results to a CSV file for the Sentiment Analyser.
Technologies/Methodologies:
- Python
- Tweepy
- JSON
- Test Driven Development
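The processing and saving steps above might look roughly like the sketch below. It leaves out the Tweepy search call itself and starts from results that have already been fetched. The function names `to_rows` and `save_tweets`, and the assumption that each fetched tweet is a dict keyed by the CSV field names, are illustrative and not taken from the project code. The column order, the semicolon delimiter and the 1000-tweet limit follow the slides.

```python
import csv
import io

# Column order taken from the Tweets.csv example in the slides.
FIELDS = ["author", "date", "text", "geo", "id"]
MAX_TWEETS = 1000  # search limit stated in the slides


def to_rows(raw_tweets, limit=MAX_TWEETS):
    """Keep at most `limit` tweets and reduce each to the CSV fields.

    `raw_tweets` is assumed to be an iterable of dicts that already
    use the field names above; missing fields become empty strings.
    """
    rows = []
    for tweet in raw_tweets:
        if len(rows) >= limit:
            break
        rows.append({field: tweet.get(field, "") for field in FIELDS})
    return rows


def save_tweets(rows, handle):
    """Write rows as semicolon-delimited CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, delimiter=";")
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Made-up sample tweet, for illustration only.
    sample = [{"author": "someuser", "date": "2016-03-22",
               "text": "Trying out a new library", "geo": "", "id": "1"}]
    buffer = io.StringIO()
    save_tweets(to_rows(sample), buffer)
    print(buffer.getvalue())
```

Writing to a file handle rather than a path keeps the sketch easy to test: pass `open("Tweets.csv", "w", newline="")` for a real file or an `io.StringIO()` in a test.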
Sentiment Analysis
Authors:
- Gary Macindoe
- Carl Wilson
Technologies:
- Java 8
- Stanford NLP
- OpenCSV
- Maven
- Travis-CI
- Read input CSV file.
- Remove @usernames, #hashtags and URLs from tweet content using twitter-text library.
- Score tweet content using Stanford NLP library.
- Write output CSV file with score appended to each row.
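The four steps above can be sketched in Python for illustration, although the project component is written in Java. Two substitutions are made here and should be read as assumptions: plain regular expressions stand in for the twitter-text library, and the `scorer` argument is a placeholder for the Stanford NLP scoring call. The names `clean_tweet` and `score_file` are hypothetical.

```python
import csv
import io
import re

# Simple regular expressions standing in for the twitter-text library;
# they are approximations and will miss edge cases that library handles.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def clean_tweet(text):
    """Strip URLs, @usernames and #hashtags, then collapse whitespace."""
    # URLs go first so that '#' or '@' inside a link is removed with it.
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.split())


def score_file(src, dst, scorer):
    """Copy a semicolon-delimited CSV, appending one score per row.

    `scorer` is any callable mapping cleaned text to a number; in the
    project this role is played by the Stanford NLP library.
    """
    reader = csv.DictReader(src, delimiter=";")
    fields = list(reader.fieldnames) + ["sentiment_data"]
    writer = csv.DictWriter(dst, fieldnames=fields, delimiter=";")
    writer.writeheader()
    for row in reader:
        # The original text column is kept; only the score is added.
        row["sentiment_data"] = scorer(clean_tweet(row["text"]))
        writer.writerow(row)
```

Passing the scorer in as a callable keeps the file handling independent of whichever sentiment library does the actual scoring.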
![](https://s3.amazonaws.com/media-p.slid.es/uploads/403772/images/2382988/Screen_Shot_2016-03-23_at_15.00.34.png)
Website
Authors:
- David Perez-Suarez
- Sinan Shi
Technologies:
- Python
- Flask
- Pandas
- Bokeh
- Grab user input from website.
- Call Twitter Client to search for tweets about user input.
- Call Sentiment Analyser to classify tweets.
- Make and visualize plot and GitHub badge.
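As a rough illustration of the last step, the sketch below reduces scored tweets to the data a plot and a badge would need. It uses only the Python standard library rather than Flask, Pandas or Bokeh, so it shows the aggregation logic and not the project's actual web code. The function names and the `threshold` cut-off are hypothetical, since the slides do not say how scores map to positive or negative sentiment.

```python
from collections import defaultdict
from statistics import mean


def daily_means(rows):
    """Average the sentiment score per date, sorted by date.

    `rows` is assumed to be an iterable of dicts with a `date` string
    and a numeric `sentiment_data` value, as in the output CSV.
    """
    by_day = defaultdict(list)
    for row in rows:
        by_day[row["date"]].append(float(row["sentiment_data"]))
    return [(day, mean(scores)) for day, scores in sorted(by_day.items())]


def badge(term, scores, threshold):
    """Return (label, message, colour) for a simple status badge.

    `threshold` is a hypothetical cut-off chosen by the caller.
    """
    if not scores:
        return (term, "no data", "lightgrey")
    positive = mean(scores) >= threshold
    return (term, "positive" if positive else "negative",
            "green" if positive else "red")
```

The list returned by `daily_means` could be handed to any plotting library as x and y values, and the tuple from `badge` to whatever renders the badge image.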
![](https://s3.amazonaws.com/media-p.slid.es/uploads/403772/images/2383103/Screenshot_from_2016-03-23_15-25-47.png)
Future development:
- Data mining from other sources
  - getpapers, StackOverflow, ...
- Comparisons between research software tools
- Live metrics
- Geolocation
- Improvement of the Sentiment Analysis:
  - Add languages other than English
  - Testing
  - Bias
    - tweets about one's software tools
    - retweets - exponential impact
Research Software Sentiment Analyser - Hack Day
By Raquel Alegre
Presentation by the winning team of the SSI Collaborations Workshop '16 about work carried out during the Hack Day.