Using Sentiment Analysis of Twitter to Develop Strategies for Library Data Sharing Partnerships
#cni14f
Sara Mannheimer
@saramannheimer
Scott W. H. Young
@hei_scott
Outline
Open Data Movement
Sentiment Analysis and Twitter
Results and Application for Data Services
Research Limitations
Next Steps
Ethical Considerations
Open Data
“Sunlight is said to be the best of disinfectants.”
- Justice Louis Brandeis, as quoted by Barack Obama
Open data policies + portals by state
“[This story] is proof that anyone can use Open Data to improve our neighborhoods and our great city, one small discovery at a time.”
- Ben Wellington, I Quant NY
Before Open Data
After Open Data
“Over the next few years we have an astonishing opportunity to change and improve the way science is done.”
- Michael Nielsen, Reinventing Discovery
Investigators are expected to share ... the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.
Data Underlying Published Research Results Will Be Accessible and Open Immediately. The foundation will require that data underlying the published research results be immediately accessible and open.
Joint Data Archiving Policy
“Minimal Dataset:” The dataset used to reach the conclusions drawn in the manuscript
Pilot Research Design
Sentiment Analysis
The extraction of positive or negative opinions from (unstructured) text
- Pang & Lee, 2008
Sentiment Analysis + Data Sharing
Identify researchers in 6 disciplines
Pull data from Twitter API
Isolate data-related tweets
Code sentiment
:)
:(
:|
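A minimal sketch of the "isolate data-related tweets" and "code sentiment" steps, assuming a simple keyword filter and a word-list scorer. The slides do not say how the team filtered or coded tweets, so the pattern, the word lists, the function names, and the sample tweets are all hypothetical.

```python
import re

# Hypothetical filter and word lists for illustration; not taken from the study.
DATA_TERMS = re.compile(r"\bdata(set|sets)?\b", re.IGNORECASE)
POSITIVE = {"great", "love", "excited", "useful", "thanks"}
NEGATIVE = {"hate", "broken", "frustrating", "useless", "lost"}


def is_data_related(text):
    """Return True if the tweet mentions 'data', 'dataset', or 'datasets'."""
    return DATA_TERMS.search(text) is not None


def code_sentiment(text):
    """Label a tweet by counting positive words minus negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"  # :)
    if score < 0:
        return "negative"  # :(
    return "neutral"       # :|


# Made-up tweets, standing in for text pulled from the Twitter API.
tweets = [
    "Excited to share our new dataset with everyone!",
    "Lost a week to a broken data portal. Frustrating.",
    "Uploading data to the repository today.",
    "Coffee first, then seminar.",
]

# Keep only data-related tweets, then attach a sentiment label to each.
coded = [(t, code_sentiment(t)) for t in tweets if is_data_related(t)]
for text, label in coded:
    print(label, "|", text)
```

Filtering first keeps the coding step limited to tweets about data, which is the order the steps imply.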
Positive Tweets
Paleontologists 47%
Economists 38%
Genomicists 26%
Chemists 16%
Sociologists 12%
Electrical Engineers 4%
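How a per-discipline figure like these could be computed, assuming each percentage is the share of a discipline's data-related tweets that were coded positive. The slides do not state the denominator, so that reading is an assumption. The function name and the sample rows are hypothetical, and the sample is not the study's data.

```python
from collections import Counter, defaultdict


def positive_share(coded_tweets):
    """Return {discipline: percent of its coded tweets labeled 'positive'}."""
    counts = defaultdict(Counter)
    for discipline, label in coded_tweets:
        counts[discipline][label] += 1
    return {
        discipline: round(100 * labels["positive"] / sum(labels.values()))
        for discipline, labels in counts.items()
    }


# Made-up (discipline, label) pairs for illustration only.
sample = [
    ("Paleontologists", "positive"),
    ("Paleontologists", "neutral"),
    ("Chemists", "negative"),
    ("Chemists", "negative"),
    ("Chemists", "positive"),
    ("Chemists", "neutral"),
]

shares = positive_share(sample)
for discipline, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{discipline} {pct}%")
```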
Limitations & Biases
Social Media
(is not perfect)
Sampling Bias
Twitter users are more likely to be:
City Dwellers
Male
Younger
More Educated
African American
Not Scientists
Sentiment Analysis
(is not perfect)
:) or :(
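One way to see the limitation: a word-counting scorer ignores negation and sarcasm. The sketch below is a hypothetical scorer with made-up word lists and made-up tweets, not the method used in this research. It labels both examples positive, although a human reader would likely code them negative.

```python
# Hypothetical word lists for illustration; not taken from the study.
POSITIVE = {"great", "love", "useful"}
NEGATIVE = {"broken", "useless", "awful"}


def naive_label(text):
    """Label text by counting positive words minus negative words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Negation: "not useful" still counts "useful" as a positive word.
negated = naive_label("This data portal is not useful.")

# Sarcasm: "Great" is counted literally.
sarcastic = naive_label("Great, the dataset link is dead again.")

print(negated, sarcastic)
```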
Big Data
(is not perfect)
"Big" Data
Data at machine-speed and machine-scale, with analysis through computation
“The ability to harvest the wealth of information contained in biomedical Big Data will advance our understanding of human health and disease.”
- NIH Big Data to Knowledge
“Numbers are seductive because they look like answers.”
Why is it important to investigate the ethics of social media (big) data collection and analysis?
“[The work] was consistent with Facebook’s Data Use Policy, to which all users agree prior to creating an account on Facebook, constituting informed consent for this research.”
- Authors of the Facebook Emotional Contagion Study
Informed Consent
“This study is a scandal because it brought Facebook’s troubling practices into a realm—academia—where we still have standards of treating people with dignity and serving the common good.”
Ethical concerns for social media (big) data research
Copyright and Data Publication
Communication with Subjects
Transparency of Research
Self-Awareness
Takeaways + Next Steps
Collaboration = scale
Gather clues to guide outreach
Ethical framework for social media analysis
Thank you!
Sara Mannheimer
@saramannheimer
Scott W.H. Young
@hei_scott
Research Team
Jim Espeland, Software Engineer
Kyle Steiner, Research Assistant