Using Sentiment Analysis of Twitter to Develop Strategies for Library Data Sharing Partnerships

#cni14f

Sara Mannheimer

@saramannheimer

 

Scott W. H. Young

@hei_scott

Outline

Open Data Movement

Sentiment Analysis and Twitter

Results and Application for Data Services

Research Limitations

Next Steps

Ethical Considerations

Open Data

“Sunlight is said to be the best of disinfectants.”

- Justice Louis Brandeis, as quoted by Barack Obama

Open data policies + portals by state

“[This story] is proof that anyone can use Open Data to improve our neighborhoods and our great city, one small discovery at a time.”

- Ben Wellington, I Quant NY

Before Open Data

After Open Data

 

 

Over the next few years we have an astonishing opportunity to change and improve the way science is done.

- Michael Nielsen, Reinventing Discovery

Investigators are expected to share ... the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.

Data Underlying Published Research Results Will Be Accessible and Open Immediately. The foundation will require that data underlying the published research results be immediately accessible and open. 

Joint Data Archiving Policy

“Minimal Dataset:” The dataset used to reach the conclusions drawn in the manuscript

Pilot Research Design

Sentiment Analysis

The extraction of positive or negative opinions from (unstructured) text

- Pang & Lee, 2008

Sentiment Analysis + Data Sharing

Identify researchers in 6 disciplines

Pull data from Twitter API

Isolate data-related tweets

Code sentiment

:)

:(

:|
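The isolate-and-code steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the keyword filter `DATA_TERMS`, the helper names, and the sample tweets are all hypothetical, and the sketch assumes each tweet has already been hand-coded as `:)`, `:(`, or `:|`. It also assumes the percentage is the share of data-related tweets coded positive within each discipline.

```python
from collections import Counter

# Hypothetical keyword list for spotting data-related tweets
# (an assumption, not the study's actual filter).
DATA_TERMS = ("data", "dataset")


def is_data_related(text):
    """Return True if the tweet text mentions any of the assumed data terms."""
    lowered = text.lower()
    return any(term in lowered for term in DATA_TERMS)


def percent_positive(coded_tweets):
    """Percent of data-related tweets coded positive, per discipline.

    coded_tweets: iterable of (discipline, text, label) tuples, where label
    is ":)", ":(", or ":|" as assigned by a human coder.
    """
    totals = Counter()
    positives = Counter()
    for discipline, text, label in coded_tweets:
        if not is_data_related(text):
            continue  # skip tweets that are not about data
        totals[discipline] += 1
        if label == ":)":
            positives[discipline] += 1
    return {d: round(100 * positives[d] / totals[d]) for d in totals}


# Made-up tweets for illustration only; not data from the study.
sample = [
    ("Paleontologists", "Love that this fossil dataset is open!", ":)"),
    ("Paleontologists", "Data sharing mandate is a headache", ":("),
    ("Chemists", "Lunch was great today", ":)"),
    ("Chemists", "Uploading my data to a repository", ":|"),
]

print(percent_positive(sample))  # {'Paleontologists': 50, 'Chemists': 0}
```

A plain substring match is the simplest possible filter and will miss or over-match some tweets; a real project would likely refine it by hand.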

Positive Tweets

 

Paleontologists        47%
Economists             38%
Genomicists            26%
Chemists               16%
Sociologists           12%
Electrical Engineers    4%

Limitations & Biases

Social Media

(is not perfect)

Sampling Bias

Twitter Users are more likely to be

City Dwellers

Male

Younger

More Educated

African American

Not Scientists

Sentiment Analysis

(is not perfect)

:)    or    :(

Big Data

(is not perfect)

"Big" Data

Data at machine-speed and machine-scale,

with analysis through computation

“The ability to harvest the wealth of information contained in biomedical Big Data will advance our understanding of human health and disease.”

— NIH Big Data to Knowledge

“Numbers are seductive because they look like answers.”

Why is it important to investigate the ethics of social media (big) data collection and analysis?

“[The work] was consistent with Facebook’s Data Use Policy, to which all users agree prior to creating an account on Facebook, constituting informed consent for this research.”

— Authors of the Facebook Emotional Contagion Study

Informed Consent

“This study is a scandal because it brought Facebook’s troubling practices into a realm—academia—where we still have standards of treating people with dignity and serving the common good.”

Ethical concerns for social media (big) data research

Copyright and Data Publication

Communication with Subjects

Transparency of Research

Self-Awareness 

Takeaways + Next Steps

Collaboration = scale

Gather clues to guide outreach

Ethical framework for social media analysis

Thank you!

Sara Mannheimer

@saramannheimer

 

Scott W. H. Young

@hei_scott

 

Research Team

Jim Espeland, Software Engineer

Kyle Steiner, Research Assistant
