SENTIMENT ANALYSIS OF TWEETS
Prateek Sharma (19)
Rajesh Pathak (42)
Sunny Kumar (43)
Overview
According to Wikipedia, sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials.
- A major benefit of social media is that we can see the good and bad things people say about a particular brand or personality.
- The bigger your company gets, the more difficult it becomes to keep a handle on how everyone feels about your brand. For large companies with thousands of daily mentions on social media, news sites and blogs, it is extremely difficult to do this manually.
- To combat this problem, sentiment analysis software is necessary. Such software can be used to evaluate people's sentiment about a particular brand or personality.
Overview (..contd)
Proposed System
Data Source
Data Source (..contd)
In order to access Twitter Streaming API, we need to get 4 pieces of information from Twitter: API key, API secret, Access token and Access token secret. Follow the steps below to get all 4 elements:
- Create a Twitter account if you do not already have one.
- Go to https://apps.twitter.com/ and log in with your Twitter credentials.
- Click "Create New App"
- Fill out the form, agree to the terms, and click "Create your Twitter application"
- On the next page, click on the "API keys" tab, and copy your "API key" and "API secret".
- Scroll down and click "Create my access token", and copy your "Access token" and "Access token secret".
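The four values collected above are best kept out of source code. A minimal sketch, assuming they are stored in environment variables (the variable names below are our own choice, not something Twitter prescribes):

```python
import os

# Hypothetical environment variable names; choose whatever fits your setup.
REQUIRED_KEYS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the four Twitter credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

The resulting dictionary can then be handed to whichever Twitter client library is used, for example tweepy.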
API
- Streaming API
- - The Streaming APIs give developers low latency access to Twitter’s global stream of Tweet data.
- REST API
- - The REST APIs provide programmatic access to read and write Twitter data. Author a new Tweet, read author profile and follower data, and more.
Sentiment Analysis
Positive tweet
“Yeah!!! i bought a htc XONE+ stylish design, and great battery.”
Negative tweet
“watched @Troy, movie full of violence not my type, boo.. wastage of money don't watch”
Data Format (JSON)
Filter Raw Data
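Each tweet arrives from the Streaming API as one JSON object. A minimal sketch of the filtering step, assuming the raw stream was saved one JSON object per line and that only English tweets are wanted (both are assumptions). The fields `text`, `lang`, `created_at` and `user.screen_name` belong to the standard tweet object; the function name is hypothetical:

```python
import json


def filter_tweets(lines, lang="en"):
    """Keep only the fields needed for sentiment analysis from raw JSON lines."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # the stream may contain blank keep-alive lines
        try:
            tweet = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed records
        if not isinstance(tweet, dict) or "text" not in tweet:
            continue  # not a tweet (e.g. a delete or limit notice)
        if tweet.get("lang") != lang:
            continue  # drop tweets in other languages
        kept.append({
            "text": tweet["text"],
            "created_at": tweet.get("created_at"),
            "user": (tweet.get("user") or {}).get("screen_name"),
        })
    return kept
```

Dropping everything except the text, timestamp and author keeps the later processing steps small and fast.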
Data Format (..contd)
Emoticons
- Happy: :)
- Sad: :(
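Emoticons give a cheap sentiment signal. A minimal sketch, assuming a small hand-picked emoticon list (only `:)` and `:(` come from the slide; the others are illustrative additions, and the function name is hypothetical):

```python
POSITIVE_EMOTICONS = (":)", ":-)", ":D")  # ":)" is from the slide; the rest are assumed
NEGATIVE_EMOTICONS = (":(", ":-(")        # ":(" is from the slide; the rest are assumed


def emoticon_label(text):
    """Return 'positive', 'negative', or None when the signal is absent or mixed."""
    has_pos = any(e in text for e in POSITIVE_EMOTICONS)
    has_neg = any(e in text for e in NEGATIVE_EMOTICONS)
    if has_pos == has_neg:
        return None  # no emoticon at all, or both kinds present
    return "positive" if has_pos else "negative"
```

Tweets with no emoticon, or with both kinds, return `None` and are left for the classifier to decide.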
Feature Reduction
- @username
- URL
- Repeated letters (huuunnnggryyy, cutieeeeee, ..)
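The three reductions above can be sketched with regular expressions. The placeholder tokens `USERNAME` and `URL`, and the choice to squeeze runs of three or more identical characters down to two, are assumptions here rather than something the slides specify:

```python
import re

USERNAME_RE = re.compile(r"@\w+")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
REPEAT_RE = re.compile(r"(\w)\1{2,}")  # same character three or more times in a row


def reduce_features(text):
    """Normalize a tweet so that equivalent tokens collapse to one feature."""
    text = text.lower()
    text = URL_RE.sub("URL", text)            # any link becomes the token URL
    text = USERNAME_RE.sub("USERNAME", text)  # any mention becomes USERNAME
    text = REPEAT_RE.sub(r"\1\1", text)       # "cutieeeeee" becomes "cutiee"
    return text
```

Keeping two letters instead of one preserves the difference between an emphasized word and its normal spelling while still merging all the elongated variants.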
Machine Learning Tools
Naive Bayes, maximum entropy, and support vector machines
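Of the three classifiers, Naive Bayes is the simplest to show end to end. Below is a minimal from-scratch sketch of a multinomial Naive Bayes classifier with add-one smoothing. The class name and the four training tweets are invented for illustration; a real system would train on a large labeled corpus, typically through a library such as NLTK.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace-separated tokens."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # log prior + sum of log likelihoods with add-one (Laplace) smoothing
            score = math.log(doc_count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
train_texts = [
    "stylish design and great battery",
    "love this phone great screen",
    "movie full of violence boo",
    "waste of money do not watch",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_texts, train_labels)
```

Calling `model.predict("great battery")` then returns the label whose training tweets make those words most likely.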
Approach
Visualization
Now that all the work is done, we can create histograms and other plots to visualize the sentiment of users.
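A minimal sketch of such a plot with Matplotlib, assuming each tweet has already been labeled "positive" or "negative". The function names, output file name and colors are illustrative choices:

```python
from collections import Counter


def count_sentiments(labels):
    """Count how many tweets fall into each sentiment class."""
    counts = Counter(labels)
    return {name: counts.get(name, 0) for name in ("positive", "negative")}


def plot_sentiments(counts, path="sentiments.png"):
    """Draw a bar chart of the counts and save it to a file (needs matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(list(counts.keys()), list(counts.values()), color=["green", "red"])
    ax.set_xlabel("Sentiment")
    ax.set_ylabel("Number of tweets")
    ax.set_title("Tweet sentiment distribution")
    fig.savefig(path)
    plt.close(fig)
```

A typical call would be `plot_sentiments(count_sentiments(labels))`, where `labels` is the list of predicted classes.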
Visualization Using Maps
Tools
- D3.js
JavaScript library for creating data-driven documents.
- Matplotlib
Python library for visualization.
- ggplot2
Visualization package for the R programming language.
Challenges
- API restriction
The server and our local machine process tweets at different rates, so after a certain time the API restricts the application.
- Geo Location restriction
It is difficult to find a tweet's origin, which makes clustering tweets from the same place a challenging task.
- Language restriction
People tweet in their native languages (e.g. Hindi, Gujarati), and identifying sentiment in these is not possible with current NLTK tooling.
References
- https://dev.twitter.com/
- tweepy.com
- apps.twitter.com
- nlp.stanford.edu/sentiment/
- opendata-tools.org/en/visualization/
- cs.stanford.edu/people/alecmgo/papers/Twitter....pdf