Prateek Sharma (19)
Rajesh Pathak (42)
Sunny Kumar (43)
Overview
According to Wikipedia, sentiment analysis (also known as opinion mining) refers to the use of natural language processing, text analysis, and computational linguistics to identify and extract subjective information in source materials.
Overview (..contd)
Proposed System
Data Source
Data Source (..contd)
To access the Twitter Streaming API, we need four pieces of information from Twitter: the API key, API secret, access token, and access token secret. Follow the steps below to get all four:
API
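Once the four credentials have been obtained, one common way to keep them out of source code is to read them from environment variables. The sketch below is only an illustration: the variable names and the helper `load_twitter_credentials` are hypothetical and not part of any Twitter library.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical environment variable names (an assumption, not a Twitter convention).
CREDENTIAL_VARS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the four credentials, raising KeyError if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in CREDENTIAL_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing Twitter credentials: " + ", ".join(missing))
    return {name: env[name] for name in CREDENTIAL_VARS}
```

The returned dictionary can then be handed to whichever Twitter client library is used; failing early on a missing value avoids confusing authentication errors later.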
Sentiment Analysis
Positive tweet
“Yeah!!! i bought a htc XONE+ stylish design, and great battery.”
Negative tweet
“watched @Troy, movie full of violence not my type, boo.. wastage of money don't watch”
Data Format (JSON)
Filter Raw Data
Data Format (..contd)
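Filtering the raw JSON stream might look roughly like the sketch below. It assumes one JSON object per line and uses the `text`, `lang`, and `id_str` fields of a tweet object; the function name `filter_tweets` and the decision to keep only English tweets are illustrative assumptions.

```python
import json
from typing import Dict, Iterable, Iterator


def filter_tweets(lines: Iterable[str], lang: str = "en") -> Iterator[Dict[str, str]]:
    """Yield {'id', 'text'} for each well-formed tweet in the given language."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the stream
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupt records
        if not isinstance(record, dict) or "text" not in record:
            continue  # skip non-tweet messages such as delete notices
        if record.get("lang") != lang:
            continue  # keep only the requested language
        yield {"id": record.get("id_str", ""), "text": record["text"]}
```

Keeping only the identifier and text discards most of the raw record, which is usually all the sentiment step needs.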
Emoticons
Feature Reduction
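As a rough illustration of feature reduction on tweets, the sketch below replaces usernames and links with placeholder tokens, maps a few emoticons to sentiment markers, and collapses letters repeated three or more times. The emoticon table, the placeholder names (`USER`, `URL`, `EMO_POS`, `EMO_NEG`), and the function `reduce_features` are assumptions chosen for the example, not the exact rules used in this project.

```python
import re
from typing import List

# Hypothetical, deliberately tiny emoticon table.
EMOTICONS = {
    ":)": "EMO_POS", ":-)": "EMO_POS", ":D": "EMO_POS",
    ":(": "EMO_NEG", ":-(": "EMO_NEG",
}

URL_RE = re.compile(r"(?:https?://|www\.)\S+")
MENTION_RE = re.compile(r"@\w+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # the same character three or more times
PUNCTUATION = ".,!?;:\"'"


def reduce_features(text: str) -> List[str]:
    """Turn a raw tweet into a short list of normalized tokens."""
    tokens = []
    for raw in text.split():
        if raw in EMOTICONS:
            tokens.append(EMOTICONS[raw])
            continue
        word = raw.lower()
        if URL_RE.match(word):
            tokens.append("URL")  # all links collapse to one feature
            continue
        if MENTION_RE.match(word):
            tokens.append("USER")  # all usernames collapse to one feature
            continue
        word = REPEAT_RE.sub(r"\1\1", word)  # "loooove" -> "loove"
        word = word.strip(PUNCTUATION)
        if word:
            tokens.append(word)
    return tokens
```

Collapsing usernames, links, and stretched words shrinks the vocabulary, so the classifier sees fewer, more frequent features.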
Machine Learning Tools
Naive Bayes, maximum entropy, and support vector machines
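To make the first of these concrete, here is a minimal multinomial Naive Bayes classifier with add-one smoothing, written from scratch so it has no dependencies. It is a sketch only: the class name, the toy training sentences, and the labels are assumptions for illustration, and a real system would more likely rely on an existing library implementation.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set, Tuple


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self) -> None:
        self.class_counts: Counter = Counter()
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocab: Set[str] = set()

    def train(self, examples: Iterable[Tuple[List[str], str]]) -> None:
        for tokens, label in examples:
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def predict(self, tokens: List[str]) -> str:
        if not self.class_counts:
            raise ValueError("classifier has not been trained")
        total_docs = sum(self.class_counts.values())
        best_label, best_score = "", -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (hypothetical), loosely echoing the example tweets.
model = NaiveBayes()
model.train([
    (["great", "battery", "stylish", "design"], "positive"),
    (["love", "this", "phone"], "positive"),
    (["boo", "wastage", "of", "money"], "negative"),
    (["full", "of", "violence", "not", "my", "type"], "negative"),
])
print(model.predict(["stylish", "phone", "great"]))
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the smoothing keeps a word unseen in one class from zeroing out that class entirely.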
Approach
Visualization
Now that all the work is done, we can create histograms and other plots to visualize the sentiments of users.
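Before reaching for a plotting library, a quick text histogram of the predicted labels can serve as a sanity check. The sketch below is dependency-free; the function name `sentiment_histogram` and the bar format are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List


def sentiment_histogram(labels: Iterable[str], width: int = 40) -> List[str]:
    """Return one text bar per label, scaled so the largest bar is `width` wide."""
    counts = Counter(labels)
    if not counts:
        return []
    largest = max(counts.values())
    lines = []
    for label, count in sorted(counts.items()):
        bar = "#" * max(1, round(width * count / largest))
        lines.append(f"{label:<10}{bar} {count}")
    return lines


for line in sentiment_histogram(["positive", "positive", "positive", "negative"], width=6):
    print(line)
```

The same label counts can be passed to a charting tool later to produce a proper bar chart.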
Visualization Using Maps
Tools
JavaScript library for creating data-driven documents.
Python tools for visualization
Visualization tool in the R programming language
Challenges
The server and our local machine process tweets at different rates, so after a certain time the server restricts the app developer's access.
It is difficult to find a tweet's origin, so clustering tweets from the same place is a challenging task.
People also tweet in their native languages (e.g. Hindi, Gujarati), and sentiment in these tweets cannot be identified with current NLTK tools.
References