Extracting Least Biased Tweets under a Hashtag using various Machine Learning Techniques
#what
To present relatively unbiased posts for a given Twitter hashtag, using machine learning techniques.
keywords
#filter #tweet #hashtag #ML #twitter #bias #socialmedia #neutrality
#why
Extreme thoughts during a person's mood swings are often reflected in the posts they make, whether intentionally or not.
When people become emotionally invested in a particular issue, they tend to form a biased opinion on it.
While this may seem harmless at first, the more influential the person is, the greater the effect their opinion will have.
How does Social Media influence our opinions?
Social platforms use learning algorithms that filter content based on your previous choices, making it hard for you to see more general content with an open view.
This is commonly known as
"The Filter Bubble"
So, every time you do this..
your preferences in your filter change.
How does Twitter do it?
"In order to predict whether a particular Tweet would be engaging to you, our models consider characteristics (or features) of:
- The Tweet itself: its recency, presence of media cards (image or video), total interactions (e.g. number of Retweets or likes).
- The Tweet's author: your past interactions with this author, the strength of your connection to them, the origin of your relationship.
- You: Tweets you found engaging in the past, how often and how heavily you use Twitter.
Our list of considered features and their varied interactions keeps growing, informing our models of ever more nuanced behaviour patterns."
A squirrel dying in front of your house may be more relevant to your interests right now than people dying in Africa.
But we do not need relevance.
What we need is a system that lets us choose freely among the different emotional views on a particular issue, pure and unfiltered.
This can help us get closer to the truth and the facts. It can even help in understanding bias in the news.
#who
The Stakeholders
The stakeholders for this solution, if implemented, will include:
- Twitter Users who need a global outlook on any particular crisis.
- Citizen Journalists who need to know the different sides of the same story.
- Businesses that need to track the reputation of their online presence on Twitter while understanding their customers' mindset.
Citizen Journalism
This application can play an important role in helping citizen journalists disseminate the actual facts and discern the truth among biased opinions.
Businesses
Small businesses with an online presence often struggle to maintain their fanbase because they cannot gauge the preferences of the majority of their users.
In such cases, this application would be a boon.
#how
Step 1
Search for the tag to filter
The Twitter API's data for a post provides its hashtags, the number of retweets and favourites, and even the users mentioned in it.
This search, if automated, can select a tag from the list of trending tags showcased by Twitter.
Step 2.1
Measure the closeness between tags
Select all the posts that have the requested tag. Based on the tags that are commonly used after this tag, build the closeness graph of all the related tags.
This will help in identifying tweets from different users that might share some context.
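The closeness measure between tags could be sketched as below. This is only a minimal illustration, not the proposal's actual method: it assumes each post is already reduced to its list of hashtags, and it assumes closeness is the fraction of posts carrying the requested tag that also carry the other tag. The function name `tag_closeness` and the sample data are hypothetical.

```python
from collections import Counter


def tag_closeness(posts, target):
    """Return {tag: closeness} for tags that co-occur with `target`.

    Assumption: closeness is the share of posts containing `target`
    that also contain the other tag (a simple co-occurrence ratio).
    """
    target = target.lower()
    counts = Counter()
    total = 0
    for tags in posts:
        tagset = {t.lower() for t in tags}
        if target not in tagset:
            continue  # only posts with the requested tag are considered
        total += 1
        counts.update(tagset - {target})
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


# Hypothetical sample: each inner list holds the hashtags of one post.
posts = [
    ["climate", "policy"],
    ["climate", "energy", "policy"],
    ["climate"],
    ["sports"],
]
closeness = tag_closeness(posts, "climate")
```

The resulting dictionary can be read as the weighted edges from the requested tag to its neighbours in the closeness graph.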
Step 2.2
Possible Methods to use for Network Analysis
For building the closeness measure, different network analysis methods can be used. They are:
- Idea Adoption
- Information Reliability
- Knowledge Acquisition
The idea adoption intensity is a measure that quantifies the influence between two users in a network. Mathematically, it can be written as
Using this, we could convert the problem into a Maximum Likelihood estimation problem, where the influence can also be tag-modulated.
Step 3
Perform Sentiment Analysis
Over this cluster of tags, compute the sentiment of each post's text and take a closeness-weighted average to get the mean neutrality for the tag cluster.
Hence, the closer a tag is, the greater the effect of the sentiments derived from posts under that tag.
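The closeness-weighted reduction could look like the sketch below. It assumes a per-tag sentiment score in [-1, 1] has already been produced by some sentiment model (not specified in this proposal) and that a closeness weight per tag is available. The names `cluster_sentiment`, `tag_sentiment` and `closeness`, and all the numbers, are hypothetical.

```python
def cluster_sentiment(tag_sentiment, closeness):
    """Closeness-weighted mean of per-tag sentiment scores.

    Assumption: scores lie in [-1, 1], with 0 meaning neutral.
    Tags without a closeness weight contribute nothing.
    """
    weight_sum = sum(closeness.get(tag, 0.0) for tag in tag_sentiment)
    if weight_sum == 0:
        return 0.0  # no usable weights: treat the cluster as neutral
    weighted = sum(
        score * closeness.get(tag, 0.0)
        for tag, score in tag_sentiment.items()
    )
    return weighted / weight_sum


# Hypothetical values: the requested tag itself gets closeness 1.0.
tag_sentiment = {"climate": 0.2, "policy": -0.4, "energy": 0.6}
closeness = {"climate": 1.0, "policy": 0.5, "energy": 0.25}
score = cluster_sentiment(tag_sentiment, closeness)
```

Tags that are closer to the requested tag pull the cluster score more strongly, which matches the intent described above.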
Step 4
Weighted Sentiments for Tags
Perform the following computation for all the tweets from all the users, filtered by their tag in the tag cluster.
Obtain the weighted sentiment for the tag cluster and sort the tweets according to your requirements. Sorting by closeness to neutral sentiment should surface the least biased results.
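Sorting by neutrality could be sketched as follows. It assumes each tweet already carries a weighted sentiment score in [-1, 1] where 0 is neutral, so the least biased tweets are those with the smallest absolute score. The field name `weighted_sentiment` and the sample tweets are hypothetical.

```python
def rank_by_neutrality(tweets):
    """Return tweets ordered from most neutral to most polarised.

    Assumption: `weighted_sentiment` is in [-1, 1] and 0 means neutral.
    """
    return sorted(tweets, key=lambda t: abs(t["weighted_sentiment"]))


# Hypothetical sample data.
tweets = [
    {"text": "A", "weighted_sentiment": 0.8},
    {"text": "B", "weighted_sentiment": -0.05},
    {"text": "C", "weighted_sentiment": -0.6},
]
ranked = rank_by_neutrality(tweets)
ranked_texts = [t["text"] for t in ranked]
```

Other orderings (for example, most positive first) only need a different sort key.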
There might be many crucial details that have not been taken into consideration. The approach chosen for the model might also not be sufficient.
Hence, I look forward to your valuable suggestions and inputs on this project, which I wish to pursue.
#future
The Future and Beyond
The possible enhancements and features that can be incorporated into the proposed model in the future are:
Different new models and their variants can be explored
- Knowledge Based Learning Models
- Bayesian Inference Models
- In the future, this could even be served as a cloud-based API service.
#beyondTheBubble
A project proposal by
Arvind Srinivasan
Presented to
Prof. Tan Wee Kek
Research Project