Mapping Disasters in South-East Asia Using Twitter
DEVANSHI VERMA
ABOUT ME
- Graduated from NSIT, Delhi University, Class of 2018, with a B.E. in Instrumentation and Control Engineering.
- Published a research paper in an international journal on detecting rumors on Twitter using supervised machine learning.
- Interned at the Geoinformatics Center in Thailand, working on geospatial data science.
- Currently working as an analyst at EXL Analytics.
AGENDA
- Why Twitter Data?
- How to get the Data?
- Problem and Solution
- Framework
- Python Libraries and Code
- Output
- Testing
- Questions?
WHY TWITTER DATA?
- Twitter is a top source of breaking news: it averaged 335 million users in the first quarter of 2018, with around 6,000 tweets posted per second on average.
- It provides fast, real-time information about a large-scale disaster, so a map can be produced within about a minute of messages being posted.
HOW TO GET TWITTER DATA?
TWITTER API
- An API is a way to request and deliver information.
- Twitter APIs that return Tweets encode the data as JSON, which is based on key-value pairs with named attributes and associated values.
- Two types of API:
  - Search API
  - Stream API
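As a minimal illustration of the key-value structure, the sketch below parses a hand-written JSON string shaped like a tweet payload. The values are invented for the example, and only a few attribute names (`created_at`, `text`, `user`, `screen_name`) are shown; a real payload carries many more.

```python
import json

# Hand-written sample shaped like a tweet payload (all values are made up).
raw = '''
{
  "created_at": "Mon Jul 09 10:15:00 +0000 2018",
  "text": "Heavy #Flood reported near Hiroshima",
  "user": {"screen_name": "example_user"}
}
'''

tweet = json.loads(raw)               # JSON object -> Python dict
print(tweet["text"])                  # top-level attribute
print(tweet["user"]["screen_name"])   # attribute of a nested object
```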
PROBLEM?
- Each tweet comes with four types of data dictionaries: Tweet object, User object, Twitter entities and extended entities.
- The Tweet object dictionary holds the coordinates in GeoJSON format.
- PROBLEM: A user has to enable precise location to add this information. This feature is OFF by default!
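The problem can be seen in a small sketch that reads the `coordinates` attribute from a tweet dictionary. The helper name `tweet_latlng` and both sample tweets are hypothetical; the one firm detail is that a GeoJSON point stores longitude first, then latitude.

```python
def tweet_latlng(tweet):
    """Return (lat, lng) from a tweet dict, or None when no exact point is attached."""
    coords = tweet.get("coordinates")
    if not coords or coords.get("type") != "Point":
        return None
    lng, lat = coords["coordinates"]  # GeoJSON order is [longitude, latitude]
    return lat, lng


# Made-up examples: one tweet with precise location enabled, one without.
geotagged = {"text": "...", "coordinates": {"type": "Point", "coordinates": [132.4553, 34.3853]}}
plain = {"text": "...", "coordinates": None}

print(tweet_latlng(geotagged))  # (34.3853, 132.4553)
print(tweet_latlng(plain))      # None
```

Because most tweets look like `plain`, the location has to be recovered from the tweet text instead.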
SOLUTION
NAMED ENTITY RECOGNITION
FRAMEWORK
FETCH TWEETS
import pandas as pd
import tweepy

# Authorised access with the API (supply your own credentials)
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
# wait_on_rate_limit makes tweepy pause until the rate limit resets
api = tweepy.API(auth, wait_on_rate_limit=True)

# Extracting the tweets (api.search is named api.search_tweets in tweepy 4+;
# the standard Search API returns at most 100 tweets per request)
keyword = '#Flood'
public_tweets = api.search(q=keyword, lang='en', count=100)

# Filtering out the tweets for Asian countries
asian_countries = ['cambodia', 'india', 'indonesia', 'malaysia', 'nepal', 'philippines', 'singapore', 'srilanka',
                   'thailand', 'vietnam', 'myanmar', 'bangladesh', 'japan', 'china', 'bhutan', 'korea', 'australia',
                   'taiwan', 'kazakhstan', 'pakistan', 'cook islands', 'fiji', 'vanuatu', 'kiribati', 'micronesia',
                   'nauru', 'niue', 'samoa', 'solomon', ' tonga', 'tuvalu', 'andaman', 'nicobar', 'papua']
listoftweets = []
time_stamp = []
for tweet in public_tweets:
    text = tweet.text.lower()
    # Keep each tweet once, even if it names several countries
    if any(country in text for country in asian_countries):
        listoftweets.append(tweet.text)
        time_stamp.append(tweet.created_at)

# Storing the data in a dataframe
df = pd.DataFrame({'Text': listoftweets, 'Time_stamp': time_stamp})
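One caveat with plain substring matching is that a country name can match inside a longer word ("india" inside "Indiana"). A possible refinement, sketched below with the hypothetical helper `build_matcher`, compiles the names into a single whole-word regular expression. This is an alternative to the filter above, not the approach used in the talk.

```python
import re


def build_matcher(names):
    """Compile one regex that matches any of the names as a whole word, ignoring case."""
    alternatives = "|".join(re.escape(name.strip()) for name in names)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


matcher = build_matcher(["india", "japan", "nepal"])

# Made-up tweet texts for illustration.
tweets = ["Flood in Indiana today", "Severe #Flood in Japan and India"]
kept = [t for t in tweets if matcher.search(t)]
print(kept)  # ['Severe #Flood in Japan and India']
```

A single `search` per tweet also means a tweet naming several countries is kept only once.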
EXTRACT GPE, LOC AND FAC
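The code for this step is not reproduced in the text, so the following is only a sketch of how the place names might be collected. It assumes a named-entity tagger that yields `(text, label)` pairs and keeps the entities labelled GPE (countries, cities, states), LOC (other locations) or FAC (facilities), storing them in a dictionary from place name to tweet timestamp, which is the shape the geocoding step expects for `dict1`. The helper names `places_by_time`, `spacy_entities` and `stub_entities` are hypothetical, and the use of spaCy is an assumption.

```python
PLACE_LABELS = {"GPE", "LOC", "FAC"}


def places_by_time(texts, time_stamps, find_entities):
    """Map each place-like entity to the timestamp of the first tweet mentioning it."""
    dict1 = {}
    for text, stamp in zip(texts, time_stamps):
        for name, label in find_entities(text):
            if label in PLACE_LABELS:
                dict1.setdefault(name, stamp)  # keep the first timestamp seen
    return dict1


def spacy_entities(nlp):
    """Adapt a loaded spaCy pipeline to the find_entities interface (assumes spaCy)."""
    def find(text):
        return [(ent.text, ent.label_) for ent in nlp(text).ents]
    return find


def stub_entities(text):
    """Hypothetical stand-in for a real NER model, used only for this demo."""
    known = {"Hiroshima": "GPE", "Kamo River": "LOC", "Monday": "DATE"}
    return [(name, label) for name, label in known.items() if name in text]


dict1 = places_by_time(
    ["Kamo River overflows in Hiroshima on Monday"], ["2018-07-09"], stub_entities
)
print(dict1)  # {'Hiroshima': '2018-07-09', 'Kamo River': '2018-07-09'}
```

With spaCy installed and an English model downloaded, the stub could be swapped for `spacy_entities(spacy.load("en_core_web_sm"))`, with `df['Text']` and `df['Time_stamp']` as inputs.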
USE GEOCODER
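import geocoder

# Extracting all co-ordinates
lat = []
long = []
time_stamp = []
names = []
# dict1 maps each extracted place name to the timestamp of its tweet
for name, stamp in dict1.items():
    g = geocoder.google(name)
    # Skip names that could not be geocoded, and results with the
    # specific latitude 35.86166 that the original code filters out
    if g.latlng is not None and g.latlng[0] != 35.86166:
        lat.append(g.latlng[0])
        long.append(g.latlng[1])
        time_stamp.append(stamp)
        names.append(name)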
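The same step can be sketched with the geocoder passed in as a function, which keeps each place's name, coordinates and timestamp together in one row and lets the logic be tried without network access. The helper `geocode_places` and the lookup table standing in for a geocoding service are hypothetical.

```python
def geocode_places(dict1, geocode):
    """Return (name, lat, lng, time_stamp) rows for every name the geocoder resolves.

    `geocode` is any callable that takes a place name and returns
    [lat, lng], or None when the name cannot be resolved.
    """
    rows = []
    for name, stamp in dict1.items():
        latlng = geocode(name)
        if latlng is None:
            continue  # unresolved names are dropped
        rows.append((name, latlng[0], latlng[1], stamp))
    return rows


# Stand-in for a geocoding service: a small made-up lookup table.
fake_lookup = {"Hiroshima": [34.3853, 132.4553]}
rows = geocode_places(
    {"Hiroshima": "2018-07-09", "Nowhereville": "2018-07-10"}, fake_lookup.get
)
print(rows)  # [('Hiroshima', 34.3853, 132.4553, '2018-07-09')]
```

With the geocoder library, the callable could be `lambda name: geocoder.google(name).latlng`.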
PLOT ON MAP
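import folium
from folium.plugins import MarkerCluster

# Base map centred on [11.88, 124]
t = folium.Map(location=[11.88, 124], zoom_start=4)
marker_cluster = MarkerCluster().add_to(t)
# One clustered marker per geocoded place
for i in range(len(lat)):
    folium.Marker(
        [lat[i], long[i]],
        popup='<b>Flood : %s<br> Created on: %s</b>' % (names[i], time_stamp[i]),
        icon=folium.Icon(color='blue', icon='info-sign'),
    ).add_to(marker_cluster)
t.add_child(folium.LatLngPopup())  # clicking the map shows the lat/lng of that point
# 'Mapbox Control Room' was a built-in tile set in older folium releases;
# newer releases may need a different tile set here
folium.TileLayer('Mapbox Control Room').add_to(t)
folium.LayerControl().add_to(t)
t.save('Final_Time_Map.html')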
OUTPUT
TESTING
FLOODS
No | Disaster | Location | Status | Source |
---|---|---|---|---|
1 | Flood | Japan - Kamo River, Hiroshima, Kyoto, Fukuoka, Okayama, Moyotama, Mabi town, Kurashiki, Nagasaki, Kyushu island | Detected | Floodlist |
2 | Flood | Pakistan - Lahore | Detected | Floodlist |
3 | Flood | Nepal | Not Detected | Floodlist |
4 | Flood | India - Jammu and Kashmir, Karimganj, Srinagar | Detected | Floodlist |
TESTING
EARTHQUAKES
No | Disaster | Location | Status | Source |
---|---|---|---|---|
1 | Earthquake | Japan - Chiba, Tokyo, Fukushima | Detected | USGS |
2 | Earthquake | Indonesia - Sumatra | Detected | USGS |
3 | Earthquake | Japan - Osaka | False Detected | ---------- |
4 | Earthquake | India - Rajasthan | Detected | USGS |
5 | Earthquake | Australia - Adelaide SA | Detected | USGS |
6 | Earthquake | Taiwan - Taitung County | Detected | USGS |
TESTING
LANDSLIDES
No | Disaster | Location | Status | Source |
---|---|---|---|---|
1 | Landslide | India - Tamenglong district, Manipur, Jammu, Baltal Route, Jammu and Kashmir | Detected | |
2 | Landslide | Japan - Hiroshima, Kurashiki, Kyushu | Detected | |
3 | Landslide | China - Beichuan Qiang Autonomous County | Detected | |
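The three testing tables can be summarised with a short calculation. The sketch below counts each table row as one event and treats the "False Detected" earthquake row as a false positive; both are assumptions made for this summary, not figures reported in the talk.

```python
# (detected, not detected, false detections) counted from the Status columns
results = {
    "Flood": (3, 1, 0),       # Nepal was not detected
    "Earthquake": (5, 0, 1),  # Japan - Osaka was a false detection
    "Landslide": (3, 0, 0),
}

detected = sum(r[0] for r in results.values())
missed = sum(r[1] for r in results.values())
false_detections = sum(r[2] for r in results.values())

recall = detected / (detected + missed)                # share of real events found
precision = detected / (detected + false_detections)   # share of detections that were real
print(f"detected={detected} missed={missed} false={false_detections}")
print(f"recall={recall:.2f} precision={precision:.2f}")  # recall=0.92 precision=0.92
```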
QUESTIONS?
thisisdevanshi
PyData
By Devanshi Verma