Basic Data Tools for Journalism
what is data journalism?
=
charts, graphics, maps
process of obtaining, understanding, processing, analysing and presenting data
00100011100011011110001101100001100111100101001101001111001101010100001100111110010101110101010010111010101011110100001011010101110101010110111011101110101011011010101110110101011010110100000101010101011111101010101011011111010001010101101101101010101010101010101010011011011010101010111010101011100101010101111101001000101000101001001010101101010111010101000010101111010101011001011011010101010110111010101000101000001010101110101101011010101000010101010110111010110101010000010101010100100101010110111110101010100000101010101011010100000101010111110101001010101010101010100101011010101010101011010110111101001010101010101000001010100000000101010100110111010101101010000
user friendly data
xls
xlsx
csv
tsv
sps
json
xml
kml
geojson
shp
geotiff
pdf
jpeg
- search
- download
- conversion
- storage and preparation
- analysis
- presentation
some remarks
- What is useful, usable or free today may not be useful, usable or free tomorrow
- Criminal organizations and state agencies are constantly learning
- This is only a collection; other tools can be used for the same purposes
- There is no Swiss Army knife tool, and this is good!
- You often have to use a different tool for each small step
- Google is your friend, and you are not alone: there are hundreds of forums where experts discuss solutions to specific questions and problems
search for data
- Google advanced search, Google search operators (filetype: / ext: )
- DuckDuckGo
- Facebook Graph Search / Facebook ID
- Whois https://whois.domaintools.com/
- Databases
- Google Dataset Search
- OCCRP Aleph
- OCCRP ID
- Vesselfinder
- Flightradar24
- Radarbox
- Local databases (public procurement datasets, company registry database)
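The `filetype:` operator listed above is easiest to understand from a concrete query. The following is a minimal Python sketch that only builds a query string and the matching search URL; it does not run the search. The helper name `build_query`, the keywords and the domain `example.gov` are hypothetical, and the `site:` operator is an addition that is not mentioned in the list above.

```python
from urllib.parse import urlencode


def build_query(keywords, filetype=None, site=None):
    """Combine keywords with optional filetype: and site: operators."""
    parts = [keywords]
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


# Hypothetical example: procurement spreadsheets on an illustrative domain.
query = build_query("public procurement 2023", filetype="xlsx", site="example.gov")
url = "https://www.google.com/search?" + urlencode({"q": query})
print(query)
print(url)
```

Swapping `xlsx` for `csv` or `pdf` targets the other file formats listed earlier.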
processed data
| META | GLOBAL | SPECIAL | MARKET RESEARCH | NATIONAL |
|---|---|---|---|---|
| Google Data | UN | Global Terrorism DB | Kantar | Public Procurements |
| Our World in Data | World Bank | Freedom House | Gemius | Central Statistical Bureau |
| Kaggle | Eurostat | RSF | Ipsos | MET |
| European Data Portal | UNESCO | TI | TNS | Tax Bureau |
| Earth Engine Data Catalog | IMF | OCCRP Aleph | | Company Registry |
| Google Looker Studio | NASA | Facebook Ad Library | | Land ownership |
| | OECD | | | National funds |
manual data collection
“The government announced yesterday that it will build six new stadiums for the next football World Cup. The first, a 55,000-seat stadium, is to be built in Nowhereham at a cost of €300 million, and the government has contracted Trust Us Ltd. to build it. Construction of the stadium is expected next summer.”
manual data collection
who?
what?
what size?
when?
where?
how much?
with whom?
by when?
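The questions above can be read as the columns of a table. Here is a rough Python sketch of how the stadium announcement might be recorded as a single row. The class name `ProjectRecord` and all field names are hypothetical, and anything the text does not state exactly is left empty rather than guessed.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ProjectRecord:
    who: str                      # who commissions the project
    what: str                     # what is being built
    size_seats: Optional[int]     # what size
    where: Optional[str]          # where
    cost_eur: Optional[int]       # how much, in euros
    contractor: Optional[str]     # with whom
    start: Optional[str]          # when
    deadline: Optional[str]       # by when


record = ProjectRecord(
    who="government",
    what="stadium",
    size_seats=55_000,
    where="Nowhereham",
    cost_eur=300_000_000,
    contractor="Trust Us Ltd.",
    start=None,      # the text says only "next summer", with no exact date
    deadline=None,   # not stated in the text
)
print(asdict(record))
```

The empty fields are useful in themselves: they show which questions the announcement leaves unanswered.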
manual data collection
date | year, half-year (H), quarter (Q), month, day, hour, down to seconds |
geolocation | continent, country, region, county, city, lines, polygons, coordinates |
string | description - names, qualities, categories |
measure | whole, fraction, decimal |
boolean | true/false, yes/no |
consistency, accuracy, standards
exact location
wrap up
- Use a long-form data table instead of a wide-form one. It is much easier to add new data and entries, and all the data you need fits in one table.
- Make sure the data is clean and consistent. It only takes one misprint to get something wrong.
- Pay attention to language settings and spelling differences! Preferably use neutral or very common date, currency and number formats and use them consistently.
- Be aware that your database may be used by your colleagues or journalists from other countries. What is obvious to you may not be obvious to others.
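To illustrate the first point, the sketch below converts a small wide-form table (one column per year) into long form (one row per country and year). It uses only the Python standard library. The sample data, the helper name `wide_to_long` and the column names `year` and `value` are hypothetical.

```python
import csv
import io

# Hypothetical wide-form table: one column per year.
wide_csv = """country,2021,2022
A,10,12
B,7,9
"""


def wide_to_long(rows, id_field, var_name="year", value_name="value"):
    """Turn every non-id column into its own (id, variable, value) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue
            long_rows.append(
                {id_field: row[id_field], var_name: column, value_name: value}
            )
    return long_rows


rows = list(csv.DictReader(io.StringIO(wide_csv)))
long_rows = wide_to_long(rows, id_field="country")
for r in long_rows:
    print(r)
```

In the long form, a figure for a new year is added as new rows, so the column structure never changes.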
download
- Google Sheets IMPORTHTML function
- Google Scraper
- Import.io
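The `IMPORTHTML` function in Google Sheets pulls a table or list out of a web page. As a rough Python analogue of the same idea, the sketch below reads the rows of an HTML table with the standard library only. It parses an inline snippet and makes no network request; the class name `TableExtractor` and the sample HTML are hypothetical, and real pages are usually messier.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every <tr> into a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical page fragment.
sample_html = """
<table>
  <tr><th>City</th><th>Seats</th></tr>
  <tr><td>Nowhereham</td><td>55000</td></tr>
</table>
"""

parser = TableExtractor()
parser.feed(sample_html)
print(parser.rows)
```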
conversion
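As one example of a conversion step, the sketch below turns CSV text into JSON using only the Python standard library. The sample data is hypothetical, and casting `seats` to an integer is an assumption about what the column contains.

```python
import csv
import io
import json

# Hypothetical CSV input.
csv_text = "name,seats\nNowhereham Stadium,55000\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))
for row in rows:
    # CSV stores everything as text, so numeric columns are cast explicitly.
    row["seats"] = int(row["seats"])

json_text = json.dumps(rows, indent=2, ensure_ascii=False)
print(json_text)
```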
storage and preparation
- Excel
- Numbers
- OpenRefine
- Tableau Prep
analysis
- Excel
- Numbers
- MySQL
- Tableau Public
- Power BI
- IBM Watson
presentation
- Flourish, Datawrapper, RAWGraphs, PlotDB, Plotly, Highcharts
- Foursquare.io, QGIS, Mapbox / Google / Google Earth / Timelapse https://earthengine.google.com/timelapse/
- Aleph VIS, Gephi, Cytoscape, Cosmograph
- Chrono
- Knight Lab Tools https://knightlab.northwestern.edu/projects/
understanding data
- why is the data available?
- why is the data not available? missing data is also information!
- methodology
- meaning
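One way to treat missing data as information is to count the gaps in each column before any analysis. The following is a minimal Python sketch under stated assumptions: the sample rows are hypothetical, and an empty cell is taken to mean "missing".

```python
import csv
import io
from collections import Counter

# Hypothetical dataset with some empty cells.
csv_text = """city,contractor,cost_eur
Nowhereham,Trust Us Ltd.,300000000
Elsewhere,,
Thirdtown,Build It Ltd.,
"""

missing = Counter()
rows = list(csv.DictReader(io.StringIO(csv_text)))
for row in rows:
    for column, value in row.items():
        # Assumption: an empty or whitespace-only cell counts as missing.
        if value is None or value.strip() == "":
            missing[column] += 1

for column in rows[0]:
    print(f"{column}: {missing[column]} of {len(rows)} missing")
```

A column that is mostly empty is a question to ask the data owner, not only a cleaning problem.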
data
information
knowledge
insight
wisdom
conspiracy
inspiration
basicdatatools
By Attila Bátorfy