The Data Journalism Taxonomy
Sarah Cohen
February 2015 / Columbia University
Statistics are people with the tears washed off
- Paul Brodeur
He uses statistics as a drunken man uses lamp-posts -- for support rather than illumination
- origin unknown, widely attributed to Andrew Lang, b. 1844
"An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem."
- famous statistician John Tukey
Data in service of story
Timing: Part of the reporting process.
Qualities: Original research never before attempted or achieved, usually using public records
Skills / schools
- Reporting to find and understand sources (journalism)
- News judgment (journalism)
- Data / document acquisition and cleaning (anywhere; coding expected)
- Qualitative and quantitative analysis (social sciences and statistics; more recently data science; law)
- Exploratory visualizations (data science, statistics, design)
- Reporting out the data and anecdotes (journalism)
- Storifying and writing about the data / documents (journalism)
Journalism about data
Timing: Anytime data becomes available; usually not original reporting but based on others' original research
Qualities: Insightful comment and analysis of others' work. Engaging and informal writing. Little or no programming or original reporting.
Skills / schools
- Reporting to find the data* (journalism)
- News judgment (journalism)
- Visualization for exploration and publication (coding or tool use; design)
- Writing fast (journalism)
- Reporting to explain the data (journalism, also sometimes to storify)
- Domain knowledge (social sciences, medical school; e.g., a science writer)
* In many instances, not original research
Data as journalism
(News Applications and Interactive Graphics)
Timing: After a story is reported and proven, used primarily in presentation and distribution.
Qualities: Presentation-quality design; optimized for mobile and other platforms; able to handle high traffic without degrading performance.
Skills / schools
- In-depth programming ability (computer science or a dev shop)
- Scale and stability (computer science)
- News judgment (journalism)
- Acquiring large-scale, streaming data (?)
- Design (journalism, design, geography, art, HCI, computer science)
Research in journalism
- Applied statistics in news
- Document analysis
- Natural language processing
- Machine learning for data and documents