The Data Journalism Taxonomy

Sarah Cohen
February 2015 / Columbia University










Statistics are people with the tears washed off
- Paul Brodeur

He uses statistics as a drunken man uses lamp-posts -- for support rather than illumination
- origin unknown, widely attributed to Andrew Lang, b. 1844

"An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem."
- famous statistician John Tukey

Data in service of story

Timing: Part of the reporting process. 
Qualities: Original research never before attempted or achieved, usually using public records

Skills / schools

  • Reporting to find and understand sources (journalism)
  • News judgment (journalism)
  • Data / document acquisition and cleaning (anywhere: coding expected)
  • Qualitative and quantitative analysis (social sciences and statistics; more recently data science; law)
  • Exploratory visualizations (data science, statistics, design)
  • Reporting out the data and anecdotes (journalism) 
  • Storifying and writing about the data / documents (journalism)

Journalism about data








Timing: Anytime data becomes available; usually not original reporting but based on others' original research
Qualities: Insightful comment and analysis of others' work. Engaging and informal writing. Little or no programming or original reporting.

Skills / schools



  • Reporting to find the data* (journalism)
  • News judgment (journalism)
  • Visualization for exploration and publication (coding or tool use; design)
  • Writing fast (journalism)
  • Reporting to explain the data (journalism, also sometimes to storify)
  • Domain knowledge (social sciences, medical school; e.g., a science writer)



 * In many instances, not original research

Data as journalism

(News Applications and Interactive Graphics)







Timing: After a story is reported and proven, used primarily in presentation and distribution.
Qualities: Presentation-quality design; optimized for mobile and other platforms; capable of high traffic with performance.

Skills / schools


  • In-depth programming ability (computer science or a dev shop)
  • Scale and stability (computer science)
  • News judgment (journalism)
  • Acquiring large-scale, streaming data (?) 
  • Design (journalism, design, geography, art, HCI, computer science)

Research in journalism


  • Applied statistics in news
  • Document analysis 
  • Natural language processing
  • Machine learning for data and documents
