(1) Join this Slack:

https://ledeprogram.slack.com/

Join and star the following channels:

#data-ms-2023, #reporting-ii-2023

 

(2) Fill out the Polly survey

In the #reporting-ii-2023 Slack channel

 

If you have any questions, just raise your hand 🖐!

Welcome! Let's get rolling!

 

 

Hello, my name is...

 

Dhrumil Mehta (he/him)

Associate Prof. of Journalism @ Columbia U.

Deputy Director of Tow Center

Visiting Associate Prof of Public Policy @ Harvard Kennedy School

 

dhrumil.mehta@columbia.edu

 @datadhrumil

@dmil

 

 

You will meet Prof. Denise Ajiri next week!

 

 

 

Today

- Meet Dhrumil

- Meet each other! (Introductions & Survey Responses)

- Syllabus Overview

 

- Data Journalism: Possibilities And Limitations

- Intro to Descriptive Statistics

 

- Homework Overview

 

Highlights:

Currently

  • Associate Prof. @ Columbia Graduate School of Journalism

  • Visiting Associate Prof. @ Harvard Kennedy School

 

Previously

  • Database Journalist, Politics @ FiveThirtyEight

  • Software Development Engineer @ Amazon
  • Northwestern:
    • BA in Philosophy + Minor in Cognitive Science
    • MS in Computer Science

Database Journalist, Politics

Data-Driven Storytelling

 

 

 

Data Scraping / Cleaning

Bots

Internal Workflows

 

Bots

Lets readers see results that FiveThirtyEight deems unexpected

Expectations are calibrated before results ever start coming in.

Open Data

https://www.datajournalismawards.org/project-listing/?project_id=2082

Quantitative Editing

Research

Computationally analyzing text to better understand media and political environments.

I have a research interest in text analysis

Let me tell you what kind of editor I am...

 

- what I'm good at

- what I'm working to improve

That's me!
But who are you?

Survey Responses

 

Live Coding

(unless I get nervous or we are short on time)

https://github.com/dmil/reporting-ii/blob/main/pre-class-survey/survey-responses.ipynb
 

Pay special attention to:

- [ ] The questions I ask of the dataset

- [ ] What I do when I don't know some code or forget how to do something

- [ ] What statistical or visual treatments I chose to apply and why

Now your turn!!!

Take a second to write down (digitally) a bit about who you are! (Bullet points are fine; they don't have to be legible to anyone but yourself.)

  • What motivates you to study data journalism?
     
  • What are your journalistic interests as we start thinking about forming project groups...
    • topics you'd like to work on
    • skills you bring to a group project
    • skills you'd like to build / pick up from your group-mates

But there's a catch!

You have 5 minutes to write 5 questions to get to know one other person in the room, whom you will then introduce to the class.

 

 

You will be split into:

1) Survey Makers

2) Interview Takers

 

Survey Makers (5 question survey)

Multiple Choice

Which of the following reporting topics interests you most? (1) Healthcare (2) Education (3) Agriculture
 

Scale

On a scale of 1-5, with 5 being most sure, how sure are you about what your thesis project will be about?

Yes/No

Are you interested in New York City issues?

1-phrase Answer

Where did you grow up?

 

 

 

Interview Takers

 

5 open-ended interview questions

Step 1: Send your questions to your partner via Slack

 

 

 

 

 

 

 

Step 2: Answer the questions that were sent to you via Slack

Tell us a story about your partner:

  • What motivates them to study data journalism...
     
  • What are their journalistic interests as we start thinking about forming project groups... beats they'd like to work on, skills they'd like to learn, etc.

What have we learned?

Intro to Stats for Journalists

Summary Stats

Mean

 

 

 

Ben Orlin — Math with bad drawings
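A minimal sketch of the mean in Python, using the standard library's `statistics` module. The salary figures are made up for illustration; they show how a single extreme value drags the mean away from what most people in the data look like.

```python
from statistics import mean

# Hypothetical annual salaries, in thousands of dollars (illustrative only).
salaries = [40, 45, 50, 55, 60]
typical_mean = mean(salaries)

# Add one very high earner and recompute the mean.
salaries_with_outlier = salaries + [1000]
skewed_mean = mean(salaries_with_outlier)

print("Mean without outlier:", typical_mean)
print("Mean with outlier:", round(skewed_mean, 2))
```

With the outlier included, the mean is higher than every salary except the outlier itself, which is a good reason to ask what a reported "average" is hiding.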


Weighted Average

 

 


Example
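One way to sketch a weighted average in Python: each value counts in proportion to its weight. The polls and sample sizes below are hypothetical, and weighting by sample size is only one possible choice of weight.

```python
# Hypothetical polls: (share supporting a candidate, number of respondents).
polls = [(0.52, 1000), (0.48, 500), (0.40, 100)]

# Simple average: every poll counts the same, regardless of size.
simple_average = sum(share for share, _ in polls) / len(polls)

# Weighted average: each poll counts in proportion to its sample size.
total_respondents = sum(n for _, n in polls)
weighted_average = sum(share * n for share, n in polls) / total_respondents

print("Simple average:", round(simple_average, 4))
print("Weighted average:", round(weighted_average, 4))
```

The small 100-person poll pulls the simple average down, but it barely moves the weighted one.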

Median

Ben Orlin — Math with bad drawings
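A short sketch of the median with the standard library, using made-up salary figures. It compares how the median and the mean react to one extreme value.

```python
from statistics import mean, median

# Hypothetical annual salaries, in thousands of dollars (illustrative only).
salaries = [40, 45, 50, 55, 60]
salaries_with_outlier = salaries + [1000]

# The median is the middle value once the data is sorted
# (or the average of the two middle values when the count is even).
median_before = median(salaries)
median_after = median(salaries_with_outlier)

# The mean, by contrast, moves a lot.
mean_after = mean(salaries_with_outlier)

print("Median without outlier:", median_before)
print("Median with outlier:", median_after)
print("Mean with outlier:", round(mean_after, 2))
```

The median barely shifts when the outlier is added, which is why it is often the better "typical value" for skewed data such as income.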


Mode

Examples

 

 

 

Ben Orlin — Math with bad drawings
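The mode is the most frequent value, which makes it the one summary statistic here that works on categories as well as numbers. A sketch with hypothetical survey answers, using `collections.Counter` and `statistics.multimode` from the standard library:

```python
from collections import Counter
from statistics import multimode

# Hypothetical answers to "Which reporting topic interests you most?"
answers = [
    "Healthcare", "Education", "Healthcare",
    "Agriculture", "Education", "Healthcare",
]

# Counter tallies how often each answer appears.
counts = Counter(answers)

# multimode returns every value tied for most frequent, as a list.
most_common = multimode(answers)

# When two answers tie, both are returned (in order of first appearance).
tied = multimode(["Yes", "No", "Yes", "No"])

print(counts)
print(most_common)
print(tied)
```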


Range

Ben Orlin — Math with bad drawings
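The range is simply the largest value minus the smallest. A tiny sketch with made-up ages:

```python
# Hypothetical ages of people in a room (illustrative only).
ages = [22, 25, 24, 31, 58]

youngest = min(ages)
oldest = max(ages)
data_range = oldest - youngest

print("Youngest:", youngest, "Oldest:", oldest, "Range:", data_range)
```

Because it depends only on the two most extreme values, a single unusual data point can change the range dramatically.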


Correlation

https://www.investopedia.com/terms/n/negative-correlation.asp

Correlation

Pearson's Correlation Coefficient
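Pearson's r runs from -1 (perfect negative linear relationship) through 0 (no linear relationship) to +1 (perfect positive linear relationship). Below is a hand-rolled sketch of the calculation; `pearson_r` is a helper written for this example and the data is made up. In practice a library routine (for example `statistics.correlation` in Python 3.10+) does the same job.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r: how closely paired values follow a straight line."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each point from its own mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Co-movement of x and y, scaled by how much each varies on its own.
    co_movement = sum(a * b for a, b in zip(dx, dy))
    spread = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if spread == 0:
        raise ValueError("r is undefined when one variable never changes")
    return co_movement / spread


# Hypothetical data: y rises in lockstep with x, then falls in lockstep.
x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 6, 8, 10]
y_falling = [10, 8, 6, 4, 2]

r_positive = pearson_r(x, y_rising)
r_negative = pearson_r(x, y_falling)

print(r_positive, r_negative)
```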

Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'.

https://m.xkcd.com/552/

Ben Orlin — Math with bad drawings


Correlation & Causation

 

Standard Deviation

 

 

 

 

 

 

Variance
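Variance is the average squared distance from the mean, and standard deviation is its square root (which puts it back in the original units). A sketch with two made-up datasets that share a mean but differ in spread. It uses the population versions, `pvariance` and `pstdev`; the sample versions, `variance` and `stdev`, divide by n - 1 instead of n.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance

# Two hypothetical datasets with the same mean but very different spread.
tight = [48, 49, 50, 51, 52]
spread_out = [10, 30, 50, 70, 90]

tight_mean = mean(tight)
spread_out_mean = mean(spread_out)

# Population variance: mean of squared deviations from the mean.
tight_variance = pvariance(tight)
spread_out_variance = pvariance(spread_out)

# Standard deviation: square root of the variance.
tight_sd = pstdev(tight)
spread_out_sd = pstdev(spread_out)

print("Means:", tight_mean, spread_out_mean)
print("Variances:", tight_variance, spread_out_variance)
print("Standard deviations:", round(tight_sd, 2), round(spread_out_sd, 2))
```

The means are identical, so only the variance and standard deviation reveal that the second dataset is far more scattered.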

 

 

 

 

 

 

 

Examples

 

 

  

Examples

 

 

  

Ben Orlin — Math with bad drawings


Mystery Data

https://docs.google.com/spreadsheets/d/1ObVYCOeTgGK_n9rhVFG05I-VIjzriO-F6mCDIpdd_lo/edit#gid=0

 

What can you tell about these 4 mystery datasets with summary statistics?

 

Calculate:

- Mean

- Median

- Mode

- Correlation

- Variance

 

Anscombe's Quartet

Datasaurus Dozen
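Anscombe's quartet is the classic demonstration of why summary statistics alone can mislead: four datasets with nearly identical means and correlations that look completely different when plotted. A sketch using the published quartet values (worth checking against a trusted copy before reuse); `pearson_r` is a helper written for this example, not a library function.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of numbers."""
    mean_x, mean_y = mean(xs), mean(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_movement = sum(a * b for a, b in zip(dx, dy))
    return co_movement / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Datasets I-III share the same x values; dataset IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Summary statistics per dataset: (mean of x, mean of y, Pearson's r).
summary = {
    name: (mean(xs), mean(ys), pearson_r(xs, ys))
    for name, (xs, ys) in quartet.items()
}

for name, (mx, my, r) in summary.items():
    print(f"{name}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

All four rows of output come out almost the same, so the differences only show up once the points are plotted.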

Distributions

https://fivethirtyeight.com/features/al-gores-new-movie-exposes-the-big-flaw-in-online-movie-ratings/

Normal Distribution
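For a normal distribution, roughly 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean. A quick check with the standard library's `statistics.NormalDist` (Python 3.8+); `share_within` is a helper name made up for this sketch.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of values within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


within_1 = share_within(1)
within_2 = share_within(2)
within_3 = share_within(3)

for k, share in [(1, within_1), (2, within_2), (3, within_3)]:
    print(f"Within {k} standard deviation(s): {share:.1%}")
```

These shares only hold when the data is actually close to normal, which is one reason to look at the distribution before quoting a standard deviation.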

Pay Attention To Distribution Of Data

Plotting

Exploratory Data Visualization

Pearson correlation is ????

Pearson correlation is 0.9909

because there are 40 duplicate data points in the top-right and bottom-left corners
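A sketch of how duplicate points in opposite corners can manufacture a high correlation. The data here is invented for illustration and is not the dataset from the slide, so the resulting number differs from 0.9909; `pearson_r` is a helper written for this example.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of numbers."""
    mean_x, mean_y = mean(xs), mean(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_movement = sum(a * b for a, b in zip(dx, dy))
    return co_movement / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# A 3x3 grid of points around the origin: no linear relationship at all.
cloud = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
r_cloud = pearson_r([x for x, _ in cloud], [y for _, y in cloud])

# Add 20 copies of one far top-right point and 20 of one far bottom-left point.
duplicates = [(10, 10)] * 20 + [(-10, -10)] * 20
combined = cloud + duplicates
r_combined = pearson_r([x for x, _ in combined], [y for _, y in combined])

print("r for the cloud alone:", r_cloud)
print("r with 40 duplicated corner points:", round(r_combined, 4))
```

Only two distinct locations were added, but because each is repeated 20 times they dominate the calculation. A scatter plot (or a check for duplicate rows) would reveal this immediately.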

Editorial Choices

 

Homework Review

Does your pitch start with a:

Dataset?

I have a dataset I'm interested in

What questions will I ask?

or a

Question?

I have a journalistic question that I'm interested in trying to answer

Where will I get the data?

Types of Data Stories


Counting Stuff

Answer a question with data

Support/Oppose a hypothesis

Identify a Phenomenon


Debunk or Justify Conventional Wisdom

Data-Driven Profile

Lack of Data

Data-driven investigative work

Dig for Data

Provide relevant context

Build our own dataset

  • With Code / Scrapers
  • By Hand
  • By Survey Tool

Archiving Data

Explain Calculations

Use Innovative Methodology

Use data to inform traditional reporting

The Rare Datapoint

Challenging Official Data

Huge Data Dump

  • Uber
  • Election Results
  • Census

Reporting II 2022 - Day 1

By Dhrumil Mehta


Saying hello to students
