Exploratory Data Analysis with R

Maria Schreiber

Grace Hopper Program

April 20, 2017

 

What do you think of when you hear 'big data'?

'Big Data'?

We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.

- Roy Amara

Gartner Hype Cycle

How is 'big data' analyzed?

  • Formulating questions/hypotheses
  • Data collection/identification
  • Data wrangling/munging
  • Exploratory data analysis
  • Formal modeling/prediction/analysis
  • Drawing and communicating conclusions

Exploratory Data Analysis

EDA happens between you and the data and isn't about proving anything to anyone else yet.

- Cathy O'Neil

Here's where R comes in!

Wait, what is R?

  • Open source statistical programming language
  • Initially released in 1995 and still gaining in popularity
  • Interpreted language - can be used in the CLI, but there are also many nice GUIs
  • Enhanced by user-written packages
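
As a minimal sketch of the points above, the lines below can be typed one at a time at the R prompt. The vector `x` and the variable names are made up for illustration, and the commented-out `install.packages` lines assume an internet connection and a CRAN mirror.

```r
# Expressions are evaluated immediately, one at a time, as in any interpreter.
x <- c(2, 4, 6, 8)        # illustrative data, not from the talk
mean_x <- mean(x)
print(mean_x)

# Packages extend the language. "stats" ships with R, so no install is needed.
library(stats)
sd_x <- sd(x)
print(sd_x)

# A user-written package from CRAN is installed once and then loaded:
# install.packages("ggplot2")
# library(ggplot2)
```

The same lines can also be saved in a script file and run from a GUI or the command line.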

Why R?

And are there other options?

  • Yes, it's another programming language, but it's easy to learn and use (at least on a basic level).
  • It provides both statistical summaries and visualizations in one handy package!
  • But there are also many other options for visualizations.
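
To make the "summaries and visualizations in one package" point concrete, here is a small sketch using only base R. It assumes the `mtcars` example dataset that is bundled with R. The choice of columns is purely illustrative and is not the dataset used in the live demo.

```r
# mtcars is a small example dataset bundled with R (datasets package).
data(mtcars)

# Statistical overview: min, quartiles, median, mean, and max of miles per gallon.
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)

# Group comparison: mean mpg by number of cylinders.
mpg_by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)
print(mpg_by_cyl)

# Visualizations: distribution of mpg, then mpg against weight.
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")
plot(mtcars$wt, mtcars$mpg, xlab = "Weight (1000 lbs)", ylab = "mpg")
```

A numeric summary and two plots take only a few lines, which is much of R's appeal for a first look at a dataset.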

 

But we're here for R!

Live code time...

SHOW AT LEAST ONE MORE VIZ HERE.

Please.

Resources

Udacity EDA course/Data visualization course

R-bloggers - https://www.r-bloggers.com/

R Cookbook - http://www.cookbook-r.com/

ggplot2 - http://ggplot2.tidyverse.org/index.html

NYC Open Data - https://opendata.cityofnewyork.us/

 

CHECK OUT MY UPCOMING BLOG POST!

Thank you!

EDA with R

By mschreiber
