Introduction to Data Wrangling with Jupyter Notebooks

Upkar Lidder
http://upkarlidder.com/talks/

http://bit.ly/01-22-2020-ibm

IBM Developer

https://slides.com/upkar/jupyter

> ulidder@us.ibm.com
> @lidderupk

http://bit.ly/jupyter-pandas-git

WIFI - HD_Event

Password - 6j1n5q

Prerequisites

1. Create an IBM Cloud account using the URL below.

2. Check your email and activate your account. Once activated, log back into your IBM Cloud account using the same URL.

3. If you already have an account, use the same URL to sign in to your IBM Cloud account.

http://bit.ly/01-22-2020-ibm

Data Science Lifecycle

Workshop - Goals

Get acquainted with Pandas and Jupyter Notebooks on the cloud and analyze a movies dataset!

Steps

  1. Sign up / log into IBM Cloud - http://bit.ly/01-22-2020-ibm
  2. Create a Watson Studio service
  3. Create a new project
  4. Import the sample notebook to your project
  5. Run the cells and explore the data!

Step 1 - sign up / log into IBM Cloud

http://bit.ly/01-22-2020-ibm

Step 2 - locate Watson Studio in the catalog

Step 3 - create new Watson Studio service

Step 4 - launch Watson Studio

Step 5 - create new project and pick empty template

Step 6a - name your project and create storage service

Step 6b - add storage opens a new page

Step 6c - after the storage service is created, you are taken back to the first page

Step 7 - add notebook feature to your project

Step 8 - import notebook, get link from GitHub

https://raw.githubusercontent.com/lidderupk/lidderupk-ibmdevelopersf-jupyternotebooks/master/asset/spectra-pandas.ipynb
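The import dialog needs the raw file URL, not the link to the GitHub page that displays the notebook. As a minimal sketch, assuming the usual github.com/<user>/<repo>/blob/<branch>/<path> layout and a branch name without slashes, a small helper can rewrite a page link into the raw form shown above. The name to_raw_url is hypothetical and chosen for this example; it is not part of any library.

```python
from urllib.parse import urlsplit, urlunsplit


def to_raw_url(blob_url: str) -> str:
    """Turn a github.com '/blob/' page URL into its raw.githubusercontent.com form."""
    parts = urlsplit(blob_url)
    if parts.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {blob_url}")
    # Path is expected to look like /<user>/<repo>/blob/<branch>/<path to file>
    segments = parts.path.lstrip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub file page URL: {blob_url}")
    user, repo, _blob, *rest = segments
    # Drop the 'blob' segment and switch to the raw content host.
    raw_path = "/" + "/".join([user, repo, *rest])
    return urlunsplit(("https", "raw.githubusercontent.com", raw_path, "", ""))


# Page link for the workshop notebook.
page_url = (
    "https://github.com/lidderupk/lidderupk-ibmdevelopersf-jupyternotebooks"
    "/blob/master/asset/spectra-pandas.ipynb"
)
print(to_raw_url(page_url))
```

The printed link is the one to paste into the "From URL" style import field.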

Step 9 - Let's look at data now!

https://raw.githubusercontent.com/lidderupk/lidderupk-ibmdevelopersf-jupyternotebooks/master/asset/spectra-pandas.ipynb
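A first look at a dataset in a notebook cell usually follows the same few pandas calls. The sketch below uses a tiny hand-made table so it runs without any download. The column names (title, year) and the rows are illustrative assumptions, not the contents of the workshop files.

```python
import pandas as pd

# Hand-made stand-in for a movie titles table (hypothetical rows and columns).
titles = pd.DataFrame(
    {
        "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
        "year": [1994, 1999, 1994, 2008],
    }
)

# Size and first rows: the usual starting point.
print(titles.shape)
print(titles.head(2))

# Boolean filtering: keep only rows released before 2000.
before_2000 = titles[titles["year"] < 2000]
print(before_2000)

# Count titles per year, sorted by year rather than by count.
per_year = titles["year"].value_counts().sort_index()
print(per_year)
```

The same calls apply unchanged once the table comes from a real CSV file instead of an inline dictionary.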

Some links for the workshop

IBM Cloud account - http://bit.ly/01-22-2020-ibm

Jupyter Notebook - https://github.com/lidderupk/lidderupk-ibmdevelopersf-jupyternotebooks/blob/master/asset/spectra-pandas.ipynb

Datasets

Casts - https://ibm.box.com/shared/static/nwvh08c06wtk48rsunk05qi9cruzkx4l.csv

Release Dates - https://ibm.box.com/shared/static/bm5kazwww0deze2a9r05l3mcqz5fgyvt.csv

Titles - https://ibm.box.com/shared/static/bvzl6da3t5vdav9xev4ype6sh78jhq7b.csv

Film Locations - https://ibm.box.com/shared/static/1kqel8vcya6r9wgxj4bvcsblp5xmyqsj.csv
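pandas can read a CSV straight from a URL, so the files above can be loaded in a notebook cell without downloading them first. This is a minimal sketch, assuming the links still resolve to plain CSV files. DATASET_URLS and load_dataset are hypothetical names chosen for this example, and the download sits inside a function so nothing is fetched until it is called.

```python
import pandas as pd

# Dataset links from this slide, keyed by short names chosen for this sketch.
DATASET_URLS = {
    "cast": "https://ibm.box.com/shared/static/nwvh08c06wtk48rsunk05qi9cruzkx4l.csv",
    "release_dates": "https://ibm.box.com/shared/static/bm5kazwww0deze2a9r05l3mcqz5fgyvt.csv",
    "titles": "https://ibm.box.com/shared/static/bvzl6da3t5vdav9xev4ype6sh78jhq7b.csv",
    "film_locations": "https://ibm.box.com/shared/static/1kqel8vcya6r9wgxj4bvcsblp5xmyqsj.csv",
}


def load_dataset(name, nrows=None, reader=pd.read_csv):
    """Load one of the workshop CSVs into a DataFrame.

    nrows limits how many rows are read, which is handy for a quick peek.
    reader defaults to pandas.read_csv and can be swapped out for testing.
    """
    if name not in DATASET_URLS:
        raise KeyError(f"unknown dataset {name!r}; choose from {sorted(DATASET_URLS)}")
    return reader(DATASET_URLS[name], nrows=nrows)
```

In a notebook cell, `load_dataset("titles", nrows=5)` would fetch only the first five rows, which is a cheap way to check the column names before loading a whole file.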

 

Workshop GitHub - https://github.com/lidderupk/lidderupk-ibmdevelopersf-jupyternotebooks

Some links to get data

US Government Open Data - https://www.data.gov/

San Francisco Open Data - https://datasf.org/opendata/

IBM Data Asset eXchange - https://developer.ibm.com/exchanges/data/

Kaggle Datasets - https://www.kaggle.com/datasets

Google Datasets - https://cloud.google.com/public-datasets/

Curated on GitHub - https://github.com/awesomedata/awesome-public-datasets

Ryan Anderson Blog - https://dreamtolearn.com/ryan/1001_datasets

Thank you

Let's chat!

Upkar Lidder, IBM

@lidderupk

http://upkarlidder.com/talks/

ulidder@us.ibm.com
