Open Data Kickoff

 

Maksim Pecherskiy

Performance & Analytics Department

Agenda

  • Introductions
  • Administrative/'Housekeeping' 
  • Overview of Open Data Program
  • What is Data?
  • Terminology
  • Roles and Responsibilities
  • Inventory
  • Support
  • Key Takeaways
  • Next Steps

Housekeeping

  • Please make sure you've signed in
  • Make sure you have a handout

Who Is This Guy?

  • Chief Data Officer, City of San Diego
  • Performance & Analytics Department
  • Lots of years doing software engineering
  • Worked in Puerto Rico through Code for America
  • Saw how powerful data can be when used inside government

I'm Sorry!

I'm terrible at designing slides

Why Are We Here?

  • Provide High Quality Public Service
  • Work in partnership with all our communities to achieve safe and livable neighborhoods.
  • Create and sustain a resilient, economically prosperous city.

 

Opening Data ties in directly with each of our Strategic Goals and allows us to monitor how we're doing at meeting them

Why Are You Here?

  • Information Coordinator.
  • Work with P&A and me to create the data inventory.
  • Know department and business.
  • Collective, in-depth knowledge of city operations.
  • Know the people to go to for answers.
  • Represent your department's / division's data profile.
  • Not necessarily technical, but understand how your data is used.
  • Your boss made you.

What do Information Coordinators do?

  • Inventorying department data sets
  • Establishing a plan and timeline for publishing them
  • Serving as a key point of accountability for timelines and questions about data sets
  • Advising on privacy, data licensing, metadata and other standards and practices
  • Providing quarterly reports on progress in implementing the open data plan
  • The bulk of the workload falls during the inventory; after that, we work together to release data.

Working Together

  • Respect each other's time.
  • No useless meetings.
  • Support.
  • Addressing Concerns.
  • Accessibility.
  • Streamlining.
  • Partnership.
  • Mutual Learning.
  • Continuous Feedback.
  • Meeting deadlines.
  • Push out of comfort zone.
  • Keeping you awake.
  • Please Ask Questions!

San Diego

Open Data Policy

  • Passed December 2014
  • Draws on other existing policies.

  • Defines terms, making sure data meets "open criteria"

  • Assigns responsibilities to Chief Data Officer and to City Departments

  • Sets timeline 

  • Includes reporting requirements to Mayor and Council

Upcoming Key Dates for Implementation

  • March 31, 2015 - Guidelines for Data Inventory
  • June 1, 2015 - Inventory to be completed by departments
  • July 1, 2015 - Technical Guidelines (To address release protocol and PII)
  • July 1, 2015 - Initial Written Status Report

Let's Talk About Data

State of Data, SD, 2015

  • Closed
  • Misunderstood
  • Hard to find
  • Unknown
  • Scattered
  • Uncentralized

State of Data, SD, 2015

  • PRA Nightmares
  • Misinterpretation
  • Slowdowns
  • Re-Work
  • Lack of Innovation
  • Frustration

And How It Feels to Work With It

  • Download PDF
  • Download Tabula
  • Download Java
  • Extract Page #5
  • Run Tabula
  • Select and Extract Table Data
  • Import into Excel
  • Fix headers
  • Paste into viz tool
  • Visualize

Total Time: 1 Hour

  • Accessible
  • Described
  • Reusable
  • Timely
  • Complete
  • Treated like asset
  • Centralized
  • Machine Readable

Where we can be

  • Efficient PRA
  • No Re-Work
  • Innovation
  • Empowerment
  • Efficiency
  • Engaged Citizens
  • Minimized Misinterpretation
  • Reliability
  • Data Usage for decisions
  • Cool apps!

We Can...

  • Empower people to build applications.
  • Empower consumers of those applications.
  • Allow city employees to be more efficient and innovative.
  • Allow taxpayers to benefit from a more efficient, nimble government
  • Build our own dashboards
  • Communicate with our citizens
  • Spend less time on PRAs
  • Do our own analysis and make data driven decisions
  • Give data back to their owner - the taxpayer!

Summing it Up

"Open data is not just about putting spreadsheets on the internet.

 

It means being deliberate and thoughtful about what is released, how it's released, and how it's described. 


Treating data like an asset and releasing it properly goes hand in hand with making sure that it's secure and mitigating opportunities for misuse."

We Can go From This

To This

San Diego

can be a leader in this space.

First Some Definitions

What is Data?

A value or set of values representing a specific concept or concepts.

 

Data become “information” when analyzed and possibly combined with other data in order to extract meaning and to provide context.

 

The meaning of data can vary depending on its context.

 

What is Data?

"Data is something you can take, and do something else with."

(Besides print it or send it in an e-mail).

NOT Data

Data

Open Data

Open Data is

Data in an Open Machine Readable Format

  • CSV (not XLS)
  • Shapefiles
  • GeoJSON
  • iCal
  • JSON
  • XML
  • API

Data

Not Beautiful

[Open] Data is Not

A Website

A Dashboard or Chart

 

A Map

A PDF / Word Doc or E-Mails

But It Can Be Visualized

Data

OR

OR

But It Can Be Visualized

Data

Data Visualization

Important

We MUST make it possible to separate data from how it's shown, so that it has value beyond what the visualization intended.

Important

Users may not share your opinion of what to visualize in the same data, or how to visualize it.

Important

"Data" can be separated from how it's displayed.
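
As a rough illustration of that separation, the Python sketch below keeps the data in one place and renders it two different ways. The monthly counts and function names are invented for this example; the point is only that either view, or a third one the publisher never thought of, can be built once the raw rows are available.

```python
# Hypothetical monthly counts; values are invented for illustration.
DATA = [("Jan", 3), ("Feb", 5), ("Mar", 2)]

def as_table(rows):
    """Render the rows as aligned text columns."""
    return "\n".join(f"{month:<5}{count:>3}" for month, count in rows)

def as_bars(rows):
    """Render the same rows as a simple text bar chart."""
    return "\n".join(f"{month} {'#' * count}" for month, count in rows)

# The data is unchanged; only the presentation differs.
print(as_table(DATA))
print(as_bars(DATA))
```

A chart image or PDF would only give the user one of these views, with no way back to the rows underneath.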

Machine Readable, Open Formats

  • Reasonably structured to allow automated processing
  • Open formats are non-proprietary and publicly available, with no restrictions placed upon their use.
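
To make "automated processing" concrete, here is a minimal Python sketch that assumes a hypothetical pothole extract in CSV form (the column names and values are invented for illustration). Because the format is open and structured, the standard library can read it directly, with no PDF extraction or manual header cleanup.

```python
import csv
import io

# Hypothetical CSV extract; column names and values are invented for illustration.
SAMPLE_CSV = """pothole_id,street,status
1,Main St,open
2,Broadway,repaired
3,Main St,open
"""

def count_by_status(csv_text):
    """Count rows per value of the 'status' column."""
    counts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row["status"]] = counts.get(row["status"], 0) + 1
    return counts

print(count_by_status(SAMPLE_CSV))
```

The same few lines would work on a file of any length, which is what makes a format like CSV useful for reuse.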

Data Source

  • Technology or system that stores data, including databases, named spreadsheets, information systems, business applications, etc.

Dataset

Contents of a single database table, worksheet or defined view; data is provided as a single combination of unique rows (or records) and corresponding columns (or fields) describing each row.

 

Example - Database: A database may contain several data tables - each data table constitutes a dataset. However, you could also create new datasets by combining data from different tables into a new table.

Anything used to build a table / chart / map
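
The database example can be sketched in Python with an in-memory SQLite database. The table names, columns and values below are hypothetical; the sketch only shows that each table is a dataset on its own and that combining them yields a new one.

```python
import sqlite3

# Hypothetical tables; names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hydrants (hydrant_id INTEGER PRIMARY KEY, street TEXT);
    CREATE TABLE inspections (hydrant_id INTEGER, inspected_on TEXT);
    INSERT INTO hydrants VALUES (1, 'Main St'), (2, 'Broadway');
    INSERT INTO inspections VALUES (1, '2015-01-15'), (2, '2015-02-03');
""")

# Each table is a dataset on its own; the join below defines a third one.
combined = conn.execute("""
    SELECT h.hydrant_id, h.street, i.inspected_on
    FROM hydrants AS h
    JOIN inspections AS i ON i.hydrant_id = h.hydrant_id
    ORDER BY h.hydrant_id
""").fetchall()

for row in combined:
    print(row)
```

For inventory purposes, a combined view like this would be listed as its own dataset if it is what feeds a report, chart or map.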

MetaData

Data that Describes Data

  • Value Ranges
  • Column Descriptions
  • Responsible Stewards
  • Related data
  • Etc
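
As a rough sketch of what such a description might look like in machine-readable form, the Python example below builds one metadata record and uses it to spot an undocumented column. The field names and values are invented for illustration and are not the city's official catalog template.

```python
import json

# Hypothetical metadata record; field names and values are invented for
# illustration and are not an official catalog template.
metadata = {
    "dataset": "street_sweeping_schedule",
    "columns": {
        "street": "Name of the street segment",
        "day_of_week": "Day the segment is swept",
    },
    "value_ranges": {"day_of_week": ["Mon", "Tue", "Wed", "Thu", "Fri"]},
    "steward": "Example Department, Example Division",
    "related_data": ["street_names"],
}

def undocumented_columns(header, meta):
    """Return columns present in the data but missing a description."""
    return [name for name in header if name not in meta["columns"]]

print(json.dumps(metadata, indent=2))
print(undocumented_columns(["street", "day_of_week", "route_id"], metadata))
```

Keeping the description alongside the data makes gaps like the missing `route_id` description easy to find before release.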

ETL

  • Extract, Transform, Load
  • Pull data from a database, change it around based on specification for release, upload it to a portal.
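
The three steps can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the `requests` table, its columns, the rule that the e-mail column is dropped before release, and the in-memory buffer that stands in for the upload to a portal.

```python
import csv
import io
import sqlite3

def extract(conn):
    """Extract: pull raw rows from the source database."""
    return conn.execute(
        "SELECT request_id, requester_email, status "
        "FROM requests ORDER BY request_id"
    ).fetchall()

def transform(rows):
    """Transform: drop the e-mail column and normalize the status text."""
    return [(request_id, status.strip().lower())
            for request_id, _email, status in rows]

def load(rows, out):
    """Load: write release-ready rows as CSV (stand-in for a portal upload)."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["request_id", "status"])
    writer.writerows(rows)

# Hypothetical source table; names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests (request_id INTEGER, requester_email TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [(1, "a@example.com", " Open "), (2, "b@example.com", "CLOSED")],
)

buffer = io.StringIO()
load(transform(extract(conn)), buffer)
print(buffer.getvalue())
```

Keeping the three steps as separate functions means the release rules in `transform` can change without touching how data is pulled or published.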

Open Data Portal

  • A central place on the internet for Open Data to Live

Roles / Responsibilities

Quiz - What Is Data?

  • Budget PDF (on our site)
  • A CSV file
  • A map embedded in our web site
  • An excel spreadsheet
  • A graph in a PPT presentation.
  • A dashboard

Quiz - What Is Open Data?

  • Budget PDF (on our site)
  • A CSV file
  • A map embedded in our web site
  • An excel spreadsheet
  • A report in a Word Document
  • A PDF with a table in it
  • An E-Mail
  • A graph in a PPT presentation.
  • A dashboard

The Inventory

  • Due June 1, 2015
  • Continuous, Recurring Process
  • Need to get a bird's eye view of information we have
  • Will help people across departments find data faster
  • Took other cities as long as 8 months
  • We can beat them in 3

Not All Inventoried Datasets will be Released

The Inventory - 3 Main Steps

  • Identify Data Sources - April 1 (NOT an April Fool's JOKE!!!) 
  • Identify Data Sets - May 1
  • Complete Catalog - June 1

Not All Inventoried Datasets will be Released

Step 1

Identify Data Sources

Data Source

  • Technology or system that stores data, including databases, named spreadsheets, information systems, business applications, etc.

Identify Data Sources

  • Our data lives in many places
  • Information systems
  • Databases
  • Excel Spreadsheets
  • Access Databases
  • ESRI

Identify Data Sources

  • What databases does your department use?
  • What information systems does your department use?
  • What applications capture information or are used in your business processes?
  • Are there some data sources kept in spreadsheets on your desktop?
  • Are there some data that you work on with other people stored on shared drives?
  • Are you already publishing information out on the web or in reports?  Where does that information come from?
  • Do you use Excel spreadsheets or Access databases to hold any information?

Spreadsheets

NOT every single spreadsheet your department owns.

  • Periodically updated 
    • Daily
    • Monthly
    • Quarterly
    • Yearly
  • Used to run reports 
  • Used to update leadership

Spreadsheets

  • Some examples of spreadsheets we're not looking for:

    • Timesheets.  

    • Project plans / Gantt charts.

    • Personal tracking sheets.  

    • One-time documents developed for a specific project.

Spreadsheets

  • Some examples of what we are looking for:

    • List of fire hydrants.

    • Performance measures.

    • Potholes list.

    • Street sweeping locations and times. 

    • PRA request history. 

    • Workforce centers. 

    • Street names.

    • Libraries - locations and hours.

    • List of Lists (Dataset that contains information about other datasets)

    • Reference data (data that provides a reference for another dataset).

       

Roll-Up Reports

Identify Data Sources

Helpful References

  • Application List
  • Online Services List

Step 2

Brainstorm Datasets

Not All Inventoried Datasets will be Released

Low

  • Single Division / Group / Person has a good handle on all the data that flows in and out of the department

 

  • Easy to get questions answered by walking the hall or making a quick visit to someone's office

Inventory Complexity

Medium

  • Multiple divisions produce / use data

  • Known group of people / handful of individuals have a good handle on who produces what data and for what purpose

  • Even if the department is large and you don't know all your colleagues, you can quickly identify the right person to talk to about your department's data

  • Questions can be answered easily over e-mail / quick call or a quick walk around the building

Inventory Complexity

High

  • Nearly every division/group uses/maintains/reports on data and there is no readily defined group / set of individuals who have a good handle on all the data in the department.

  • Even if each division/group has an individual who understands its data, there may still be blind spots to explore.

  • Your department may be spread over different buildings or many floors of a building and getting answers to questions may take many phone loops or emails to find the “right” person.

Inventory Complexity

Which one are You?

Dataset

Contents of a single database table, worksheet or defined view; data is provided as a single combination of unique rows (or records) and corresponding columns (or fields) describing each row.

 

Example - Database: A database may contain several data tables - each data table constitutes a dataset. However, you could also create new datasets by combining data from different tables into a new table.

Anything used to build a table / chart / map

Identify Datasets

  • What data populates your monthly / quarterly reports?

  • What departmental data is currently publicly available?

  • What data does your department use for internal performance and trend analysis?

  • What information is published as a KPI in the budget or as a performance metric elsewhere?

  • If you were to build a dashboard for your department, what would the metrics be and where would the data come from?

  • What data is reported to federal, state or local agencies?

  • What data requests do you receive under the PRA?

  • What data do other departments ask for, or do you share with other departments now?

  • What kinds of open data are similar agencies across the country publishing?

Identify Datasets

Resources

  • data.cityofchicago.org

  • nycopendata.socrata.com

  • data.sfgov.org

Step 3

Complete Catalog

Complete Catalog

Complete Metadata as described in the provided file

Complete Catalog (MetaData)

Example, eh?

To Clarify

  • 1 Spreadsheet Per Department
    • Multiple Coordinators Work Together in Real Time
  • Spreadsheet will contain reminder of instructions

Not All Inventoried Datasets will be Released

What Happens After the Inventory?

  • Technical Guidelines
    • PII
    • Publishing Plan
    • Data Review 
  • Prioritization Plan
  • Review Inventory, prioritize

Not All Inventoried Datasets will be Released

Support

  • Open Hours / Book A Meeting
  • Templates / Tools
  • Resources and Trainings
  • Expect a lot of communication by e-mail
  • E-Mail me anytime

Next Steps

  • Create GMail / Google Drive Account by March 9
  • Submit it to me via the form 
  • Complete Data Source spreadsheet (I will send it to you after receiving your Google account).
  • Inventory of Data Sources due by March 23
  • Next meeting beginning of April
  • Look out for training invitations
  • If you have questions about how to fill out Data Source list, please let me know.

Key Takeaways

  • I know we're asking for a lot, but it's worth it.  
  • I will value your time and hope you will value mine.  
  • Data must be able to be separated from the way it's visualized.
  • Not all inventoried data will be made public.  The inventory is for us to see what we have.
  • Inventory deadlines are:
    • April 1 (Data Sources)
    • May 1 (Dataset brainstorm)
    • June 1 (Catalog)
  • There are no bad questions.  We will learn from each other and make this the best process possible.

If You Changed Your Mind...

  • If you feel like you don't have the knowledge / capacity / time to act as information coordinator
  • Assign someone else! I won't be mad!  But please let me know - maksimp@sandiego.gov
  • If there is no one else, welcome to the team, I'm here to help!

See this

presentation Online!

Looking Forward To Working Together!

 

maksimp@sandiego.gov

 

Let's Do

Something

AWESOME

Together!

Open Data Kickoff

By sdcdo
