A Blagger's Guide To Data
"Data literacy spans a broad set of technologies, skills and processes. It is the ability to read, work with, analyse and argue with data."
Unstructured
Structured
JSON
Tweet
Lists
RDF
Spreadsheets
Relational SQL Databases
HTML pages
Text documents
Folders
NoSQL Databases
Unstructured data
UNIX 1969
Perl
Unstructured data
Files and Folders
Apple Lisa 1983
Perl
1950s - Stephen Cole Kleene
1979 - VisiCalc invented by Dan Bricklin
Spreadsheets
Why Spreadsheets?
Fluid and adaptable
Grid-like, structured
Organised
Calculations
Aggregations, totals, averages, tax
Spreadsheets in 1 min
Headers
Clean data!
Validation (Lookup)
Formulas
Pivot tables
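A pivot table is really just "group the rows, then total them". Here is a minimal sketch of that idea in plain Python. The rows, the column names (region, product, sales) and the helper pivot_sum are all made up for illustration; a real spreadsheet does this for you.

```python
from collections import defaultdict

# Hypothetical rows, as they might look in a sheet with a header row.
rows = [
    {"region": "North", "product": "Tea", "sales": 120.0},
    {"region": "North", "product": "Coffee", "sales": 80.0},
    {"region": "South", "product": "Tea", "sales": 50.0},
    {"region": "South", "product": "Tea", "sales": 25.0},
]


def pivot_sum(rows, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) pair, like a pivot table."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[row_key]][row[col_key]] += row[value_key]
    # Convert to plain dicts so the result prints and compares cleanly.
    return {r: dict(cols) for r, cols in table.items()}


pivot = pivot_sum(rows, "region", "product", "sales")
print(pivot)
```

This only works because the data is clean: one header per column, one value per cell. Misspell "North" in one row and it becomes its own group.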
Common Spreadsheet Errors
vs
Originally a card-based index system
Visualised and used
...becomes...
DATABASES TO THE RESCUE?
1960s -> 1970s -> 1980s - Relational Databases
Why Databases?
Server
Schemas
Constraints
Queries
ACID. Transactions
DRY TABLES!
SQL in 1974
Indexing
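Constraints and transactions are easier to see than to describe. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The accounts table, its columns and the transfer helper are assumptions invented for this example, not part of any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # the constraint
)
conn.execute("INSERT INTO accounts (owner, balance) VALUES ('Tom', 100)")
conn.execute("INSERT INTO accounts (owner, balance) VALUES ('Ann', 20)")
conn.commit()


def transfer(conn, from_id, to_id, amount):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused a negative balance.
        return False


ok = transfer(conn, 1, 2, 30)       # fine: Tom has enough
failed = transfer(conn, 1, 2, 500)  # refused: Tom would go negative
balances = dict(conn.execute("SELECT owner, balance FROM accounts"))
print(ok, failed, balances)
```

In the failed transfer the credit to Ann has already run when the debit is rejected. The rollback undoes it, so no money appears from nowhere. That all-or-nothing behaviour is the "A" in ACID.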
Databases in 1 min
Tables
Ids
Each row has a unique Id/key
Foreign keys
SELECT image, owner, date
FROM pictures
WHERE owner = 'Tom'
ORDER BY date DESC
LIMIT 20
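To try a query like the one above, here is a small sketch using Python's built-in sqlite3 module. It assumes a slightly more "DRY" layout than the slide: a hypothetical owners table with its own ids, and a pictures table that points at it through a foreign key. The table names, columns and sample rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript(
    """
    CREATE TABLE owners (
        id   INTEGER PRIMARY KEY,          -- each row has a unique id
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE pictures (
        id       INTEGER PRIMARY KEY,
        image    TEXT NOT NULL,
        owner_id INTEGER NOT NULL REFERENCES owners(id),  -- foreign key
        date     TEXT NOT NULL
    );
    INSERT INTO owners (id, name) VALUES (1, 'Tom'), (2, 'Ann');
    INSERT INTO pictures (image, owner_id, date) VALUES
        ('cat.png', 1, '2020-01-05'),
        ('dog.png', 2, '2020-02-01'),
        ('sea.png', 1, '2021-03-09');
    """
)

# Same shape as the slide's query, with a JOIN to follow the foreign key.
rows = conn.execute(
    """
    SELECT p.image, o.name AS owner, p.date
    FROM pictures AS p
    JOIN owners AS o ON o.id = p.owner_id
    WHERE o.name = ?
    ORDER BY p.date DESC
    LIMIT 20
    """,
    ("Tom",),
).fetchall()

for row in rows:
    print(row)
```

Storing the owner's name once and referring to it by id is what keeps the tables DRY: rename Tom in one place and every picture follows.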
Why Not Databases?
People often find them difficult?
The Schema is (deliberately) restrictive
Your data is very complex / unstructured
Your data WILL become more complex
The Olden Days
Every single letter used mattered.
WELCOME TO THE FUTURE OF DATA
What's new?
Not much tbh
Better, faster, more reliable
Work with more data
Visualisation
Better tools
Online
Collaborative
They all "know about each other"
Surely there's better news than that?
It's all in the cloud now
NoSQL databases
Big Data
NoCode tools are becoming "a thing"
APIs
Application Programming Interface
JSON or XML or TEXT or CSV
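Whatever format an API hands back, the first job is turning text into something you can work with. A rough sketch with Python's standard library, using two made-up payloads (json_text and csv_text) that describe the same records:

```python
import csv
import io
import json

# Hypothetical API payloads describing the same two pictures.
json_text = '[{"image": "cat.png", "owner": "Tom"}, {"image": "dog.png", "owner": "Ann"}]'
csv_text = "image,owner\ncat.png,Tom\ndog.png,Ann\n"

# JSON maps straight onto lists and dictionaries.
from_json = json.loads(json_text)

# CSV needs its header row to name the columns.
from_csv = [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]

print(from_json == from_csv)
```

One catch: CSV values always arrive as text, while JSON keeps numbers, booleans and nesting. These two only match because every value here is a string.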
More data than ever
is available
Graph Databases
Professional vs Beginner
Visualisations
Google Data Studio
You can work with Google Data Studio too (if your data is clean)
Even more visualisation tools here
The Coding Elephant in the room
What area? Swiss-army knife vs statistical vs fun vs web vs speed vs £££ vs reliability vs specialised?
Python, JavaScript
Core data skills
- Organise and Design Concepts
- Re-shape and clean data
...then...
- Interrogate or integrate data (i.e. use)
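"Clean first, then use" might look something like this in Python. The messy rows, the column names and the clean_row helper are all hypothetical; real data will be messy in its own special ways.

```python
# Hypothetical messy rows, as they might arrive from a hand-edited sheet.
raw = [
    {"Name": " tom smith ", "Amount": "£1,200.50"},
    {"Name": "ANN LEE", "Amount": "300"},
    {"Name": "", "Amount": "n/a"},
]


def clean_row(row):
    """Return a tidied row, or None if the row is unusable."""
    # Collapse stray whitespace and normalise capitalisation.
    name = " ".join(row["Name"].split()).title()
    # Strip currency symbols and thousands separators before converting.
    amount_text = row["Amount"].replace("£", "").replace(",", "").strip()
    try:
        amount = float(amount_text)
    except ValueError:
        return None
    if not name:
        return None
    return {"name": name, "amount": amount}


# Re-shape and clean...
cleaned = [c for c in (clean_row(r) for r in raw) if c is not None]

# ...then interrogate.
total = sum(c["amount"] for c in cleaned)
print(cleaned, total)
```

Dropping bad rows silently is a choice, not a rule. For anything that matters, count what was thrown away and find out why.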
UML
Notation for databases, business processes, services etc
OpenRefine
Orange3
Thank you
Questions?
A Blagger's Guide To Data
By Tom Smith