PHC7065 CRITICAL SKILLS IN DATA MANIPULATION FOR POPULATION SCIENCE
Spatial Data
Hui Hu Ph.D.
Department of Epidemiology
College of Public Health and Health Professions & College of Medicine
March 19, 2018
Introduction to Spatial Data
Lab: Spatial Data
Introduction to Spatial Data
Data Models
A geographic data model is a structure for organizing geospatial data so that it can be easily stored and retrieved.
Geographic coordinates
Tabular attributes
Spatial Data Models
Vector Model
- points, lines, polygons
Raster Model
- exhaustive regular or irregular partitioning of space
Points
Lines
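A minimal sketch of how the two models might be held in memory, to make the contrast concrete. All coordinates and the `Raster` class are made up for illustration, and only a regular grid is shown (the raster model also allows irregular partitions).

```python
from dataclasses import dataclass

# Vector model: features are explicit coordinate sequences (x, y).
point = (-73.99, 40.73)                      # a single location
line = [(-74.00, 40.70), (-73.98, 40.75)]    # an ordered list of vertices
polygon = [(-74.0, 40.7), (-73.9, 40.7), (-73.9, 40.8), (-74.0, 40.7)]  # closed ring


@dataclass
class Raster:
    """Raster model (hypothetical helper): a regular grid plus its georeference."""
    x_min: float      # x coordinate of the grid's left edge
    y_max: float      # y coordinate of the grid's top edge
    cell_size: float  # width and height of each square cell
    values: list      # rows of cell values, top row first

    def cell_center(self, row, col):
        """Return the (x, y) coordinate at the center of a cell."""
        x = self.x_min + (col + 0.5) * self.cell_size
        y = self.y_max - (row + 0.5) * self.cell_size
        return (x, y)


grid = Raster(x_min=0.0, y_max=20.0, cell_size=10.0, values=[[1, 2], [3, 4]])
print(grid.cell_center(0, 0))       # center of the top-left cell
print(polygon[0] == polygon[-1])    # a polygon ring ends where it starts
```

In the vector model every feature carries its own coordinates; in the raster model coordinates are implied by the cell's row and column.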
Shapefiles
.shp - the file that stores the geometry of the feature
.shx - the file that stores the index of the feature geometry
.dbf - the dBASE file that stores the attribute information
.prj - the file that defines the shapefile's projection
.html, .htm, .xml - the files that usually contain metadata
.sbn and .sbx - store additional indices
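A small sketch for checking which parts of a shapefile are present in a list of file names. The function name and the sample file names are hypothetical. Treating .shp, .shx, and .dbf as required follows the ESRI shapefile specification and is not stated on the slide.

```python
from pathlib import PurePath

REQUIRED = {".shp", ".shx", ".dbf"}                       # geometry, index, attributes
OPTIONAL = {".prj", ".sbn", ".sbx", ".xml", ".html", ".htm"}


def check_shapefile_parts(filenames, base_name):
    """Report missing required parts and present optional parts for one shapefile."""
    present = set()
    for name in filenames:
        # Only consider files that belong to this shapefile (same base name).
        if name.startswith(base_name + "."):
            present.add(PurePath(name).suffix.lower())
    return {
        "missing_required": sorted(REQUIRED - present),
        "optional_present": sorted(OPTIONAL & present),
    }


files = ["nyc_streets.shp", "nyc_streets.dbf", "nyc_streets.prj", "other.shx"]
print(check_shapefile_parts(files, "nyc_streets"))
```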
Coordinate Systems and Projections
3D sphere
Geographic Coordinate System
2D flat
Projected Coordinate System
Geographic Coordinate Systems
- Longitude and latitude
- Units: Degrees (DMS or DD)
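The two degree notations are related by DD = D + M/60 + S/3600, with south and west treated as negative. A minimal sketch of the conversion (the function name and sample coordinates are illustrative):

```python
def dms_to_dd(degrees, minutes, seconds, hemisphere):
    """Convert degrees-minutes-seconds (DMS) to decimal degrees (DD)."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    dd = abs(degrees) + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal degrees.
    return -dd if hemisphere in ("S", "W") else dd


print(dms_to_dd(40, 45, 0, "N"))    # latitude
print(dms_to_dd(73, 59, 24, "W"))   # longitude
```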
Shape of the Earth
- Surface: The Earth's real surface
- Ellipsoid: Ideal, smooth surface
- Geoid: Bumpy surface on which the gravity potential is equal at all locations
Datum
- Defines the position of the spheroid relative to the center of the earth.
- Global datum:
- uses the earth's center of mass as the origin
- Local datum:
- aligns its spheroid to closely fit the earth's surface in a particular area
- a point on the surface of the spheroid is matched to a particular position on the surface of the earth
- the coordinate system origin of a local datum is not at the center of the earth
Datum
Common Local Datum: North American Datum (NAD)
Common Global Datum: World Geodetic System (WGS)
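Each datum is tied to a reference ellipsoid: NAD83 uses GRS 80 and WGS 84 uses the WGS 84 ellipsoid. The sketch below compares the two using their published semi-major axis and inverse flattening. It illustrates only the ellipsoid part; a full datum also fixes the ellipsoid's position and orientation, which is not modeled here.

```python
# (semi-major axis a in meters, inverse flattening 1/f)
ELLIPSOIDS = {
    "WGS 84": (6378137.0, 298.257223563),
    "GRS 80": (6378137.0, 298.257222101),
}


def semi_minor_axis(a, inv_f):
    """Semi-minor (polar) axis b = a * (1 - f), where f = 1 / inv_f."""
    return a * (1 - 1 / inv_f)


for name, (a, inv_f) in ELLIPSOIDS.items():
    print(name, round(semi_minor_axis(a, inv_f), 4))
```

The two ellipsoids differ by roughly a tenth of a millimeter in the polar axis, so differences between NAD83 and WGS 84 coordinates come mainly from how each datum is realized rather than from the ellipsoid shape.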
Projected Coordinate Systems
- A projected coordinate system is defined on a flat, two-dimensional surface
- Unlike a geographic coordinate system, a projected coordinate system has constant lengths, angles, and areas across the two dimensions
- A projected coordinate system is always based on a geographic coordinate system
Map projection: the systematic rendering of a graticule (the grid of latitude and longitude lines) on a flat map surface
Distortion
Converting a sphere to a flat surface results in distortion
- Shape (conformal) - If a map preserves shape, then feature outlines (like county boundaries) look the same on the map as they do on the earth.
  - Lambert Conformal Conic
  - UTM
- Area (equal-area) - If a map preserves area, then the size of a feature on the map is the same relative to its size on the earth.
  - Albers Equal Area Conic
- Distance (equidistant) - An equidistant map is one that preserves true scale for all straight lines passing through a single, specified point. If a line from a to b on a map is the same distance that it is on the earth, then the map line has true scale. No map has true scale everywhere.
- Direction/Azimuth (azimuthal) - An azimuthal projection is one that preserves direction for all straight lines passing through a single, specified point.
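To put a number on the trade-off, the sketch below uses the spherical Mercator projection, which is not named on the slide and is chosen here only because its formulas are short. It is conformal: at any point the scale factor k = 1/cos(latitude) is the same in every direction, so local shapes are kept, but areas are inflated by k squared.

```python
import math


def mercator_scale_factor(lat_deg):
    """Linear scale factor of the spherical Mercator projection at a latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))


def mercator_area_factor(lat_deg):
    """Area inflation at a latitude: the square of the linear scale factor."""
    return mercator_scale_factor(lat_deg) ** 2


for lat in (0, 30, 60):
    print(lat, round(mercator_scale_factor(lat), 3), round(mercator_area_factor(lat), 3))
```

At 60 degrees latitude lengths are doubled and areas quadrupled, which is why a conformal map cannot also be equal-area.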
Universal Transverse Mercator Coordinate System
- World divided into 60 six-degree-wide zones
- From 80S to 84N
- Zones numbered 1-60 (N&S), W to E, starting at 180W
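The zone rules above translate into simple arithmetic. A minimal sketch, assuming plain six-degree zones and ignoring the special-case zones around Norway and Svalbard (function names are illustrative):

```python
import math


def utm_zone(lon_deg):
    """UTM zone number (1-60) for a longitude, counting eastward from 180W."""
    if not -180 <= lon_deg <= 180:
        raise ValueError("longitude must be between -180 and 180")
    # Longitude 180E would compute as zone 61, so cap at 60.
    return min(int(math.floor((lon_deg + 180) / 6)) + 1, 60)


def central_meridian(zone):
    """Longitude of the central meridian of a UTM zone, in degrees."""
    return zone * 6 - 183


def in_utm_latitude_range(lat_deg):
    """UTM is defined from 80S to 84N."""
    return -80 <= lat_deg <= 84


print(utm_zone(-74.0), central_meridian(utm_zone(-74.0)))  # New York City
```

New York City (about 74W) falls in zone 18, whose central meridian is 75W.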
Differences between Projections
SRID
- Spatial Reference IDentifier
- It defines all the parameters of our data’s geographic coordinate system and projection.
- An SRID is convenient because it packs all the information about a map projection (which can be quite complex) into a single number.
- Example: EPSG:26918 (NAD83 / UTM zone 18N): http://spatialreference.org/ref/epsg/26918/
- What if you do not know the SRID?
- upload the .prj file and get the SRID here: http://prj2epsg.org/search
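Before looking up an SRID, it can help to read the names stored in the .prj file, which is plain well-known text (WKT). The sketch below pulls out the main names with regular expressions. The `SAMPLE_PRJ` string is an abbreviated, illustrative example of an ESRI-style .prj, and `summarize_prj` is a hypothetical helper, not a full WKT parser.

```python
import re

# Abbreviated, illustrative .prj content (assumption: ESRI-style WKT).
SAMPLE_PRJ = (
    'PROJCS["NAD_1983_UTM_Zone_18N",'
    'GEOGCS["GCS_North_American_1983",'
    'DATUM["D_North_American_1983",'
    'SPHEROID["GRS_1980",6378137.0,298.257222101]],'
    'PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],'
    'PROJECTION["Transverse_Mercator"],'
    'PARAMETER["Central_Meridian",-75.0],'
    'UNIT["Meter",1.0]]'
)


def summarize_prj(wkt):
    """Extract the first quoted name that follows each WKT keyword."""
    def first(keyword):
        match = re.search(keyword + r'\["([^"]+)"', wkt)
        return match.group(1) if match else None

    return {
        "projected_cs": first("PROJCS"),    # None for unprojected data
        "geographic_cs": first("GEOGCS"),
        "datum": first("DATUM"),
        "projection": first("PROJECTION"),
    }


print(summarize_prj(SAMPLE_PRJ))
```

The names returned (here NAD 1983, UTM zone 18N) are what a search service matches against to suggest an SRID.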
Lab: Spatial Data
Data
nyc_census_blocks
blkid - A 15-digit code that uniquely identifies every census block, e.g., 360050001009000
popn_total - Total number of people in the census block
popn_white - Number of people self-identifying as “White” in the block
popn_black - Number of people self-identifying as “Black” in the block
popn_nativ - Number of people self-identifying as “Native American” in the block
popn_asian - Number of people self-identifying as “Asian” in the block
popn_other - Number of people self-identifying with other categories in the block
boroname - Name of the New York borough: Manhattan, The Bronx, Brooklyn, Staten Island, Queens
geom - Polygon boundary of the block
Number of records: 36592
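The 15-digit blkid can be split into its geographic levels. The sketch below assumes blkid follows the standard Census block identifier layout (2-digit state, 3-digit county, 6-digit tract, 4-digit block); the slide does not state this, and the function name is illustrative.

```python
def split_block_id(blkid):
    """Split a 15-digit census block ID into state, county, tract, and block codes."""
    if len(blkid) != 15 or not blkid.isdigit():
        raise ValueError("blkid must be a string of exactly 15 digits")
    return {
        "state": blkid[0:2],     # state FIPS code
        "county": blkid[2:5],    # county FIPS code within the state
        "tract": blkid[5:11],    # census tract within the county
        "block": blkid[11:15],   # block within the tract
    }


print(split_block_id("360050001009000"))
```

For the example ID, state 36 is New York and county 005 is Bronx County. Keeping blkid as text rather than a number avoids losing leading zeros in other states' codes.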
Data
nyc_neighborhoods
name - Name of the neighborhood
boroname - Name of the New York borough: Manhattan, The Bronx, Brooklyn, Staten Island, Queens
geom - Polygon boundary of the neighborhood
Number of records: 129
New York has a rich history of neighborhood names and extent
Data
nyc_streets
name - Name of the street
oneway - Is the street one-way? “yes” = yes, “” = no
type - Road type (primary, secondary, residential, motorway)
geom - Linear centerline of the street
Number of records: 19091
Data
nyc_subway_stations
name - Name of the station
borough - Name of the New York borough: Manhattan, The Bronx, Brooklyn, Staten Island, Queens
routes - Subway lines that run through this station
transfers - Lines you can transfer to via this station
express - Stations where express trains stop, “express” = yes, “” = no
geom - Point location of the station
Number of records: 491
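The oneway column in nyc_streets and the express column here both encode yes/no as a keyword versus an empty string. A small sketch for turning such flags into booleans; the helper name and the sample records are hypothetical.

```python
def flag_to_bool(value, yes_token):
    """Map a keyword-or-empty flag column to True/False.

    yes_token is the string that means "yes" for that column:
    "yes" for nyc_streets.oneway, "express" for nyc_subway_stations.express.
    """
    if value == yes_token:
        return True
    if value == "":
        return False
    raise ValueError(f"unexpected flag value: {value!r}")


# Hypothetical records, shaped like rows from the two tables.
street = {"name": "Example St", "oneway": "yes"}
station = {"name": "Example Station", "express": ""}

print(flag_to_bool(street["oneway"], "yes"))        # one-way street
print(flag_to_bool(station["express"], "express"))  # not an express stop
```

Raising on unexpected values makes it obvious if a column contains something other than the two documented codes.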
git pull
Slides for Lecture 8, Spring 2018, PHC7065 Critical Skills in Data Manipulation for Population Science