PHC7065 CRITICAL SKILLS IN DATA MANIPULATION FOR POPULATION SCIENCE
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3351362/pasted-from-clipboard.png)
Spatial Data
Hui Hu Ph.D.
Department of Epidemiology
College of Public Health and Health Professions & College of Medicine
March 18, 2019
Introduction to Spatial Data
Lab: Spatial Data
Introduction to Spatial Data
Data Models
A geographic data model is a structure for organizing geospatial data so that it can be easily stored and retrieved.
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108846/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108850/pasted-from-clipboard.png)
Geographic coordinates
Tabular attributes
Spatial Data Models
Vector Model
- points, lines, polygons
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108862/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108864/pasted-from-clipboard.png)
Raster Model
- exhaustive regular or irregular partitioning of space
Points
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108908/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108909/pasted-from-clipboard.png)
Lines
Common Formats
- Well-known binary (WKB) and well-known text (WKT)
- the most common formats for spatial objects
- Keyhole Markup Language (KML)
- an XML-based format, used by Google
- SRS is always SRID 4326
- Geography Markup Language (GML)
- an XML-based format used in Web Feature Service
- Geometry JavaScript Object Notation (GeoJSON)
- a format based on JSON
- Scalable Vector Graphics (SVG)
- popular among high-end rendering or drawing tools
- Extensible 3D Graphics (X3D)
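To make two of these formats concrete, here is a minimal sketch that writes the same point as WKT and as GeoJSON using only the Python standard library. The function names and the example coordinates (roughly Gainesville, FL) are assumptions for illustration and are not from the slides. Both formats put longitude (x) before latitude (y).

```python
import json

def point_to_wkt(lon, lat):
    """Format a lon/lat pair as a WKT POINT (x = lon, y = lat)."""
    return f"POINT ({lon} {lat})"

def point_to_geojson(lon, lat):
    """Format a lon/lat pair as a GeoJSON Point geometry string."""
    return json.dumps({"type": "Point", "coordinates": [lon, lat]})

# Hypothetical example location, not taken from the lecture
lon, lat = -82.32, 29.65
wkt = point_to_wkt(lon, lat)
geojson = point_to_geojson(lon, lat)

print(wkt)
print(geojson)
```

In practice a spatial library or database would handle this conversion. The sketch only shows how little the two encodings differ for a simple geometry.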
Shapefiles
.shp - the file that stores the geometry of the feature
.shx - the file that stores the index of the feature geometry
.dbf - the dBASE file that stores the attribute information
.prj - the file that defines the shapefile's projection
.html, .htm, .xml - the files that usually contain metadata
.sbn and .sbx - store additional indices
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108901/pasted-from-clipboard.png)
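Because a shapefile is really a set of files sharing one base name, it can help to check that the pieces are all present before loading it. The sketch below is one possible check, not a standard tool: the helper name `missing_components` and the layer name `tracts` are hypothetical, and it assumes lowercase extensions.

```python
import tempfile
from pathlib import Path

REQUIRED_EXTS = (".shp", ".shx", ".dbf")   # geometry, index, attributes
OPTIONAL_EXTS = (".prj", ".sbn", ".sbx")   # projection, additional indices

def missing_components(shp_path, extensions=REQUIRED_EXTS):
    """Return the extensions whose file is absent next to shp_path."""
    shp = Path(shp_path)
    return [ext for ext in extensions if not shp.with_suffix(ext).exists()]

# Demo in a throwaway directory with a hypothetical layer named "tracts"
with tempfile.TemporaryDirectory() as tmp:
    shp = Path(tmp) / "tracts.shp"
    for ext in (".shp", ".dbf"):           # deliberately leave out .shx
        shp.with_suffix(ext).touch()
    missing = missing_components(shp)
    missing_optional = missing_components(shp, OPTIONAL_EXTS)

print(missing)
print(missing_optional)
```

A missing .prj file is worth flagging in particular, since without it the coordinates have no declared projection.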
Coordinate Systems and Projections
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108916/pasted-from-clipboard.png)
3D sphere
Geographic Coordinate System
2D flat
Projected Coordinate System
Geographic Coordinate Systems
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108938/pasted-from-clipboard.png)
- Longitude and latitude
- Units: Degrees (DMS or DD)
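Degrees-minutes-seconds (DMS) and decimal degrees (DD) express the same angle, and most software expects DD. A minimal conversion sketch follows; the function name and the sample coordinates are assumptions for illustration. South and west hemispheres become negative values.

```python
def dms_to_dd(degrees, minutes, seconds, hemisphere):
    """Convert degrees-minutes-seconds to signed decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; S and W give negative values.
    """
    dd = abs(degrees) + minutes / 60 + seconds / 3600
    return -dd if hemisphere.upper() in ("S", "W") else dd

# Hypothetical example: 29 deg 39' 0" N, 82 deg 19' 12" W
lat_dd = dms_to_dd(29, 39, 0, "N")
lon_dd = dms_to_dd(82, 19, 12, "W")
print(round(lat_dd, 6), round(lon_dd, 6))
```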
Shape of the Earth
- Surface: The Earth's real surface
- Ellipsoid: Ideal, smooth surface
- Geoid: Bumpy surface on which gravitational potential is equal at all locations
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108945/pasted-from-clipboard.png)
Shape of the earth (cont'd)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108945/pasted-from-clipboard.png)
- Gauss determined in the early 19th century that the surface of the earth can be defined using gravitational measurements
- geoid: the surface on which gravitational potential is equal at all locations
- Geoid is far from spherical
- the core of the earth is not homogeneous
- mass is distributed unevenly
- Geoid is the foundation of both planar and geodetic models
Ellipsoid
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108945/pasted-from-clipboard.png)
- Simplifications of the geoid which are generally good enough for most geographic modeling needs
- An ellipsoid is merely a 3D ellipse
- Instead of one ellipsoid to rule us all, people on different continents wanted their own ellipsoids to better reflect the regional curvature of the earth
- Today, the world is settling on the World Geodetic System (WGS 84) and Geodetic Reference System (GRS 80) ellipsoids
- WGS 84 is the standard of choice and is the reference system used by GPS
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/4505706/pasted-from-clipboard.png)
Common ellipsoids and their ellipsoidal parameters
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/4505716/pasted-from-clipboard.png)
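An ellipsoid is usually specified by its semi-major axis a and inverse flattening 1/f, from which the semi-minor axis follows as b = a(1 - f). The sketch below works that out for three ellipsoids. The parameter values are the commonly published ones and are supplied here as an assumption; they are not read from the table image above.

```python
# (semi-major axis a in meters, inverse flattening 1/f)
# Commonly published values, entered by hand for illustration.
ELLIPSOIDS = {
    "WGS 84": (6378137.0, 298.257223563),
    "GRS 80": (6378137.0, 298.257222101),
    "Clarke 1866": (6378206.4, 294.9786982),
}

def semi_minor_axis(a, inv_f):
    """Semi-minor (polar) axis b = a * (1 - f), with f = 1 / inv_f."""
    return a * (1 - 1 / inv_f)

semi_minor = {name: semi_minor_axis(a, inv_f)
              for name, (a, inv_f) in ELLIPSOIDS.items()}

for name, b in semi_minor.items():
    print(f"{name}: b = {b:.3f} m")
```

WGS 84 and GRS 80 differ by well under a millimeter in b, while Clarke 1866 (the ellipsoid behind NAD27) differs by more than a hundred meters.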
- Lon/lat values based on different ellipsoids are not the same
- they use different grounding points
- it's important to not just call things lon/lat: you can have NAD27 lon/lat, NAD83 lon/lat, etc. Each will be subtly different
Datum
- An ellipsoid only models the overall shape of the earth
- after picking out an ellipsoid, you need to anchor it to use it for real-world navigation
- even if two reference systems use the same ellipsoid, they can still have different anchors, or datums, on earth
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3109003/pasted-from-clipboard.png)
- Defines the position of the spheroid relative to the center of the earth.
- Global datum:
- uses the earth's center of mass as the origin
- Local datum:
- aligns its spheroid to closely fit the earth's surface in a particular area
- a point on the surface of the spheroid is matched to a particular position on the surface of the earth
- the coordinate system origin of a local datum is not at the center of the earth
Coordinate Reference System
- A coordinate reference system is only one necessary ingredient of an SRS; it is not an SRS by itself
- used to identify a point on your reference ellipsoid
- The most popular coordinate reference system is the geographic coordinate system
- also known as geodetic coordinate system or simply lon/lat
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3108938/pasted-from-clipboard.png)
- Longitude and latitude
- Units: Degrees (DMS or DD)
Projection
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3109022/pasted-from-clipboard.png)
Taking an ellipsoidal earth and squashing it onto a flat surface
- Projection has distortion built in
- a geodetic model and a 3D globe are ellipsoidal, so by definition they do not describe a flat surface
- Why do we need to have 2D projections?
- the mathematical and visual simplicity that comes with planar (Euclidean) geometry
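As one concrete instance of "squashing" lon/lat onto a plane, here is a sketch of the forward spherical Mercator projection: x = R·λ and y = R·ln(tan(π/4 + φ/2)), with λ and φ in radians. Choosing Mercator and the radius R = 6378137 m (the value used by web maps) is an assumption made for illustration; the slides do not prescribe a formula.

```python
import math

def spherical_mercator(lon_deg, lat_deg, radius=6378137.0):
    """Project lon/lat in degrees to planar x/y in meters (spherical Mercator)."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y

# Hypothetical example point (roughly Gainesville, FL)
x, y = spherical_mercator(-82.32, 29.65)
print(round(x), round(y))
```

Once the point is in planar coordinates, distances and areas can be computed with ordinary Euclidean geometry, at the cost of the distortion discussed next.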
Distortion
- How exactly you squash an ellipsoidal earth on a flat surface depends on what you are trying to optimize for
- In creating a projection, we try to balance four conflicting features:
- measurement
- shape: how accurately does it represent angles
- direction: is north really north
- range of area supported
- E.g. if you want to span a large area, you have to either give up measurement accuracy or deal with the pain of maintaining multiple SRSs and some mechanisms to shift among them
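The measurement trade-off can be quantified for one familiar case. On a spherical Mercator map, lengths are stretched by a factor of 1/cos(latitude), so the error grows as the map covers higher latitudes. The sketch below tabulates that factor; using Mercator as the example and the function name `mercator_scale_factor` are assumptions for illustration.

```python
import math

def mercator_scale_factor(lat_deg):
    """Linear stretch of a spherical Mercator map at a given latitude."""
    return 1 / math.cos(math.radians(lat_deg))

scale_by_lat = {lat: mercator_scale_factor(lat) for lat in (0, 30, 45, 60, 80)}

for lat, k in scale_by_lat.items():
    # Areas are stretched by k squared
    print(f"lat {lat:>2}: length x{k:.2f}, area x{k * k:.2f}")
```

At 60 degrees latitude lengths are doubled and areas quadrupled, which is why a single projection spanning a large area gives up measurement accuracy.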
Projection Types
Cylindrical projections
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/4505831/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/4505833/pasted-from-clipboard.png)
Conic projections
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/4505835/pasted-from-clipboard.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/4505836/pasted-from-clipboard.png)
Azimuthal projections
Orientation of the paper roll around the globe
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/4505885/pasted-from-clipboard.png)
Main classes of planar coordinate systems
- Lambert Azimuthal Equal Area (LAEA)
- good for measurement and can cover large areas, but not great for shape
- US National Atlas (EPSG:2163)
- Lambert Conformal Conic (LCC)
- preserves shape more than area, is good for measurement within the region it serves, and distorts the poles
- best used for middle latitudes with east-west orientation
- Universal Transverse Mercator (UTM)
- good for measurement, shape, and direction, but each zone spans only a six-degree longitudinal strip, and UTM cannot be used for the polar regions
- Mercator
- good for preserving shape and direction and for spanning the globe, but not good for measurement
- a common favorite for web map display since we only need to maintain one SRID
- National grid systems
- variants of UTM or LAEA that are defined for a restricted region, such as a country
- State plane
- US spatial reference systems, usually designed for a specific state
- most are based on Transverse Mercator or Lambert Conformal Conic projections
Universal Transverse Mercator Coordinate System
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3109051/pasted-from-clipboard.png)
- World divided into 60 six-degree-wide zones
- From 80S to 84N
- Zones numbered 1-60 (N&S), W to E, starting at 180W
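The zone numbering rule above translates directly into arithmetic: zone = floor((lon + 180) / 6) + 1. The sketch below applies it and also builds the EPSG code for the WGS 84 UTM zone (326xx north, 327xx south). The function names are hypothetical, and the special-case zones around Norway and Svalbard are deliberately ignored.

```python
def utm_zone(lon_deg):
    """UTM zone number (1-60) for a longitude in degrees, -180 <= lon <= 180."""
    zone = int((lon_deg + 180) // 6) + 1
    return min(zone, 60)          # lon == 180 would otherwise give 61

def utm_epsg(lon_deg, lat_deg):
    """EPSG code of the WGS 84 / UTM zone covering a point."""
    if not -80 <= lat_deg <= 84:
        raise ValueError("UTM is defined only from 80S to 84N")
    base = 32600 if lat_deg >= 0 else 32700
    return base + utm_zone(lon_deg)

# Hypothetical example point (roughly Gainesville, FL)
zone = utm_zone(-82.32)
epsg = utm_epsg(-82.32, 29.65)
print(zone, epsg)
```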
Differences between projections
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/3109064/pasted-from-clipboard.png)
Spatial Reference System
- An SRS is the product of geodetics and cartography
- geodetics: the science of measuring and modeling the earth
- cartography: the science of representing the earth on flat maps
- Why do we need SRS?
- to bring in data from multiple sources and be able to overlay one atop another
- Many standards of SRS:
- most common one is the European Petroleum Survey Group (EPSG) numbering system
- take any two sources of data with the same EPSG number, and they will overlay perfectly
SRID
- Spatial Reference IDentifier
- It defines all the parameters of our data’s geographic coordinate system and projection.
- An SRID is convenient because it packs all the information about a map projection (which can be quite complex) into a single number.
- http://spatialreference.org/ref/epsg/4326/
- EPSG is a very recent SRS numbering system
- If you are using data from a few decades ago, you won't find an EPSG number
- The constituent pieces that form an SRS:
- ellipsoid
- datum
- projection
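One way to picture an SRID is as a key into a lookup table whose rows hold the ellipsoid, datum, and projection. The sketch below hand-codes four well-known EPSG entries to illustrate this. It is an illustrative table only, not a query against the EPSG registry, and the class and function names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SpatialReferenceSystem:
    srid: int
    name: str
    ellipsoid: str
    datum: str
    projection: Optional[str]   # None for unprojected lon/lat
    units: str

# Hand-entered illustrative subset of EPSG definitions
SRS_TABLE = {
    4326: SpatialReferenceSystem(4326, "WGS 84", "WGS 84", "WGS 84", None, "degree"),
    4269: SpatialReferenceSystem(4269, "NAD83", "GRS 80", "NAD83", None, "degree"),
    32617: SpatialReferenceSystem(32617, "WGS 84 / UTM zone 17N", "WGS 84",
                                  "WGS 84", "Transverse Mercator", "meter"),
    26917: SpatialReferenceSystem(26917, "NAD83 / UTM zone 17N", "GRS 80",
                                  "NAD83", "Transverse Mercator", "meter"),
}

def can_overlay_directly(srid_a, srid_b):
    """Two layers line up without reprojection only if their SRIDs match."""
    return srid_a == srid_b

for srs in SRS_TABLE.values():
    kind = "projected" if srs.projection else "geographic"
    print(srs.srid, srs.name, kind, srs.units)
```

Note that 32617 and 26917 share a projection but differ in datum and ellipsoid, so they are distinct SRSs.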
What spatial reference system is appropriate?
![](https://s3.amazonaws.com/media-p.slid.es/uploads/367769/images/4505935/pasted-from-clipboard.png)
- Excellent: covers the globe
- Good: covers a large country like the US; the measurements for the area served are usually within a meter for length, area, and distance calculations
- Medium: covers several degrees or a large state; measurements are accurate within meters, but can be as much as 10 meters off
- Bad: measurements don't have useful units
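The "Bad" case typically arises when distances are computed directly on lon/lat degrees, because a degree is not a fixed length. On a sphere, one degree of longitude spans about (π/180)·R·cos(latitude) meters. The sketch below shows how much that varies; the spherical approximation, the mean radius R = 6,371,000 m, and the function name are assumptions for illustration.

```python
import math

def meters_per_degree_lon(lat_deg, radius=6371000.0):
    """Approximate ground length of one degree of longitude on a sphere."""
    return math.radians(1.0) * radius * math.cos(math.radians(lat_deg))

lengths = {lat: meters_per_degree_lon(lat) for lat in (0, 30, 60)}

for lat, meters in lengths.items():
    print(f"lat {lat:>2}: 1 degree of longitude is about {meters / 1000:.1f} km")
```

A degree of longitude at 60 degrees latitude is half as long as at the equator, so a "distance" measured in degrees has no consistent physical meaning.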
Lab: Spatial Data
git pull
PHC7065-Spring2019-Lecture7
By Hui Hu
Slides for Lecture 7, Spring 2019, PHC7065 Critical Skills in Data Manipulation for Population Science