PHC6194 SPATIAL EPIDEMIOLOGY
Geocoding
Hui Hu Ph.D.
Department of Epidemiology
College of Public Health and Health Professions & College of Medicine
February 5, 2020
Geocoding
Lab: PostGIS TIGER Geocoder
Online Geocoding Service
Lab: Google Maps Geocoding API
Geocoding
- The process of transforming a description of a location—such as a pair of coordinates, an address, or a name of a place—to a location on the earth's surface
- An important step in spatial epidemiology
Address
2004 Mowry Road, 4th Floor, Gainesville, FL 32610
Components: street number (2004), street name (Mowry), street type (Road), unit (4th Floor), city (Gainesville), state (FL), ZIP code (32610)
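As a rough illustration of these components, the sketch below splits the example address on commas and whitespace. The function name `split_address` is hypothetical, and the code assumes the exact "street, unit, city, state ZIP" layout shown above; real addresses vary far more, which is why a dedicated normalizer is used later.

```python
def split_address(address):
    """Split a 'street, unit, city, state ZIP' address into labeled parts."""
    street, unit, city, state_zip = [part.strip() for part in address.split(",")]
    # In the street part, the first token is the number, the last token is
    # the street type, and everything in between is the street name.
    number, *name, street_type = street.split()
    state, zip_code = state_zip.split()
    return {
        "street_number": number,
        "street_name": " ".join(name),
        "street_type": street_type,
        "unit": unit,
        "city": city,
        "state": state,
        "zip": zip_code,
    }


components = split_address("2004 Mowry Road, 4th Floor, Gainesville, FL 32610")
for label, value in components.items():
    print(label, "=", value)
```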
Source of Reference Map
- Topologically Integrated Geographic Encoding and Referencing (TIGER)
- developed by the US Census Bureau
- free, nationwide data
- TIGER includes key features of geographic interest across the entire US:
- political boundaries
- lakes
- reservations
- major and minor roads, rivers, etc.
Enabling TIGER Geocoder within PostGIS
sudo -u postgres psql -c "CREATE EXTENSION fuzzystrmatch; CREATE EXTENSION postgis_tiger_geocoder;" phc6194spr18db
- The geocoder relies on fuzzy string matching to find streets with similar spellings
- The matching functions come from the fuzzystrmatch extension, so it is created before postgis_tiger_geocoder
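To give a feel for what "similar spellings" means, here is a minimal pure-Python sketch of Levenshtein edit distance, one of the measures that fuzzystrmatch exposes. It is an illustration only, not the geocoder's actual matching logic; the helper `closest_street` and the list of street names are made up for the example.

```python
def levenshtein(a, b):
    """Minimum number of single-character inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep)
            ))
        previous = current
    return previous[-1]


def closest_street(query, candidates):
    """Return the candidate with the smallest edit distance to the query."""
    return min(candidates, key=lambda name: levenshtein(query.lower(), name.lower()))


streets = ["Mowry", "Newell", "Museum", "Center"]  # hypothetical reference names
best = closest_street("Mowrey", streets)           # misspelled input
print(best, levenshtein("Mowrey", best))
```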
sudo -u postgres psql -c "GRANT USAGE ON SCHEMA tiger TO PUBLIC; GRANT USAGE ON SCHEMA tiger_data TO PUBLIC;
GRANT SELECT, REFERENCES, TRIGGER ON ALL TABLES IN SCHEMA tiger TO PUBLIC; GRANT SELECT, REFERENCES, TRIGGER ON ALL TABLES IN SCHEMA tiger_data TO PUBLIC;
GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA tiger TO PUBLIC; ALTER DEFAULT PRIVILEGES IN SCHEMA tiger_data
GRANT SELECT, REFERENCES ON TABLES TO PUBLIC;" phc6194spr18db
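- Grants every database user (PUBLIC) access to the tables in the tiger and tiger_data schemas and permission to run the geocoder functions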
Loading TIGER data
- PostGIS has several built-in functions that generate scripts to download, decompress, and load TIGER data directly into the database
- Steps:
- create a folder to store all the TIGER data
- generate scripts to download, decompress, and load data
- execute the scripts
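The three steps above can be sketched as a short list of shell commands, built here in Python so they can be inspected before running. `Loader_Generate_Nation_Script` and `Loader_Generate_Script` are the PostGIS script generators; the folder `/gisdata`, the output file names, and the choice of Florida are assumptions for illustration, and the folder should match whatever staging directory the loader is configured to use.

```python
import shlex


def loader_commands(database, states, data_dir="/gisdata"):
    """Return shell commands for: create folder, generate scripts, run scripts."""
    state_list = ", ".join("'%s'" % s for s in states)
    nation_sql = "SELECT Loader_Generate_Nation_Script('sh')"
    state_sql = "SELECT Loader_Generate_Script(ARRAY[%s], 'sh')" % state_list
    nation_sh = data_dir + "/nation_script_load.sh"  # assumed file name
    state_sh = data_dir + "/state_script_load.sh"    # assumed file name
    db = shlex.quote(database)
    # The SQL holds only single quotes, so wrapping it in double quotes is safe.
    return [
        "mkdir -p " + shlex.quote(data_dir),
        'psql -d %s -tA -c "%s" > %s' % (db, nation_sql, nation_sh),
        'psql -d %s -tA -c "%s" > %s' % (db, state_sql, state_sh),
        "sh " + nation_sh,
        "sh " + state_sh,
    ]


commands = loader_commands("phc6194spr18db", ["FL"])
for command in commands:
    print(command)
```

The nation script is run before the state script because the state and county lookup tables it loads are needed by the state-level data.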
Normalizing Addresses
- Address standardization / normalization:
- A preparatory step before geocoding that parses the address into components such as street number, directional prefix, street name, street type, suffix, etc.
- Normalizers:
- normalize_address
- pagc_normalize_address
- postal address geocoder
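A minimal sketch of calling either normalizer from Python follows. It assumes a DB-API cursor that uses `%s` placeholders (as psycopg2 does), that the tiger schema is on the search path, and that the address_standardizer extension is installed when the PAGC variant is requested. The wrapper name `normalize` is hypothetical; the column names are fields of the `norm_addy` type that both functions return.

```python
NORM_FIELDS = (
    "address", "predirabbrev", "streetname", "streettypeabbrev",
    "postdirabbrev", "internal", "location", "stateabbrev", "zip",
)


def normalize(cursor, address, use_pagc=False):
    """Run a normalizer on one address and return its components as a dict."""
    func = "pagc_normalize_address" if use_pagc else "normalize_address"
    columns = ", ".join("na." + field for field in NORM_FIELDS)
    # The function name comes from the fixed choice above, never from user
    # input; the address itself is passed as a bound parameter.
    query = "SELECT %s FROM %s(%%s) AS na" % (columns, func)
    cursor.execute(query, (address,))
    row = cursor.fetchone()
    return dict(zip(NORM_FIELDS, row)) if row else None


# Usage, assuming an open connection `conn` to the course database:
# with conn.cursor() as cur:
#     print(normalize(cur, "2004 Mowry Road, Gainesville, FL 32610"))
```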
Geocoding
- Geocoding using address text:
- uses the normalize_address function to normalize the address by default
- can also switch to pagc_normalize_address
- Geocoding using normalized addresses
- Batch geocoding
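As a hedged sketch of geocoding from address text, the function below calls the geocoder's `geocode` function once per address and collects the rating (lower means a better match), coordinates, and matched address. It assumes a DB-API cursor with `%s` placeholders and the tiger schema on the search path; `geocode_addresses` is a hypothetical helper, and the plain Python loop stands in for batch geocoding rather than the set-based SQL update one would normally write for large tables.

```python
GEOCODE_SQL = """
SELECT g.rating,
       ST_X(g.geomout) AS lon,
       ST_Y(g.geomout) AS lat,
       pprint_addy(g.addy) AS matched_address
FROM geocode(%s, %s) AS g
ORDER BY g.rating
"""


def geocode_addresses(cursor, addresses, max_results=1):
    """Geocode each address text; return {address: [(rating, lon, lat, match), ...]}."""
    results = {}
    for address in addresses:
        cursor.execute(GEOCODE_SQL, (address, max_results))
        results[address] = cursor.fetchall()
    return results


# Usage, assuming an open connection `conn` to the course database:
# with conn.cursor() as cur:
#     hits = geocode_addresses(cur, ["2004 Mowry Road, Gainesville, FL 32610"])
#     print(hits)
```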
Lab: PostGIS TIGER Geocoder
git pull
Online Geocoding Service
Application Programming Interface
- An application programming interface (API) is a set of subroutine definitions, protocols, and tools for building application software
- In general terms, it's a set of clearly defined methods of communication between various software components
- Common web service technologies:
- SOAP - Simple Object Access Protocol
- REST - Representational State Transfer
Google Geocoding API
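A minimal sketch of a request to the Google Geocoding API using only the Python standard library: the address and API key go in the query string of the JSON endpoint, and the coordinates sit under `results[0].geometry.location` in the response. The function names are hypothetical, `YOUR_API_KEY` is a placeholder, and the network call is left commented out so the block runs without a key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key):
    """Build the request URL; urlencode escapes spaces and commas."""
    return GEOCODE_ENDPOINT + "?" + urlencode({"address": address, "key": api_key})


def extract_location(payload):
    """Return (lat, lng) of the first result, or None if the status is not OK."""
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


def geocode(address, api_key):
    """Send one geocoding request and return (lat, lng) or None."""
    with urlopen(build_geocode_url(address, api_key)) as response:
        return extract_location(json.load(response))


url = build_geocode_url("2004 Mowry Road, Gainesville, FL 32610", "YOUR_API_KEY")
print(url)
# print(geocode("2004 Mowry Road, Gainesville, FL 32610", "YOUR_API_KEY"))
```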
Security and Rate Limiting
- The data provided by these APIs is usually valuable
- The data providers might
- limit the number of requests per day,
- or demand an API "key",
- or charge for usage
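One common way for a client to live within such limits is to wait and retry when the service reports that the quota was exceeded. The sketch below is a generic helper, not part of any Google library: `call_with_backoff`, its parameters, and the doubling delay are assumptions for illustration. The status string `OVER_QUERY_LIMIT` in the usage comment is the one the Geocoding API returns when a quota is exceeded.

```python
import time


def call_with_backoff(request, is_rate_limited, max_attempts=4,
                      base_delay=1.0, sleep=time.sleep):
    """Call request(); while the response is rate-limited, wait and retry.

    The delay doubles after each rate-limited attempt. The last response is
    returned even if it is still rate-limited, so the caller can inspect it.
    """
    delay = base_delay
    response = None
    for attempt in range(max_attempts):
        response = request()
        if not is_rate_limited(response):
            return response
        if attempt < max_attempts - 1:
            sleep(delay)
            delay *= 2
    return response


# Usage, with a hypothetical fetch_json(url) that returns the parsed response:
# result = call_with_backoff(
#     lambda: fetch_json(url),
#     lambda payload: payload.get("status") == "OVER_QUERY_LIMIT",
# )
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the requested delays instead of actually waiting.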
Lab: Google Maps Geocoding API
Increase the limit here: e.g. 1000
Slides for Lecture 5, Spring 2020, PHC6194 Spatial Epidemiology