[Civic] Data Engineering

 

Building a system to track Milwaukee evictions.

 

Branden DuPont

Medical College of Wisconsin

 

Datashare @ MCW IHE = Local IDS

Eviction as Cause, Not Just Consequence of Poverty

  • Displacement
  • Affects education outcomes for children
  • Housing conditions (worsened by cycle of eviction)

  • Among many others

Milwaukee Policymakers, Researchers, & Community Members Need Eviction Data

  • Difficult to understand drivers of eviction and design prevention strategies
  • Community unable to view a property's eviction record
  • Conducting additional research is prohibitive

Consider a Basic Question on Eviction

  • "He'll evict you in a minute" - Milwaukee Journal Sentinel on Youssef "Joe" Berrada's Use of Eviction
  • Does Berrada have a high eviction filing rate, or does he just own a large number of properties?

Data Needed To Construct an Eviction Filing Rate

  • Court records on evictions (small claims cases)
  • Address where the eviction took place
  • Taxkey information for that address
  • The number of units at that property
  • The primary land use of the property (multi-family or single-/two-family)

ETL: A Common Pattern

Dave Guarino, ETL for America

  • Extract: getting data out of [many] systems where it is stored

  • Transform: reformatting and reshaping the data in ways that make it usable, including creating meaningful variables (e.g., the eviction filing rate)

  • Load: putting the transformed data into another system, generally something where analyzing it or combining it with other data is easy for end-users
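
The three stages above can be sketched as three small functions. This is only an illustration of the pattern, not the pipeline used in this project: the field names (`taxkey`, `filings`, `units`), the sample record, and the in-memory `warehouse` dict are all hypothetical.

```python
import json


def extract(raw_sources):
    """Extract: parse raw JSON strings pulled from each source system."""
    return [json.loads(raw) for raw in raw_sources]


def transform(records):
    """Transform: reshape records and derive a meaningful variable."""
    rows = []
    for rec in records:
        units = rec["units"]
        # Guard against properties with no recorded units.
        rate = rec["filings"] / units if units else None
        rows.append({"taxkey": rec["taxkey"], "filing_rate": rate})
    return rows


def load(rows, store):
    """Load: write rows somewhere easy to query (here, a plain dict)."""
    for row in rows:
        store[row["taxkey"]] = row
    return store


# Hypothetical sample record, for illustration only.
raw = ['{"taxkey": "A1", "filings": 3, "units": 12}']
warehouse = load(transform(extract(raw)), {})
```

In a real system each stage would talk to a different store (court API, database, analytics layer), but keeping the stages separate is what makes each one testable on its own.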

ETL: a hard ^#&@ing problem

Dave Guarino, ETL for America

  • "Many of the problems government confronts with technology are fundamentally about data integration"
  • Or, there is a reason no one was tracking evictions before

Foundational Work Comes First

Source: Monica Rogati’s “The AI Hierarchy of Needs”

Getting Eviction Data

Got Eviction Data:

Deeply Nested JSON

  • Data needs to be parsed into 8 tables
  • The entire case must be re-pulled after any slight change
  • Retention rules: data needs to be stored and deduplicated
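
A minimal sketch of the parsing and deduplication described above, assuming a hypothetical case structure with `case_number`, `filed`, `parties`, and `events` keys (the real court JSON and its 8 tables are not shown here, so only three illustrative tables appear). Because a whole case is re-pulled on any change, the sketch replaces all rows for that case rather than appending.

```python
def flatten_case(case):
    """Split one nested case record into rows for separate tables."""
    case_no = case["case_number"]
    case_row = {"case_number": case_no, "filed": case["filed"]}
    party_rows = [
        {"case_number": case_no, "name": p["name"], "role": p["role"]}
        for p in case.get("parties", [])
    ]
    event_rows = [
        {"case_number": case_no, "date": e["date"], "description": e["description"]}
        for e in case.get("events", [])
    ]
    return case_row, party_rows, event_rows


def upsert_case(tables, case):
    """Replace every row for a case so a re-pull never leaves duplicates."""
    case_row, party_rows, event_rows = flatten_case(case)
    case_no = case_row["case_number"]
    tables["cases"][case_no] = case_row
    tables["parties"][case_no] = party_rows
    tables["events"][case_no] = event_rows


# Hypothetical data: the same case pulled twice, the second time with a new event.
tables = {"cases": {}, "parties": {}, "events": {}}
pull_1 = {
    "case_number": "X-001",
    "filed": "2019-01-02",
    "parties": [{"name": "Tenant A", "role": "defendant"}],
    "events": [{"date": "2019-01-02", "description": "Filed"}],
}
pull_2 = dict(
    pull_1,
    events=pull_1["events"] + [{"date": "2019-01-20", "description": "Hearing"}],
)
upsert_case(tables, pull_1)
upsert_case(tables, pull_2)
```

Keying every table on the case number is one simple way to satisfy the "store and deduplicate" requirement; a database would do the same with a delete-then-insert or an upsert inside one transaction.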

Get Eviction Location

  • The defendant's earliest address in a case is used to represent the eviction location
  • Check whether an address was updated during a case; extract addresses from court events
  • Remove PO boxes and out-of-country addresses if another defendant address is available
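
The selection rules above can be sketched as follows. This assumes each candidate address (including any extracted from court events) has already been collected into a dict with hypothetical `street`, `date` (ISO format), and optional `country` keys; the PO box pattern is a simple illustration, not the project's actual rule.

```python
import re

# Matches "PO Box", "P.O. Box", "P O BOX", etc. (illustrative pattern only).
PO_BOX = re.compile(r"\bP\.?\s*O\.?\s*BOX\b", re.IGNORECASE)


def is_usable(addr):
    """Usable means: not a PO box and located in the US (assumed 'country' field)."""
    return not PO_BOX.search(addr["street"]) and addr.get("country", "US") == "US"


def eviction_location(addresses):
    """Pick the earliest usable defendant address.

    Falls back to the earliest address overall when no usable one exists,
    since PO boxes are dropped only if another address is available.
    """
    if not addresses:
        return None
    ordered = sorted(addresses, key=lambda a: a["date"])  # ISO dates sort as text
    usable = [a for a in ordered if is_usable(a)]
    return (usable or ordered)[0]


# Hypothetical defendant addresses for one case.
candidates = [
    {"street": "PO Box 12", "date": "2019-01-01"},
    {"street": "123 N Main St", "date": "2019-02-01"},
    {"street": "456 W Oak Ave", "date": "2019-03-01"},
]
location = eviction_location(candidates)
```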

Match an Eviction Record to an Address

  • Defendant address must be formatted exactly as it appears in the city's address data
  • Extensive series of steps to clean addresses
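
To give a flavor of what such cleaning steps look like, here is a small sketch of address normalization. The abbreviation tables and the unit-stripping rule are assumptions chosen for illustration; the actual series of steps is far more extensive and is not reproduced here.

```python
import re

# Illustrative abbreviation tables; a real pipeline would cover many more cases
# and follow whatever convention the city's address data uses.
DIRECTIONS = {"NORTH": "N", "SOUTH": "S", "EAST": "E", "WEST": "W"}
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "BOULEVARD": "BLVD",
            "DRIVE": "DR", "ROAD": "RD"}

# Trailing unit designators such as "APT 4", "UNIT B", or "#4".
UNIT = re.compile(r"\s+(?:(?:APT|UNIT|STE)\b|#)\s*[\w-]+$")


def normalize_address(raw):
    """Uppercase, strip punctuation and unit numbers, and abbreviate common words."""
    text = raw.upper()
    text = re.sub(r"[.,]", " ", text)          # drop periods and commas
    text = re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace
    text = UNIT.sub("", text)                  # remove a trailing unit designator
    words = [DIRECTIONS.get(w, SUFFIXES.get(w, w)) for w in text.split()]
    return " ".join(words)


# Two differently written versions of the same hypothetical address.
a = normalize_address("123 North Main Street, Apt. 4")
b = normalize_address("123 n. main st #4")
```

Applying the same function to both the court addresses and the city addresses is what makes an exact-match join possible afterwards.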

Match Eviction Address To Taxkey

  • 94.8% match rate, compared to a 93% match rate for Eviction Lab
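
A match rate like the one above is the share of eviction addresses that find a taxkey. A minimal sketch, assuming addresses have already been cleaned and that the city data is available as a hypothetical address-to-taxkey dict (the taxkeys and addresses below are made up):

```python
def match_to_taxkey(addresses, taxkey_by_address):
    """Exact-match cleaned eviction addresses against an address-to-taxkey lookup.

    Returns the (address, taxkey or None) pairs and the share that matched.
    """
    pairs = [(addr, taxkey_by_address.get(addr)) for addr in addresses]
    matched = sum(1 for _, taxkey in pairs if taxkey is not None)
    rate = matched / len(pairs) if pairs else 0.0
    return pairs, rate


# Hypothetical lookup and eviction addresses.
lookup = {"123 N MAIN ST": "T-1", "456 W OAK AVE": "T-2"}
evictions = ["123 N MAIN ST", "456 W OAK AVE", "789 S ELM ST", "123 N MAIN ST"]
pairs, match_rate = match_to_taxkey(evictions, lookup)
```

Tracking the unmatched addresses (those paired with `None`) is useful in practice, since they show which cleaning rules are still missing.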

Grab City Property Data

  • Bring in data on primary land use, unit size, and ownership
  • Easy and painless thanks to the Open Data Portal

Calculate Eviction Filing Rate

  • How many evictions were filed compared to the number of units
  • To get a long-term trend, calculate an average over a three-year period
  • Visualize Berrada's properties
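
The calculation above can be sketched as filings divided by units, averaged across years. This assumes the unit count is constant over the period and that years with no recorded filings count as zero; both are simplifying assumptions, and the numbers are invented for illustration.

```python
def filing_rate(filings, units):
    """Eviction filings per rental unit for one year; None if units is unknown."""
    if units <= 0:
        return None
    return filings / units


def average_filing_rate(filings_by_year, units, years):
    """Mean of the yearly filing rates over the given years."""
    rates = [filing_rate(filings_by_year.get(year, 0), units) for year in years]
    if not rates or any(rate is None for rate in rates):
        return None
    return sum(rates) / len(rates)


# Hypothetical property with 8 units and three years of filings.
filings_by_year = {2016: 4, 2017: 6, 2018: 2}
three_year_rate = average_filing_rate(filings_by_year, 8, [2016, 2017, 2018])
```

Normalizing by units is what separates the two readings of the Berrada question: a large owner will have many filings in absolute terms, but only a high rate indicates unusually frequent use of eviction.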

Get Eviction Filing Rate

Imagine:

Repeating Every Single Step Again Next Week

  • Desmond's Eviction Lab ~ 3 years old
  • Difficult step of pulling, cleaning, and performing record-linkage

Track Milwaukee Evictions

  • Civic data engineering automates this complexity
  • Provide useful information to the public, policymakers, and other researchers

Data Day 2019-MCW

By Branden DuPont
