London house prices: get rich or default trying

(Jonas, Leto, Javier, Mateusz, Naomi, TASSOS)

Data

Prices of all houses sold in five London districts: 1995 - 2014

Objective

  • Mapping neighbourhoods of desirability over time
  • Is the road network more predictive than a distance network?
  • How do other factors help predict prices (e.g., crime, schools, transport, winter workshop locations)?

Spatial model

K-means

price (red = higher)

what are the nodes?

what are the edges?

Postcodes  (Street segments)

Shortest paths through roads
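
A minimal sketch of what "shortest paths through roads" could mean in practice, assuming the road network is held as a weighted adjacency list with edge lengths in metres. The graph layout and the function name `shortest_path_length` are illustrative assumptions, not taken from the project code; the algorithm shown is plain Dijkstra.

```python
import heapq


def shortest_path_length(graph, source, target):
    """Dijkstra on a dict: node -> list of (neighbour, length_in_metres).

    Returns the length of the shortest road path, or infinity if unreachable.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path to u was already found
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


# Hypothetical toy network: three junctions with road lengths in metres.
toy_roads = {
    "a": [("b", 50.0), ("c", 200.0)],
    "b": [("a", 50.0), ("c", 60.0)],
    "c": [("a", 200.0), ("b", 60.0)],
}
road_distance = shortest_path_length(toy_roads, "a", "c")
```

In this toy network the road distance from a to c goes via b, which is the kind of difference from straight-line distance that the network model is meant to capture.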

ROAD NETWORK

challenges

  • Houses are identified by postcode
  • Only the GPS centroid of the postcode is known
    • → Not mapped to the street network
  • ~20,000 postcodes → 400 million possible ordered pairs (~200 million unordered)

Blue: Postcode centroids. Red: Imputed postcode position in street
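
One simple way to impute a postcode's position on a street is to project its centroid onto the nearest street segment. The sketch below assumes planar coordinates in metres (for example after projecting from latitude/longitude) and straight-line segments; `project_onto_segment` and `snap_to_street` are hypothetical helper names, and this is not necessarily the method used to produce the figure.

```python
from math import hypot


def project_onto_segment(p, a, b):
    """Return the point on segment a-b that is closest to p."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return (ax, ay)  # degenerate segment: both endpoints coincide
    # Position of the projection along the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_street(centroid, segments):
    """Return (distance, snapped_point, segment) for the nearest segment."""
    best = None
    for a, b in segments:
        q = project_onto_segment(centroid, a, b)
        d = hypot(q[0] - centroid[0], q[1] - centroid[1])
        if best is None or d < best[0]:
            best = (d, q, (a, b))
    return best


# Hypothetical example: two street segments and one postcode centroid.
streets = [((0.0, 0.0), (10.0, 0.0)), ((0.0, 5.0), (0.0, 15.0))]
distance, snapped, street = snap_to_street((3.0, 2.0), streets)
```

A linear scan over all segments is fine for a sketch; with ~20,000 postcodes a spatial index would likely be needed.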

map matching

Find if two postcodes are adjacent:

  1. Identify potential candidates: Find all postcodes within 100 metres of each other → 70,000 pairs (7,000 postcodes)
  2. Use OSM to trace the route between each pair of postcodes
  3. Discard a route if it contains the route of another candidate pair as a subroute (those two postcodes are not directly adjacent) → 20,000 pairs
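
The adjacency steps above can be sketched as follows, under several assumptions: postcode positions are planar coordinates in metres, each route is already available as a list of road-node ids (the OSM routing itself is not shown), and "subroute" is read as one route appearing as a contiguous run inside another. The names `candidate_pairs`, `contains_subroute` and `prune_routes` are hypothetical.

```python
from collections import defaultdict
from math import floor, hypot


def candidate_pairs(positions, radius=100.0):
    """Step 1: pairs of postcodes within `radius` metres of each other.

    Uses a grid of radius-sized cells so that only neighbouring cells are
    compared, instead of all possible pairs.
    """
    grid = defaultdict(list)
    for pc, (x, y) in positions.items():
        grid[(floor(x / radius), floor(y / radius))].append(pc)
    pairs = set()
    for (cx, cy), members in grid.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for a in members:
                    for b in grid.get((cx + dx, cy + dy), ()):
                        if a < b:
                            (ax, ay), (bx, by) = positions[a], positions[b]
                            if hypot(ax - bx, ay - by) <= radius:
                                pairs.add((a, b))
    return pairs


def contains_subroute(route, other):
    """True if `other` is a strictly shorter contiguous run inside `route`."""
    n, m = len(route), len(other)
    if m >= n:
        return False
    rev = other[::-1]
    return any(route[i:i + m] in (other, rev) for i in range(n - m + 1))


def prune_routes(routes):
    """Step 3: keep only routes that contain no other candidate route."""
    return {
        pair: r
        for pair, r in routes.items()
        if not any(contains_subroute(r, o) for p, o in routes.items() if p != pair)
    }


# Hypothetical data: A, B, C lie along one street; D is far away.
positions = {"A": (0.0, 0.0), "B": (40.0, 0.0), "C": (80.0, 0.0), "D": (500.0, 500.0)}
pairs = candidate_pairs(positions)
# Step 2 (routing over OSM) is assumed done; node ids are made up.
routes = {("A", "B"): [1, 2], ("B", "C"): [2, 3], ("A", "C"): [1, 2, 3]}
adjacent = prune_routes(routes)
```

Here the A to C route passes along the A to B route, so A and C are not treated as adjacent, while A-B and B-C are kept. The pairwise comparison in `prune_routes` is quadratic in the number of routes and would need indexing at the scale of 70,000 pairs.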

Postcodes

Road networks

Route between postcodes

map matching

Road network (OSM)

Imputed postcode network

network-based model

hypothesis testing:

  • Does the street network provide more information than spatial information alone?
YES (we hope)

NEXT STEPS

Develop models

london_housing

By Javier GB
