Machines Reading Maps

"This is unreasonably cool, and so visually pleasing, it nearly tickles. It also feels like I’m looking at one of those moments where everything changes."

Machines Reading Maps Summit, April 20-21, 2023

  • Invite-only Stakeholder Introduction to MRM
  • Public Introduction to MRM
  • Invite-only Community-building discussion
  • MRM/Recogito Workshop

What is the goal?

The goal of MRM is to:

  • Create a generalizable ML pipeline that uses human collaboration to:
    • Process printed text on scanned maps

    • Enrich the printed text

    • Convert the printed text to structured data

  • Make scanned historical map content easily searchable through complex queries
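
As a rough illustration of what "structured data" and "complex queries" could mean here, the sketch below stores each piece of map text as a record and filters by both text and location. The `MapLabel` schema, the `search` helper, and the sample values are assumptions made up for this example, not MRM's actual data model.

```python
from dataclasses import dataclass


@dataclass
class MapLabel:
    """One piece of printed text found on a scanned map (hypothetical schema)."""
    text: str     # transcribed label text
    map_id: str   # identifier of the scanned map sheet
    lon: float    # longitude of the label
    lat: float    # latitude of the label


def search(labels, contains, bbox):
    """Return labels whose text contains `contains` (case-insensitive)
    and whose position falls inside bbox = (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    needle = contains.lower()
    return [
        label for label in labels
        if needle in label.text.lower()
        and min_lon <= label.lon <= max_lon
        and min_lat <= label.lat <= max_lat
    ]


# Made-up sample records, for illustration only.
labels = [
    MapLabel("Mill Creek", "sheet-001", -122.10, 37.40),
    MapLabel("Old Mill Road", "sheet-002", -75.20, 40.00),
    MapLabel("Church Street", "sheet-001", -122.11, 37.41),
]

# "Every label mentioning 'mill' inside this bounding box."
hits = search(labels, "mill", (-123.0, 37.0, -122.0, 38.0))
```

A metadata-only search could not answer this kind of question, because the label text and its position live inside the map image rather than in the catalog record.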

How does MRM work?

Image Cropping

Text Spotting

  • MRM uses TExt Spotting TRansformers (TESTR), a generic end-to-end text spotting framework that uses Transformers for text detection and recognition in the wild.
  • TESTR is particularly effective on curved text, where traditional bounding-box representations need special care to adapt.
  • Unlike OCR, which is adept at extracting separate words, MRM is interested in full location phrases.
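
The point about curved text can be made concrete with a little geometry. The sketch below is not TESTR code; it compares a made-up polygon outline of an arched label with its axis-aligned bounding box to show how much of the box is not text. The coordinates and function names are assumptions for illustration.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def bounding_box(points):
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# Made-up outline of an arched label: one edge left to right,
# then the opposite edge right to left.
curved_label = [
    (0, 4), (10, 10), (20, 12), (30, 10), (40, 4),
    (40, 0), (30, 6), (20, 8), (10, 6), (0, 0),
]

min_x, min_y, max_x, max_y = bounding_box(curved_label)
box_area = (max_x - min_x) * (max_y - min_y)
label_area = polygon_area(curved_label)
fill_ratio = label_area / box_area  # share of the box actually covered by the label
```

In this toy case the label covers only a third of its bounding box, so a box-based crop would pull in a lot of neighbouring map content along with the text.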

Human Annotations

SynMap+ Training Data

  • Use a generative adversarial style transfer network (CycleGAN) to convert an OpenStreetMap (OSM) image to the historical style.
  • Assign the font, style, and placement strategy according to the underlying geographical features.
  • Use "rule-based" labeling from the QGIS PAL API to place the text labels on the synthetic map background.
  • Automatically generate the polygon, centerline, and local height annotations for the text labels.
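
To show how the three annotation types relate to one another, here is a small geometric sketch that rebuilds a polygon from a centerline and per-point local heights. It is an illustration under simple assumptions (the text extends half the local height on each side of the centerline), not the SynMap+ annotation procedure; `polygon_from_centerline` is a hypothetical helper.

```python
import math


def polygon_from_centerline(centerline, heights):
    """Build a closed polygon outline around a text centerline.

    centerline: list of (x, y) points along the middle of the text.
    heights:    local text height at each centerline point.
    """
    if len(centerline) < 2 or len(centerline) != len(heights):
        raise ValueError("need at least two points and one height per point")

    left, right = [], []
    last = len(centerline) - 1
    for i, ((x, y), h) in enumerate(zip(centerline, heights)):
        # Estimate the local text direction from the neighbouring points.
        x0, y0 = centerline[max(i - 1, 0)]
        x1, y1 = centerline[min(i + 1, last)]
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            raise ValueError("repeated centerline points")
        # Unit normal, perpendicular to the local direction.
        nx, ny = -dy / length, dx / length
        left.append((x + nx * h / 2, y + ny * h / 2))
        right.append((x - nx * h / 2, y - ny * h / 2))

    # Walk one side forward and the other side back to close the outline.
    return left + right[::-1]


# A straight, horizontal label of constant height 4.
outline = polygon_from_centerline([(0, 0), (10, 0), (20, 0)], [4, 4, 4])
```

For a curved label the same function bends the outline along the centerline, which is the kind of annotation a polygon-based text spotter can be trained on.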

SynthText Training Data

Patch Text Spotter

Patch to Map Merging
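
A minimal sketch of the idea behind merging patch results back into the full map, assuming the map image is cropped into square patches on a regular grid and that each patch reports text polygons in its own local pixel coordinates. The patch size, stride, and function names are assumptions, and the sketch does not handle merging duplicate detections in overlapping patches.

```python
def patch_origins(map_w, map_h, patch, stride):
    """Top-left corners of square patches that cover a map_w x map_h image."""
    xs = list(range(0, max(map_w - patch, 0) + 1, stride))
    ys = list(range(0, max(map_h - patch, 0) + 1, stride))
    # Add a final patch flush with the right/bottom edge if the grid falls short.
    if xs[-1] + patch < map_w:
        xs.append(map_w - patch)
    if ys[-1] + patch < map_h:
        ys.append(map_h - patch)
    return [(x, y) for y in ys for x in xs]


def to_map_coords(polygon, origin):
    """Shift a polygon from patch-local pixels to full-map pixels."""
    ox, oy = origin
    return [(x + ox, y + oy) for x, y in polygon]


# A 2500 x 1000 pixel map cut into 1000-pixel patches.
origins = patch_origins(2500, 1000, patch=1000, stride=1000)

# A detection reported inside the last patch, in that patch's coordinates.
local_polygon = [(10, 20), (30, 20), (30, 40), (10, 40)]
map_polygon = to_map_coords(local_polygon, origins[-1])
```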

Coordinate Converter
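
One common way to convert pixel positions on a georeferenced image to map coordinates is a six-parameter affine transform in the GDAL geotransform convention. The sketch below assumes the scanned map's georeferencing can be expressed that way; maps georeferenced with ground control points or higher-order transforms would need a different conversion. The transform values are made up.

```python
def pixel_to_geo(col, row, gt):
    """Convert a pixel (col, row) to map coordinates.

    gt follows the GDAL geotransform layout:
    (origin_x, pixel_width, row_rotation, origin_y, col_rotation, pixel_height)
    """
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def polygon_to_geo(polygon, gt):
    """Convert every vertex of a pixel-space polygon to map coordinates."""
    return [pixel_to_geo(col, row, gt) for col, row in polygon]


# Made-up north-up transform: 2 map units per pixel, y decreasing downward.
geotransform = (500000.0, 2.0, 0.0, 4200000.0, 0.0, -2.0)

point = pixel_to_geo(100, 200, geotransform)
geo_polygon = polygon_to_geo([(0, 0), (100, 0), (100, 200)], geotransform)
```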

PostOCR & Entity Linker
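
As a simplified picture of what post-OCR correction and entity linking aim to do, the sketch below fuzzy-matches a spotted string against a tiny gazetteer using Python's standard `difflib`. The gazetteer entries, the placeholder identifiers, and the `link_place` helper are assumptions for illustration; this is not MRM's actual correction or linking method.

```python
import difflib

# Toy gazetteer mapping place names to placeholder identifiers.
GAZETTEER = {
    "Johannesburg": "example:1",
    "Pretoria": "example:2",
    "Durban": "example:3",
}


def link_place(spotted_text, gazetteer, cutoff=0.8):
    """Return (corrected_name, identifier) for the closest gazetteer entry,
    or None when nothing is similar enough."""
    matches = difflib.get_close_matches(
        spotted_text, list(gazetteer), n=1, cutoff=cutoff
    )
    if not matches:
        return None
    name = matches[0]
    return name, gazetteer[name]


linked = link_place("Pretorla", GAZETTEER)  # "i" misread as "l"
unlinked = link_place("Xyz", GAZETTEER)     # nothing close enough
```

The cutoff trades recall for precision: a lower value links more misread labels but also risks attaching the wrong place to a label.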

Next Steps: Build Community

Next Steps: Build Software

"This past summer, a CESTA intern georeferenced 100+ [19th century] South African maps (at farm level). It would be great to harvest the place names and make them digitally searchable. Is that something we could discuss please?"

-Grant Parker, Classics Dept.

Why let Machines Read Maps?

Metadata search is insufficient

The infrastructure that serves scanned maps is well suited to this work

There are millions of scanned maps available, publicly, now.

Modern data is insufficient

Existing spatial data sources contain only information about the present (modern place names), and even those are incomplete...

Source: GNS, National Geospatial-Intelligence Agency

More Info:

Who

Machines Reading Maps (MRM) is a collaborative project of

The project is funded by the

The Data