Script Workflow for Automating Historical Street Address Recognition in Python


"Tray" (Python list):

Option 1: "Pin" down street names first by identifying strings that contain no digits, or digits followed by an ordinal suffix such as 'th' or 'st' (12th, 31st).
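A minimal sketch of the pinning test in Option 1, assuming the tray is a list of whitespace-separated tokens. The function name `is_street_candidate`, the regular expression, and the sample tray are illustrative assumptions, not taken from the original script.

```python
import re

# Assumed pattern: digits followed by an ordinal suffix (12th, 31st, 2nd, 3rd).
ORDINAL = re.compile(r"^\d+(st|nd|rd|th)$", re.IGNORECASE)


def is_street_candidate(token):
    """Return True if the token has no digits or is an ordinal such as '12th'."""
    if ORDINAL.match(token):
        return True
    return not any(ch.isdigit() for ch in token)


# Hypothetical tray, for illustration only.
tray = ["214", "E.", "12th", "and", "31", "Bowery"]

# Record the position of every token that passes the test.
pinned = [(i, tok) for i, tok in enumerate(tray) if is_street_candidate(tok)]
print(pinned)
```

On this sample the test pins "E.", "12th", "and", and "Bowery". Any digit-free token passes, including connecting words such as "and", so a pinned token is only a candidate street-name element.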

Problem: (slide diagram not recovered; surviving labels: "Address 1", "Not Address 2")

Option 2: Walk through the address tray from left to right, identify when an address is complete, and assign its elements to the first empty cell (Address 1, Address 2, etc.) in a new data container.

Problem:

AddNum1 | Add1 | AddNum2 | Add2 | AddNum3 | Add3

Integer?
  Y: Append to the leftmost empty AddNum container.
  N: "Hinge" word? (c., corner, rear, between, etc.)
    Y: Append to the leftmost (un)occupied Add container and continue appending subsequent elements until arriving at a cardinal element (st., lower, W., E., etc.) or an integer.
    N: Street name? Append to the leftmost empty Add container and continue appending...
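The decision flow above can be sketched roughly as follows. This is one possible reading, not the original script: the names `walk_tray` and `place`, the word sets, and the three-slot container are illustrative. It also assumes that a hinge run includes the cardinal element that ends it, that a street-name run stops at the next integer or hinge word, and that both kinds of run go into the leftmost empty Add slot.

```python
# Word lists taken from the examples in the flow; real lists would be longer.
HINGE_WORDS = {"c.", "corner", "rear", "between"}
CARDINALS = {"st.", "lower", "w.", "e."}


def place(slots, value):
    """Put value in the leftmost empty slot; raise if every slot is taken."""
    for i, current in enumerate(slots):
        if current is None:
            slots[i] = value
            return
    raise ValueError("no empty container left for %r" % value)


def walk_tray(tray, n_slots=3):
    """Walk the tray left to right, filling AddNum and Add containers."""
    add_nums = [None] * n_slots
    adds = [None] * n_slots
    i = 0
    while i < len(tray):
        token = tray[i]
        i += 1
        if token.isdigit():  # Integer? -> AddNum container
            place(add_nums, token)
            continue
        is_hinge = token.lower() in HINGE_WORDS
        run = [token]
        while i < len(tray) and not tray[i].isdigit():
            nxt = tray[i]
            if not is_hinge and nxt.lower() in HINGE_WORDS:
                break  # assumption: a street-name run ends where a hinge phrase begins
            run.append(nxt)
            i += 1
            if is_hinge and nxt.lower() in CARDINALS:
                break  # assumption: a hinge run ends at a cardinal element, inclusive
        place(adds, " ".join(run))
    return add_nums, adds


# Hypothetical trays, for illustration only.
print(walk_tray(["214", "Main", "st.", "31", "12th", "st."]))
print(walk_tray(["45", "Bowery", "c.", "Grand", "st."]))
```

Under these assumptions the first tray yields two number/street pairs, while the second puts "Bowery" and "c. Grand st." in separate Add slots with only one AddNum filled.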

AddNum1 | Add1 | AddNum2 | Add2 | AddNum3 | Add3

Integer?
  Y: Append to the leftmost empty AddNum container.

Are there any AddNum containers without an Add? If yes, distribute the streets to the empty Add cells.
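A rough sketch of the final distribution step. The slide does not say which street goes to which empty cell, so this sketch assumes an empty Add slot inherits the nearest street to its left, as in "214, 216 Main st.". The function name `distribute_streets` and the sample data are hypothetical.

```python
def distribute_streets(add_nums, adds):
    """Fill each empty Add slot that has an AddNum with the nearest street to its left."""
    filled = list(adds)  # work on a copy so the input list is not modified
    last_street = None
    for i, (num, street) in enumerate(zip(add_nums, filled)):
        if street is not None:
            last_street = street
        elif num is not None and last_street is not None:
            # Assumption: a bare house number shares the preceding street.
            filled[i] = last_street
    return filled


# Hypothetical containers, for illustration only.
add_nums = ["214", "216", None]
adds = ["Main st.", None, None]
print(distribute_streets(add_nums, adds))
```

Slots with neither a number nor a street are left empty, and a number with no street anywhere to its left stays unmatched.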

hist-addresses

By Nicholas Wolf
