Script Workflow for Automating Historical Street Address Recognition in Python
"Tray" (Python list):
Option 1: "Pin" down street names first by identifying strings that contain no digits, or digits plus an ordinal suffix ('th', 'st', etc., as in 12th, 31st).
Problem:
[Diagram: tray elements labeled "Address 1" and "Not Address 2"]
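The pinning test in Option 1 could be sketched roughly as follows. This is only an illustration: the names ORDINAL, is_street_candidate, and pin_street_names are hypothetical, and stripping trailing periods and commas before testing is an assumption about how the tray elements are punctuated.

```python
import re

# Ordinal street names such as "12th" or "31st": digits plus a suffix.
ORDINAL = re.compile(r"^\d+(st|nd|rd|th)$", re.IGNORECASE)


def is_street_candidate(token):
    """True if the token has no digits at all, or is an ordinal like '12th'."""
    token = token.strip().rstrip(".,")  # assumption: ignore trailing punctuation
    if not token:
        return False
    has_digit = any(ch.isdigit() for ch in token)
    return (not has_digit) or bool(ORDINAL.match(token))


def pin_street_names(tray):
    """Return the indices of tray elements that look like street-name parts."""
    return [i for i, token in enumerate(tray) if is_street_candidate(token)]


# Made-up tray for illustration only.
tray = ["123", "W.", "12th", "st.", "45", "Broadway"]
pinned = pin_street_names(tray)
```

Pinning only marks which elements could belong to a street name; it does not say which address each pinned element belongs to.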
Option 2: Walk through the address tray left to right, identify when an address is complete, and assign its elements to the first empty cell (Address 1, Address 2, etc.) in a new data container.
Problem:
AddNum1 | Add1 | AddNum2 | Add2 | AddNum3 | Add3
Integer?
  Y: Append to leftmost empty AddNum container.
  N: "Hinge" word? (c., corner, rear, between, etc.)
    Y: Append to leftmost (un)occupied Add container and continue appending subsequent elements until arriving at a cardinal element (st., lower, W., E., etc.) or an integer.
    N: Street name? Append to leftmost empty Add container and continue appending...
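A minimal sketch of the Option 2 walk, under several assumptions that the flow above does not settle. The function names and the CARDINAL_ELEMENTS set are hypothetical. A cardinal element is treated as the last element of a run, an integer ends a run without being consumed, and the "hinge" and street-name branches are handled identically because the street-name branch is left open in the flow.

```python
# Hypothetical list based on the examples in the flow; extend as needed.
CARDINAL_ELEMENTS = {"st", "lower", "w", "e"}


def normalize(token):
    """Lowercase a token and drop trailing periods and commas."""
    return token.lower().rstrip(".,")


def leftmost_empty(cells):
    """Index of the first empty (None) cell, or None if all are filled."""
    for i, cell in enumerate(cells):
        if cell is None:
            return i
    return None


def walk_tray(tray, slots=3):
    """Walk the tray left to right and fill the AddNum and Add containers."""
    add_nums = [None] * slots
    adds = [None] * slots
    i = 0
    while i < len(tray):
        token = tray[i]
        if token.isdecimal():
            # Integer: goes to the leftmost empty AddNum container.
            slot = leftmost_empty(add_nums)
            if slot is not None:
                add_nums[slot] = int(token)
            i += 1
            continue
        # Hinge word or street name: start a run and keep appending.
        run = [token]
        i += 1
        while i < len(tray) and not tray[i].isdecimal():
            run.append(tray[i])
            i += 1
            # Assumption: a cardinal element closes the run and is kept.
            if normalize(run[-1]) in CARDINAL_ELEMENTS:
                break
        slot = leftmost_empty(adds)
        if slot is not None:
            adds[slot] = " ".join(run)
    return add_nums, adds


# Made-up tray for illustration only.
tray = ["123", "Broadway", "45", "rear", "12th", "st.", "9", "Pearl", "st."]
add_nums, adds = walk_tray(tray)
```

Only elements appended after the first are checked against the cardinal set, so a run that opens with a direction such as "W." is not closed immediately.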
Are there any AddNum containers without an Add? If yes, distribute the streets to empty cells
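The final distribution step might look something like the sketch below. The slide does not say which street should fill an empty cell, so the rule used here (copy the nearest filled Add to the left, otherwise the nearest to the right) is purely an assumption, and distribute_streets is a hypothetical name.

```python
def distribute_streets(add_nums, adds):
    """Fill the Add cell of every AddNum that has a number but no street.

    Assumption: borrow the nearest originally filled Add to the left,
    falling back to the nearest one to the right.
    """
    filled = list(adds)
    for i, num in enumerate(add_nums):
        if num is None or filled[i] is not None:
            continue
        donor = next(
            (adds[j] for j in range(i - 1, -1, -1) if adds[j] is not None),
            None,
        )
        if donor is None:
            donor = next(
                (adds[j] for j in range(i + 1, len(adds)) if adds[j] is not None),
                None,
            )
        filled[i] = donor
    return filled


# Made-up case: two house numbers sharing one street.
result = distribute_streets([123, 125, None], ["Broadway", None, None])
```

Cells whose AddNum is empty are left untouched, so a street is never attached to a missing number.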
hist-addresses
By Nicholas Wolf