Script Workflow for Automating Historical Street Address Recognition in Python
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532765/Screen_Shot_2016-04-26_at_8.43.20_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532766/Screen_Shot_2016-04-26_at_8.43.07_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532768/Screen_Shot_2016-04-26_at_8.44.09_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532770/Screen_Shot_2016-04-26_at_8.44.42_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532772/Screen_Shot_2016-04-26_at_8.45.39_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532773/Screen_Shot_2016-04-26_at_8.50.05_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532803/Screen_Shot_2016-04-26_at_9.02.50_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532808/Screen_Shot_2016-04-26_at_9.04.19_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532809/Screen_Shot_2016-04-26_at_9.04.47_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532810/Screen_Shot_2016-04-26_at_9.05.14_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532812/Screen_Shot_2016-04-26_at_9.05.31_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532813/Screen_Shot_2016-04-26_at_9.06.37_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532814/Screen_Shot_2016-04-26_at_9.07.00_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532815/Screen_Shot_2016-04-26_at_9.07.16_AM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532896/Screen_Shot_2016-04-26_at_9.34.30_AM.png)
"Tray" (Python list):
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532924/Screen_Shot_2016-04-26_at_9.42.15_AM.png)
Option 1: "Pin" down street names first by identifying strings with no digits or digits + 'th' | 'st' etc. (12th, 31st)
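A minimal sketch of the Option 1 test, assuming each tray element is a plain string. The function name `looks_like_street_name`, the `ORDINAL` pattern, and the sample tray are hypothetical and not taken from the original script.

```python
import re

# Assumption: ordinal street names look like "12th", "31st", "2nd", "3rd".
ORDINAL = re.compile(r"^\d+(st|nd|rd|th)$", re.IGNORECASE)


def looks_like_street_name(token):
    """Return True for tokens with no digits, or digits plus an ordinal suffix."""
    if ORDINAL.match(token):
        return True
    return not any(ch.isdigit() for ch in token)


# Hypothetical tray, not taken from the source data.
tray = ["214", "Mott", "31st", "12", "Bowery"]
pinned = [tok for tok in tray if looks_like_street_name(tok)]
print(pinned)  # ['Mott', '31st', 'Bowery']
```

Pinning names this way says nothing about which house number belongs to which street, which is the gap the next slides address.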
Problem:
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2532997/Screen_Shot_2016-04-26_at_9.54.46_AM.png)
Address 1
Not Address 2
Not Address 2
Option 2: Walk through the address tray left to right, identify when an address is complete, and assign the address elements to the first empty Address 1, Address 2, etc. cell in a new data container.
AddNum1 | Add1 | AddNum2 | Add2 | AddNum3 | Add3
Integer?
- Y: Append to leftmost empty AddNum container.
- N: "Hinge" word (c., corner, rear, between, etc.)?
  - Y: Append to leftmost (un)occupied Add container and continue appending subsequent elements until arriving at a cardinal element (st., lower, W., E., etc.) or an integer.
  - N: Street name? Append to leftmost empty Add container and continue appending...
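The decision steps above can be sketched roughly as follows. This is a simplified reading, not the original script: hinge words and street names both open a run of tokens here, the stop element is assumed to close the run and stay part of it, and the names `STOP_WORDS`, `first_empty`, `collect_run`, and `walk_tray` are hypothetical.

```python
# Assumption: the stop elements listed on the slide, lowercased.
STOP_WORDS = {"st.", "lower", "w.", "e."}


def first_empty(cells):
    """Return the index of the leftmost empty (None) cell, or None if full."""
    for idx, cell in enumerate(cells):
        if cell is None:
            return idx
    return None


def collect_run(tray, start):
    """Join tokens from start up to a stop word (kept) or an integer (left)."""
    run = []
    i = start
    while i < len(tray) and not tray[i].isdigit():
        run.append(tray[i])
        i += 1
        if run[-1].lower() in STOP_WORDS:
            break
    return " ".join(run), i


def walk_tray(tray, slots=3):
    """Walk the tray left to right, filling AddNum and Add cells in order."""
    add_nums = [None] * slots
    adds = [None] * slots
    i = 0
    while i < len(tray):
        if tray[i].isdigit():
            # Integer: goes to the leftmost empty AddNum cell.
            value, target = tray[i], add_nums
            i += 1
        else:
            # Hinge word or street name: gather the run into an Add cell.
            value, i = collect_run(tray, i)
            target = adds
        slot = first_empty(target)
        if slot is not None:
            target[slot] = value
    return add_nums, adds


# Hypothetical tray, not taken from the source data.
print(walk_tray(["214", "Mott", "st.", "12", "Bowery"]))
# (['214', '12', None], ['Mott st.', 'Bowery', None])
```

Elements that arrive after every cell is occupied are silently dropped in this sketch; a real script would likely want to log them.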
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2534548/Screen_Shot_2016-04-26_at_2.47.22_PM.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2534620/Screen_Shot_2016-04-26_at_2.59.15_PM.png)
Are there any AddNum containers without an Add? If yes, distribute the streets to the empty cells.
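One plausible reading of this step, sketched under assumptions: a number with no street of its own takes the nearest street to its left, as in a listing like "214 216 Mott st.". The function name `distribute_streets` and the sample values are hypothetical, and the two lists are assumed to have equal length.

```python
def distribute_streets(add_nums, adds):
    """Copy the most recent street into Add cells whose AddNum has no street."""
    filled = list(adds)
    last_street = None
    for idx, number in enumerate(add_nums):
        if filled[idx] is not None:
            # Remember the latest street seen while moving left to right.
            last_street = filled[idx]
        elif number is not None and last_street is not None:
            # A number without a street inherits the street to its left.
            filled[idx] = last_street
    return filled


# Hypothetical cells, not taken from the source data.
print(distribute_streets(["214", "216", None], ["Mott st.", None, None]))
# ['Mott st.', 'Mott st.', None]
```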
![](https://s3.amazonaws.com/media-p.slid.es/uploads/304214/images/2534632/Screen_Shot_2016-04-26_at_3.01.58_PM.png)
hist-addresses
By Nicholas Wolf