Data Driven Pages and Python's Multiprocessing
Utilizing Data Driven Pages in arcpy is a great way to export a map series quickly. But what if it isn't fast enough?
The Spokane County Assessor's Department maintains a land ownership map series of 3600 pages. Updating these maps takes far too long.
Platform | Time to produce |
---|---|
ArcMap (w/ ArcObjects) | 15 hours+ |
arcpy.mapping | 9-12 hours |
?? | ?? |
import arcpy

mxd = arcpy.mapping.MapDocument("C:\\Map.mxd")
# Create DDP object
ddp = mxd.dataDrivenPages
# Export a PDF for each page in the MXD
for pageNum in range(1, ddp.pageCount + 1):
    # Set the current page
    ddp.currentPageID = pageNum
    # Export! pageNum is an int, so convert it before building the path
    ddp.exportToPDF("C:\\" + str(pageNum) + ".pdf", "CURRENT", multiple_files="PDF_SINGLE_FILE")
Basic DDP operation
Future note: in ArcGIS Pro, DDP is called 'Map Series' and arcpy.mapping is replaced by arcpy.mp.
With Multiprocessing we can speed this up!
Platform | Time to produce |
---|---|
ArcMap (GUI) | 15 hours |
arcpy.mapping | 9-12 hours |
multiprocessing.Pool(11) | 2 hours |
What is Multiprocessing?
multiprocessing is a package that supports spawning processes.
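As a minimal sketch of what "spawning processes" looks like in practice, the example below hands a list of page numbers to a pool of worker processes. The function fake_export and the worker count of 4 are illustrative assumptions, and no arcpy calls are involved.

```python
import multiprocessing
import os


def fake_export(page_num):
    # Hypothetical stand-in for real work: report which process handled the page.
    return page_num, os.getpid()


def run_pool(pages, workers=4):
    # Pool.map splits `pages` among the worker processes and returns
    # the results in the same order as the input.
    pool = multiprocessing.Pool(workers)
    try:
        return pool.map(fake_export, pages)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    for page_num, pid in run_pool(range(1, 9)):
        print("page %d handled by process %d" % (page_num, pid))
```

The `if __name__ == "__main__":` guard matters on Windows, where each worker re-imports the script; without it, every worker would try to start a pool of its own.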
Home construction analogy:
A job site has: Blueprints, a Crew and a Boss
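import multiprocessing
import arcpy

# Path to the map document (assumed; the slide does not define MXD)
MXD = "C:\\Map.mxd"

def exportMaps(job):
    # Each job is one [map document path, page list] entry from mapRanges
    mxdPath, pageList = job
    # Data Driven Pages export (same loop as the basic DDP example; output path is illustrative)
    mxd = arcpy.mapping.MapDocument(mxdPath)
    ddp = mxd.dataDrivenPages
    for pageNum in pageList:
        ddp.currentPageID = pageNum
        ddp.exportToPDF("C:\\" + str(pageNum) + ".pdf", "CURRENT", multiple_files="PDF_SINGLE_FILE")
    del mxd

# range() excludes its end value, so these ranges cover pages 1 through 3586
mapRanges = (
    [MXD, list(range(1, 355))],
    [MXD, list(range(355, 710))],
    [MXD, list(range(710, 1065))],
    [MXD, list(range(1065, 1420))],
    [MXD, list(range(1420, 1775))],
    [MXD, list(range(1775, 2130))],
    [MXD, list(range(2130, 2485))],
    [MXD, list(range(2485, 2840))],
    [MXD, list(range(2840, 3195))],
    [MXD, list(range(3195, 3587))]
)

def createmaps_handler():
    p = multiprocessing.Pool(10)
    p.map(exportMaps, mapRanges)
    p.close()
    p.join()

# Initiate script by running handler function
if __name__ == '__main__':
    createmaps_handler()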
Boss:
createmaps_handler gathers the crew and assigns work
Crew:
mapRanges splits the work out into equal units. These units will be fed to individual processes.
Blueprints:
exportMaps holds the work: map creation.
Calling createmaps_handler spins up the worker processes.
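The work units in mapRanges are written out by hand. As a hedged sketch, they could also be generated from the page count and the number of workers. The helper name build_map_ranges is hypothetical, and the page count of 3587 is taken from the timing tables in these slides, not from the map document itself.

```python
def build_map_ranges(mxd_path, page_count, workers):
    """Split pages 1..page_count into `workers` contiguous chunks."""
    base, extra = divmod(page_count, workers)
    ranges = []
    start = 1
    for i in range(workers):
        # The first `extra` chunks take one additional page each.
        size = base + (1 if i < extra else 0)
        ranges.append([mxd_path, list(range(start, start + size))])
        start += size
    return ranges


if __name__ == "__main__":
    for mxd_path, pages in build_map_ranges("C:\\Map.mxd", 3587, 10):
        print("pages %d-%d (%d pages)" % (pages[0], pages[-1], len(pages)))
```

Each entry has the same [path, page list] shape as the hand-written tuple, so the result could be passed to Pool.map in the same way.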
BAM!!
Multiprocessing runs optimally when work is spread equally
Initial tests were disappointing because the most time-consuming maps were not distributed equally between workers.
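The slides do not say how the work was rebalanced. One simple option, assuming the slow pages sit next to each other in page order, is to deal pages out round-robin so that each worker gets a mix of slow and fast pages. The helper name interleave_pages is hypothetical.

```python
def interleave_pages(page_count, workers):
    """Deal pages 1..page_count out round-robin, one list per worker."""
    return [list(range(start, page_count + 1, workers))
            for start in range(1, workers + 1)]


if __name__ == "__main__":
    for worker, pages in enumerate(interleave_pages(10, 3), 1):
        print("worker %d: %s" % (worker, pages))
```

With 10 pages and 3 workers, the first worker gets pages 1, 4, 7 and 10, the second gets 2, 5 and 8, and the third gets 3, 6 and 9.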
Maps | Duration (h:mm) |
---|---|
1-355 | 1:31 |
356-710 | 1:34 |
711-1065 | 1:35 |
1066-1420 | 1:36 |
1421-1775 | 2:05 |
1776-2130 | 3:33 |
2131-2485 | 2:33 |
2486-2840 | 1:36 |
2841-3195 | 1:25 |
3196-3587 | 1:21 |

Slide labels: Urban Maps, Rural Maps
Unequal Spread

Maps | Duration (h:mm) |
---|---|
1-355 | 1:31 |
356-710 | 1:34 |
711-1065 | 1:35 |
1066-1420 | 1:36 |
1421-1775 | 2:05 |
1776-2130 | 3:33 |
2131-2485 | 2:33 |
2486-2840 | 1:39 |
2841-3195 | 1:25 |
3196-3587 | 1:21 |

Equal Spread

Maps | Duration (h:mm) |
---|---|
1-355 | 1:52 |
356-710 | 1:52 |
711-1065 | 1:57 |
1066-1420 | 2:08 |
1421-1775 | 1:57 |
1776-2130 | 2:02 |
2131-2485 | 1:54 |
2486-2840 | 1:58 |
2841-3195 | 1:52 |
3196-3587 | 2:01 |
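The pool is only done when its slowest worker finishes, so the wall-clock time is the maximum duration, not the average. The short calculation below compares the two runs, assuming the durations are hours:minutes (an interpretation based on the 2 hour total reported earlier).

```python
def to_minutes(duration):
    # "h:mm" -> total minutes (assumes hours:minutes)
    hours, minutes = duration.split(":")
    return int(hours) * 60 + int(minutes)


def summarize(durations):
    # Returns (slowest worker, summed worker time), both in minutes.
    minutes = [to_minutes(d) for d in durations]
    return max(minutes), sum(minutes)


unequal = ["1:31", "1:34", "1:35", "1:36", "2:05",
           "3:33", "2:33", "1:39", "1:25", "1:21"]
equal = ["1:52", "1:52", "1:57", "2:08", "1:57",
         "2:02", "1:54", "1:58", "1:52", "2:01"]

if __name__ == "__main__":
    for label, durations in (("unequal", unequal), ("equal", equal)):
        slowest, total = summarize(durations)
        print("%s: slowest worker %d min, summed work %d min" % (label, slowest, total))
```

The summed worker time is about the same in both runs (1132 versus 1173 minutes), but the slowest worker drops from 213 minutes (3:33) to 128 minutes (2:08), and that is what shortens the overall run.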
Thanks!
Phil Larkin
pslarkin@spokanecounty.org
https://slides.com/psl/multi_arcpy/
Using Python's multiprocessing module can help speed up arcpy's data driven pages operations.