David Taylor
"prooffreader"
data scientist, blogger, pythonista, nerd
thatsthejoke.jpg
Serial computing
Do a thing
Do another thing
Do a third thing
Three things are done!
Parallel computing
Do a thing
Do another thing
Do a third thing
Three things are done!
START
START
SPLIT
COMBINE
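The SPLIT and COMBINE steps above can be sketched in a few lines. This is only an illustration, not the code from the talk: `do_a_thing`, `split`, `work_on_chunk`, `combine`, `serial` and `parallel` are hypothetical names, and squaring a number stands in for whatever "a thing" really is.

```python
from multiprocessing import Pool


def do_a_thing(n):
    """Hypothetical unit of work: square a number."""
    return n * n


def split(items, n_chunks):
    """SPLIT: deal the items into n_chunks roughly equal lists."""
    return [items[i::n_chunks] for i in range(n_chunks)]


def work_on_chunk(chunk):
    """Work done independently by one worker on its own chunk."""
    return sum(do_a_thing(n) for n in chunk)


def combine(partials):
    """COMBINE: merge the partial results into one answer."""
    return sum(partials)


def serial(items):
    """Serial version: one thing after another in a single process."""
    return sum(do_a_thing(n) for n in items)


def parallel(items, n_workers=3):
    """Parallel version: SPLIT, work in separate processes, COMBINE."""
    chunks = split(items, n_workers)
    with Pool(n_workers) as pool:
        partials = pool.map(work_on_chunk, chunks)
    return combine(partials)


if __name__ == "__main__":
    items = list(range(30))
    # Both versions should print the same total.
    print(serial(items), parallel(items))
```

The work functions are defined at module level so that worker processes can import them, and the parallel run sits under the `__main__` guard so that it is not triggered again when a worker imports the module.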
Diagram: from code to threads (built up over three slides)
Labels: Code, Interpreter, Program/Bytecode, Kernel, Process, Thread
Annotations: .py, /usr/bin/python, .pyc, OS, RAM
Second build: three Thread boxes
Third build: three Thread boxes and three Process boxes
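The difference between a thread and a process can be observed from Python itself. A minimal sketch, assuming the standard `threading` and `multiprocessing` modules; `results`, `record`, `run_in_threads` and `run_in_processes` are hypothetical names chosen for this illustration. Threads live inside one process and share its memory, while each child process gets its own process ID and its own copy of the data.

```python
import os
import threading
from multiprocessing import Process

# This list lives in the memory of whichever process imports this module.
results = []


def record(label):
    """Append a label and the current process ID to the module-level list."""
    results.append((label, os.getpid()))


def run_in_threads(n=3):
    """Run record() in n threads of the current process."""
    threads = [threading.Thread(target=record, args=("thread-%d" % i,))
               for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def run_in_processes(n=3):
    """Run record() in n separate child processes."""
    procs = [Process(target=record, args=("process-%d" % i,))
             for i in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()


if __name__ == "__main__":
    run_in_threads()
    # The threads appended to the shared list, all with the parent's PID.
    print(len(results), {pid for _, pid in results} == {os.getpid()})
    run_in_processes()
    # The length is unchanged: each child appended to its own copy.
    print(len(results))
```

Getting results back from child processes needs explicit communication (for example a queue or a pool's return values), which is part of the overhead of going parallel.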
Python:
GIL
(Global Interpreter Lock)
Only one thread executes Python bytecode at a time
Therefore, a CPU-bound program is effectively limited to one core per process
(alternative: concurrent.futures in Py 3)
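As a rough illustration of the `concurrent.futures` alternative: the module offers thread and process pools behind the same interface, so the two can be swapped and timed against each other. This is a sketch under assumptions, not the talk's code; `count_primes` and `run_with` are hypothetical names, and trial-division prime counting stands in for any CPU-bound task.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """Hypothetical CPU-bound task: count primes below limit."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_with(executor_cls, limits, max_workers=4):
    """Map count_primes over limits using the given executor class."""
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(count_primes, limits))


if __name__ == "__main__":
    limits = [20000] * 8
    for executor_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        start = time.perf_counter()
        run_with(executor_cls, limits)
        elapsed = time.perf_counter() - start
        print(executor_cls.__name__, round(elapsed, 2), "sec")
```

On a multi-core machine the process pool would normally be expected to finish this kind of work sooner than the thread pool, because threads in one CPython process take turns holding the GIL. Actual timings depend on the machine and the size of the task.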
Plot: time (sec) vs. number of cores, three panels
30,000 items, 1 operation per number
30,000 items, 20 operations per number
1,000,000 items, 1 operation per number
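A timing comparison of this kind could be set up roughly as follows. The benchmark code behind the plot is not shown in the slides, so everything here is an assumption: `operate`, `make_jobs` and `time_run` are hypothetical names, the arithmetic inside `operate` is an arbitrary stand-in for "one operation", and the core counts tried are arbitrary.

```python
import time
from multiprocessing import Pool


def operate(job):
    """Apply a hypothetical cheap operation n_operations times to a number."""
    number, n_operations = job
    for _ in range(n_operations):
        number = (number * 3 + 1) % 1000003
    return number


def make_jobs(n_items, n_operations):
    """Build one (number, n_operations) job per item."""
    return [(i, n_operations) for i in range(n_items)]


def time_run(jobs, cores):
    """Return (elapsed seconds, results) for the jobs on the given cores."""
    start = time.perf_counter()
    if cores == 1:
        results = list(map(operate, jobs))  # serial baseline, no pool
    else:
        with Pool(cores) as pool:
            results = pool.map(operate, jobs)
    return time.perf_counter() - start, results


if __name__ == "__main__":
    workloads = [(30000, 1), (30000, 20), (1000000, 1)]
    for n_items, n_operations in workloads:
        jobs = make_jobs(n_items, n_operations)
        for cores in (1, 2, 4):
            elapsed, _ = time_run(jobs, cores)
            print(n_items, "items,", n_operations, "ops,",
                  cores, "cores:", round(elapsed, 3), "sec")
```

The timing for more than one core includes starting the worker processes and shipping the jobs to them, so small workloads may well run slower in parallel than in serial.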
(also PyPy/JIT/RPython)
Normal CPython code
Compiled CPython code: 56% faster
Compiled Cython code: 47x faster!
e.g. NumPy
Desktop
pymongo, boto, pysftp
Digital Ocean
MongoDB server
AWS
20 instances +