Python Multithreading and Multiprocessing: Concurrency and Parallelism
About Me
- Pythonist, Gopher
A story!
Expected time
50 Min
A story!
Expected time
30 Min
A story!
Expected time
15 Min
Parallelism
Parallelism means adding more hardware or software resources so that computations run simultaneously and finish faster.
Parallelism is about doing lots of things at once.
Concurrency
Concurrency permits multiple tasks to make progress without waiting for each other.
Concurrency is about dealing with lots of things at once.
Concurrency & Parallelism
Multiprocessing & Multithreading is not easy!
Multiprocessing & Multithreading
- The ability of a central processing unit, or a single core in a multi-core processor, to execute multiple processes or threads concurrently
Python threading
- Higher-level threading interface built on top of the lower-level thread module
- In Python 3 the lower-level thread module was renamed to _thread
Start thread
from threading import Thread
def print_hello():
    print("Hello World")
t = Thread(target=print_hello, args=[])
t.start()
t.join()
Thread start
- Threads are system-level (OS) threads
- POSIX threads on Linux, native threads on Windows
- Scheduling is handled by the OS
Thread start
from threading import Thread
def calc(n):
    while n > 0:
        n -= 1
t = Thread(target=calc, args=[100000000])
t.start()
t.join()
# Started a single thread
# 4.02 seconds to execute this program
from threading import Thread
def calc(n):
    while n > 0:
        n -= 1
t1 = Thread(target=calc, args=[100000000])
t2 = Thread(target=calc, args=[100000000])
t1.start()
t2.start()
t1.join()
t2.join()
# How much time would this program
# take to complete execution?
Thread start
from multiprocessing import Process
from threading import Thread
def calc(n):
    while n > 0:
        n -= 1

def do_calc():
    t = Thread(target=calc, args=[100000000])
    t.start()
    t.join()
p1 = Process(target=do_calc, args=[])
p2 = Process(target=do_calc, args=[])
p1.start()
p2.start()
p1.join()
p2.join()
Global Interpreter Lock
- Parallel execution of Python threads is forbidden: only one thread executes bytecode at a time
- The lock is released on blocking I/O calls such as read, write, send, recv
Another boring story
Global Interpreter Lock
- The GIL is released on I/O operations, so I/O-bound threads still overlap (see the sketch below)
- What about CPU-bound threads?
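A minimal sketch (not in the original slides) of why releasing the GIL during blocking calls matters: ten I/O-bound threads finish in roughly one second instead of ten, with time.sleep standing in for a blocking network call.
import time
from threading import Thread

def fake_io():
    # time.sleep releases the GIL, just like a blocking read/recv would
    time.sleep(1)

threads = [Thread(target=fake_io) for _ in range(10)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
print("Elapsed: %.2f seconds" % (time.time() - start))  # ~1 second, not 10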
Global Interpreter Lock
- sys.setcheckinterval() (Python 2; Python 3 uses sys.setswitchinterval(), see below)
- Thread scheduling itself is managed by the OS
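A short sketch of the Python 3 replacement, sys.setswitchinterval(), which controls how often the interpreter asks a running thread to give up the GIL.
import sys

print(sys.getswitchinterval())  # default is 0.005 seconds
sys.setswitchinterval(0.001)    # request more frequent thread switches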
OS
A story of my boss!
Thread pool
from multiprocessing.pool import ThreadPool
def f(x):
    return x*x

if __name__ == '__main__':
    p = ThreadPool(4)
    print(p.map(f, [1, 2, 3, 4, 5, 6]))
# Output
[1, 4, 9, 16, 25, 36]
multiprocessing Process
from multiprocessing import Process
def f(name):
    print("hello", name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
- Python module that exposes a process-based API similar to threading
- Starts real system processes (see the sketch after this list)
- No shared GIL: each process has its own interpreter and its own GIL
- Works on Linux & Windows
- Each process has its own memory
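A minimal sketch (function name assumed, not from the slides) showing that each Process is a real operating-system process with its own PID:
import os
from multiprocessing import Process

def show_pid():
    print("Child PID:", os.getpid())

if __name__ == '__main__':
    print("Parent PID:", os.getpid())
    p = Process(target=show_pid)
    p.start()
    p.join()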
Process pool
- Similar to ThreadPool, with the difference that it starts processes instead of threads
from multiprocessing import Pool
def f(x):
    return x*x

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [1, 2, 3, 4, 5, 6]))
How to handle this?
- Multiple threads/processes trying to access the same code or resource at the same time
- Requires thread/process synchronization
Deadlock
Semaphore
- Synchronization primitive used to control access to shared resources and help avoid deadlocks
- Semaphore types
- Binary
- Counting
- Mutex
Lock & RLock
- Known as Mutex or Binary semaphore
- acquire() & release()
- Available in both the threading and multiprocessing modules
from threading import Lock

lock = Lock()

def getPart1():
    lock.acquire()
    try:
        ...  # get first part of the data
    finally:
        lock.release()

def getPart2():
    lock.acquire()
    try:
        ...  # get second part of the data
    finally:
        lock.release()
Lock & RLock
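The slides list RLock next to Lock but only demonstrate Lock; this minimal sketch (function names assumed) shows the difference: an RLock can be re-acquired by the thread that already holds it, whereas a plain Lock would block itself here.
from threading import RLock

lock = RLock()

def outer():
    with lock:
        inner()  # re-enters the lock the current thread already holds

def inner():
    with lock:
        print("inner reached")

outer()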
Semaphore
import time
from threading import Thread, Semaphore

semaphore = Semaphore(5)

def calc():
    semaphore.acquire()
    time.sleep(3)
    # Send http request to http://hitul.in
    semaphore.release()

# Start all threads first, then join, so at most 5 run inside the semaphore at once
threads = []
for _ in range(10):
    t = Thread(target=calc, args=[])
    t.start()
    threads.append(t)
for t in threads:
    t.join()
- Python implementation of a counting semaphore
- Available in both the threading and multiprocessing modules
Semaphore
BoundedSemaphore
import time
import requests
from threading import Thread, BoundedSemaphore

semaphore = BoundedSemaphore(3)

def send_request():
    semaphore.acquire()
    r = requests.get("http://hitul.in/")
    time.sleep(3)
    print("Request sent")
    semaphore.release()
    # ValueError: a BoundedSemaphore cannot be released more times than it was acquired
    semaphore.release()

threads = []
for _ in range(10):
    t = Thread(target=send_request, args=[])
    t.start()
    threads.append(t)
for t in threads:
    t.join()
- ValueError is raised if release() is called more times than acquire()
Event
- Threads wait until the event flag is set
import time
from threading import Thread, Event
event = Event()
def hello():
    event.wait()
    print("Hello world")
t1 = Thread(target=hello, args=[])
t1.start()
t2 = Thread(target=hello, args=[])
t2.start()
time.sleep(5)
event.set()
event.clear()
t1.join()
t2.join()
Timer
from threading import Timer
def hello():
    print("Hello World")
t = Timer(2, hello)
t.start()
- Runs the function in a new thread after the given number of seconds
- The actual delay can be longer than requested
- cancel() stops the timer before it fires (see the sketch below)
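A minimal sketch of cancel(): the timer is cancelled before its delay expires, so hello() never runs.
from threading import Timer

def hello():
    print("Hello World")

t = Timer(5, hello)
t.start()
t.cancel()  # stops the timer before it fires; hello() is never called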
Thread example
from threading import Thread
data = {}
def print_hello():
    data["Status"] = True
t = Thread(target=print_hello, args=[])
t.start()
t.join()
print(data)  # {'Status': True}: threads share memory with the main thread
Process example
from multiprocessing import Process

data = {}

def print_hello():
    data["Status"] = True

if __name__ == '__main__':
    p = Process(target=print_hello, args=[])
    p.start()
    p.join()
    print(data)  # {}: the child process modified its own copy of data
Pipes
- Data channel used to transfer data
- os.pipe
  - 64 KB buffer limit
  - Carries bytes, so values must be encoded/decoded (see the sketch after this list)
  - Works on Linux & Windows
- multiprocessing.Pipe
  - Built on sockets
  - Full duplex
  - Objects are serialized with pickle
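A minimal sketch of os.pipe (not shown in the original slides): the pipe carries raw bytes between two file descriptors, so strings have to be encoded and decoded.
import os

r, w = os.pipe()                      # read end, write end
os.write(w, "Hello world".encode())   # only bytes can go through the pipe
os.close(w)
print(os.read(r, 1024).decode())
os.close(r)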
Pipes
import time
from multiprocessing import Pipe, Process
c1, c2 = Pipe()
def write_pipe(conn):
    conn.send("Hello world")

def read_pipe(conn):
    print(conn.recv())
p1 = Process(target=write_pipe, args=[c1])
p2 = Process(target=read_pipe, args=[c2])
p1.start()
time.sleep(1)
p2.start()
p1.join()
p2.join()
Pipe
import time
from multiprocessing import Pipe, Process
c1, c2 = Pipe()
def write_pipe(conn):
    conn.send("Hey! How are you ?")
    print("MSG by read-process : %s" % conn.recv())

def read_pipe(conn):
    print("MSG by write-process function : %s" % conn.recv())
    conn.send("Thank you! I'm good.")
p1 = Process(target=write_pipe, args=[c1])
p2 = Process(target=read_pipe, args=[c2])
p1.start()
time.sleep(1)
p2.start()
p1.join()
p2.join()
Queue
from multiprocessing import Process, Queue
def f(q):
    q.put([42, None, 'hello'])

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print(q.get())  # prints "[42, None, 'hello']"
    p.join()
- Thread/process safe
- Queue types (see the sketch after this list)
- LIFO
- FIFO
- Priority
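The FIFO, LIFO and priority variants mentioned above live in the thread-safe queue module; a minimal sketch of how each orders the same items:
import queue

fifo = queue.Queue()           # first in, first out
lifo = queue.LifoQueue()       # last in, first out
prio = queue.PriorityQueue()   # smallest item first

for item in [3, 1, 2]:
    fifo.put(item)
    lifo.put(item)
    prio.put(item)

print([fifo.get() for _ in range(3)])  # [3, 1, 2]
print([lifo.get() for _ in range(3)])  # [2, 1, 3]
print([prio.get() for _ in range(3)])  # [1, 2, 3]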
Shared memory
from multiprocessing import Process, Value, Array
def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))
    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()
    print(num.value)  # 3.1415927
    print(arr[:])     # [0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
- Data structures supported
  - Value
  - Array
- Python data structure wrappers (see the sketch after this list)
- Thread/Process safe
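The "Python data structure wrappers" bullet most likely refers to multiprocessing.Manager, which provides process-safe dict and list proxies; a minimal sketch under that assumption:
from multiprocessing import Process, Manager

def worker(d, l):
    d["status"] = True
    l.append(42)

if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict()
        l = manager.list()
        p = Process(target=worker, args=(d, l))
        p.start()
        p.join()
        print(dict(d), list(l))  # {'status': True} [42]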
Best practices
- Pass picklable objects through multiprocessing pipes
- Avoid zombie processes: always join() or wait on child processes
- Avoid terminating processes abruptly
- Global variable values can differ between processes, since each process has its own copy
- Close pipe connections once you are done with them
- Use pipe connections with a context manager (see the sketch below)
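A minimal sketch of the last two points: multiprocessing connection objects support the context-manager protocol, so both ends of the pipe are closed automatically when the block exits.
from multiprocessing import Pipe

parent_conn, child_conn = Pipe()

with parent_conn, child_conn:
    child_conn.send("done")
    print(parent_conn.recv())
# both connections are closed here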
Contact me
@hitul007
http://hitul.in
Thank you