Demystifying Python AsyncIO 🥷🏻

Concurrency in Python ↔️

  • Multithreading
  • Multiprocessing
  • Asyncio (proposed in 2012 as PEP 3156; in the standard library since Python 3.4)

Let's Start with Threads 🧵

import threading 
import time

def alice():
	while True:
		time.sleep(1)
		print("Hi there, this is Alice.")

def bob():
	while True:
		time.sleep(0.8)
		print("Hi there, this is Bob.")

threading.Thread(target=alice).start()
threading.Thread(target=bob).start()

Let's Start with Threads 🧵

counter = 0

def func1():
	global counter
	while True:
		counter += 1
		counter -= 1

def func2():
	global counter
	while True:
		counter += 1
		counter -= 1

threading.Thread(target=func1).start()
threading.Thread(target=func2).start()

Let's Start with Threads 🧵

counter = 0

def func1():
	global counter
	while True:
		counter += 1
		counter -= 1

def func2():
	global counter
	while True:
		counter += 1
		counter -= 1

threading.Thread(target=func1).start()
threading.Thread(target=func2).start()

def printer():
	while True:
		print(counter)
		time.sleep(1)

threading.Thread(target=printer).start()

Let's Start with Threads 🧵

counter = 0
lock = threading.RLock()

def func1():
	global counter
	while True:
		# Hold the lock so the two updates happen as one unit.
		with lock:
			counter += 1
			counter -= 1

def func2():
	global counter
	while True:
		with lock:
			counter += 1
			counter -= 1

threading.Thread(target=func1).start()
threading.Thread(target=func2).start()

What's Wrong With Threads 🤦🏻‍♂️

  • Synchronization is required when accessing shared data structures
    • Choosing the right locking granularity is hard.
    • Risk of deadlocking.
  • (Threads have some overhead: memory, context switching.)

-> Instead of using locks, people often use queues (message passing) for inter-thread communication.
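
The queue-based alternative can be sketched roughly as follows. This is a minimal illustration using the standard-library `queue.Queue`; the names `producer`, `consumer`, `run`, and the `None` sentinel are assumptions made for this example and are not from the slides.

```python
import queue
import threading


def producer(q: queue.Queue, n: int) -> None:
    # Send n "+1" messages instead of touching shared state directly.
    for _ in range(n):
        q.put(1)
    q.put(None)  # Sentinel (an assumed convention): no more messages.


def consumer(q: queue.Queue, result: list) -> None:
    # Only this thread owns the counter, so no lock is needed.
    counter = 0
    while (item := q.get()) is not None:
        counter += item
    result.append(counter)


def run(n: int = 1000) -> int:
    q: queue.Queue = queue.Queue()
    result: list = []
    threads = [
        threading.Thread(target=producer, args=(q, n)),
        threading.Thread(target=consumer, args=(q, result)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]


if __name__ == "__main__":
    print(run())
```

Because the counter lives in exactly one thread, the only synchronized object is the queue itself, which handles its own locking internally.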

Same code with asyncio

import asyncio

counter = 0

async def func1():
	global counter
	while True:
		counter += 1
		counter -= 1
		await asyncio.sleep(0)

async def func2():
	global counter
	while True:
		counter += 1
		counter -= 1
		await asyncio.sleep(0)

async def main():
	await asyncio.gather(func1(), func2())

asyncio.run(main())

Await ♻️

-> A checkpoint where it's safe for asyncio to switch to another coroutine.
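
To make the checkpoint idea concrete, here is a small sketch. The `worker` coroutine and the shared `log` list are hypothetical names for illustration. Each worker writes a "start" and an "end" entry with no `await` in between, so the two entries of a pair can never be separated by another coroutine's output.

```python
import asyncio


async def worker(name: str, log: list) -> None:
    for step in range(2):
        # No await between these two appends, so no other coroutine
        # can run in between them.
        log.append(f"{name} start {step}")
        log.append(f"{name} end {step}")
        # Checkpoint: the event loop may now switch to another coroutine.
        await asyncio.sleep(0)


async def main() -> list:
    log: list = []
    await asyncio.gather(worker("A", log), worker("B", log))
    return log


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

Switching between A and B happens only at the `await asyncio.sleep(0)` line, which is why no lock is needed around the two appends.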

Threads vs Coroutines 🤺

Threads | Coroutines
Pre-emptive | Co-operative
The operating system decides when to context switch to another task. | The tasks themselves decide when to hand over control.
Can switch to another thread at any point in time. | Only switches to another coroutine when there's an "await".
You tell it when it's impossible to switch to another thread (using locks). | You tell it when it's possible to switch to another coroutine (using await).

An abstraction on top of event loops

• Dispatch table

I/O completion event | Callback
File 1 ready for reading | func1
Stdin ready for reading | func2
Network socket 1 ready for writing | func3
Received mouse event | func4
Received keyboard event | func5
Etc...

An abstraction on top of event loops

while True:
	wait_for_any_fd_to_become_ready()
	handle_fd_callback()
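
The loop above is pseudocode. A runnable approximation of one iteration can be sketched with the standard-library `selectors` module. The helper names `run_once`, `demo`, and `on_readable` are assumptions for this example, and a local socket pair stands in for a real network connection. This is not how asyncio is implemented internally, only an illustration of the same idea.

```python
import selectors
import socket


def run_once(sel: selectors.BaseSelector) -> None:
    # Wait until at least one registered file descriptor is ready...
    for key, _events in sel.select(timeout=1):
        # ...then look up its callback (stored in key.data) and call it.
        callback = key.data
        callback(key.fileobj)


def demo() -> list:
    received: list = []
    sel = selectors.DefaultSelector()
    reader, writer = socket.socketpair()
    reader.setblocking(False)

    def on_readable(sock: socket.socket) -> None:
        received.append(sock.recv(1024))

    # Dispatch table entry: "reader ready for reading" -> on_readable.
    sel.register(reader, selectors.EVENT_READ, data=on_readable)

    writer.send(b"hello")
    run_once(sel)

    sel.unregister(reader)
    sel.close()
    reader.close()
    writer.close()
    return received


if __name__ == "__main__":
    print(demo())
```

The `data` argument of `register` plays the role of the callback column in the dispatch table.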

Event Loops: The Good things

  • Everything runs in one thread:
    • But only one callback at a time.
    • No complicated synchronization (data locking)
    • Little risk of deadlocking.
    • Easy to debug.
  • Handle many connections concurrently.
    • Idle connections barely consume anything in an event-driven system.
  • Cheaper than one thread per connection.

Don't mix with blocking I/O

while True:
	wait_for_any_fd_to_become_ready()
	handle_fd_callback()

- Callbacks can't do any kind of blocking I/O, like read(), recv(), etc.
- Instead, they should do it asynchronously, and register the file descriptor with a callback in the event loop.
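
When a blocking call cannot be avoided (for example, a library with no async API), one common workaround is to move it off the event loop thread. Note that this is a different technique from registering a file descriptor: the sketch below uses `asyncio.to_thread` (available since Python 3.9). The functions `blocking_io` and `ticker` are hypothetical, with `time.sleep` standing in for a blocking read.

```python
import asyncio
import time


def blocking_io() -> str:
    # Stand-in for a blocking call such as read() or recv().
    time.sleep(0.3)
    return "done"


async def ticker(ticks: list) -> None:
    # Makes progress only if the event loop is not blocked.
    for i in range(3):
        ticks.append(i)
        await asyncio.sleep(0.05)


async def main() -> tuple:
    ticks: list = []
    ticker_task = asyncio.create_task(ticker(ticks))
    # Run the blocking function in a worker thread so the loop stays free.
    result = await asyncio.to_thread(blocking_io)
    ticks_during_call = len(ticks)
    await ticker_task
    return result, ticks_during_call


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Calling `blocking_io()` directly inside `main` would instead freeze the ticker until the call returned.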

Run a coroutine

async def getusers():
	# `client` stands for an async database client created elsewhere.
	users = await client.do_query("select * from users")
	return users

asyncio.run(getusers())

Run another coroutine

async def getusers():
	users = await client.do_query("select * from users")
	return users

async def main():
	users = await getusers()
	print(users)

asyncio.run(main())

Run two coroutines concurrently

async def getusers():
	users = await client.do_query("select * from users")
	return users

async def main():
	await asyncio.gather(
		getusers(),
		getusers(),
	)

asyncio.run(main())
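
Since `client` is not defined on the slide, here is a self-contained sketch of the same pattern. The `fetch` coroutine is hypothetical, and `asyncio.sleep` stands in for a real awaitable I/O call. It illustrates that `asyncio.gather` returns results in the order the awaitables were passed, and that the total time is close to the longest single wait rather than the sum.

```python
import asyncio
import time


async def fetch(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a real awaitable I/O call.
    await asyncio.sleep(delay)
    return name


async def main() -> tuple:
    start = time.perf_counter()
    results = await asyncio.gather(fetch("users", 0.2), fetch("orders", 0.2))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```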

Start coroutine without waiting

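async def getusers():
	users = await client.do_query("select * from users")
	return users

async def main():
	# Schedule getusers() to run in the background and continue immediately.
	task = asyncio.create_task(getusers())
	# ... do other work here ...
	# Await the task before main() returns; otherwise asyncio.run() cancels it.
	users = await task

asyncio.run(main())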
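
A runnable sketch of the same idea, with a hypothetical `background` coroutine in place of the database query. It records the order of events to show that `create_task` returns immediately and that the task only starts running once `main` reaches an `await`.

```python
import asyncio


async def background(events: list) -> str:
    events.append("task started")
    await asyncio.sleep(0.05)
    events.append("task finished")
    return "result"


async def main() -> tuple:
    events: list = []
    task = asyncio.create_task(background(events))
    # create_task returns immediately; the task has not run yet.
    events.append("main continues")
    # Awaiting the task later collects its result. Without this, the task
    # would be cancelled when asyncio.run() tears down the loop.
    result = await task
    return events, result


if __name__ == "__main__":
    print(asyncio.run(main()))
```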

Don't turn every call into an async call

• Imagine logging to a remote server.

• Logging can eventually happen in every function.

• So, does every function need to become async?

No: use an asynchronous queue:
- On one end, push the messages into the queue (queue.put_nowait)
- On the other end, have one coroutine consume the queue and flush to the remote server.
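
The queue approach can be sketched as follows. All names here (`do_work`, `log_flusher`, `sink`) are hypothetical, and appending to a list stands in for sending to the remote server. The point is that `do_work` stays a plain synchronous function because `put_nowait` does not need to be awaited.

```python
import asyncio


def do_work(log_queue: asyncio.Queue, n: int) -> int:
    # A plain synchronous function: logging does not force it to be async.
    log_queue.put_nowait(f"working on {n}")
    return n * 2


async def log_flusher(log_queue: asyncio.Queue, sink: list) -> None:
    # The single coroutine that talks to the (here simulated) remote server.
    while True:
        message = await log_queue.get()
        await asyncio.sleep(0)  # Stand-in for an awaitable network send.
        sink.append(message)
        log_queue.task_done()


async def main() -> tuple:
    log_queue: asyncio.Queue = asyncio.Queue()
    sink: list = []
    flusher = asyncio.create_task(log_flusher(log_queue, sink))
    results = [do_work(log_queue, n) for n in range(3)]
    await log_queue.join()  # Wait until every message has been flushed.
    flusher.cancel()
    return results, sink


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Only `log_flusher` is async; the rest of the code base can keep calling a synchronous logging function.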

An Actual Example

import asyncio

import httpx

async def main():
	async with httpx.AsyncClient() as client:
		response = await client.get('https://example.com')
		print(response.text)

asyncio.run(main())

Conclusion

  • Asyncio is a great concurrency pattern for I/O-heavy applications.
  • Not the easiest to begin with,
    but when things become complex, it is often easier than threading.
  • Important pitfalls:
    • Don't mix with blocking I/O.
    • Don't turn every function into an async function.

About me 😃

  • Kanishk Pachauri ( @itsKanishkP )
  • I speak Python, TypeScript, and Go
  • I'm currently exploring CPython
  • Upcoming Summer Intern @Google
  • GSoC'22 @Python Software Foundation
  • Maintainer @Dateparser, and actively contributing to some PSF projects.
  • Founder and CM @FOSSCU (fosscu.org)

Thanks 

❤️

By Kanishk Pachauri