Demystifying Python AsyncIO 🥷🏻
Concurrency in Python ↔️
- Multithreading
- Multiprocessing
- Asyncio (2012)
Let's Start with Threads 🧵
import threading
import time

def alice():
    while True:
        time.sleep(1)
        print("Hi there, this is Alice.")

def bob():
    while True:
        time.sleep(0.8)
        print("Hi there, this is Bob.")

threading.Thread(target=alice).start()
threading.Thread(target=bob).start()
Let's Start with Threads 🧵
counter = 0

def func1():
    global counter
    while True:
        counter += 1
        counter -= 1

def func2():
    global counter
    while True:
        counter += 1
        counter -= 1

threading.Thread(target=func1).start()
threading.Thread(target=func2).start()
Let's Start with Threads 🧵
counter = 0

def func1():
    global counter
    while True:
        counter += 1
        counter -= 1

def func2():
    global counter
    while True:
        counter += 1
        counter -= 1

threading.Thread(target=func1).start()
threading.Thread(target=func2).start()

def printer():
    while True:
        print(counter)
        time.sleep(1)

threading.Thread(target=printer).start()
Let's Start with Threads 🧵
counter = 0
lock = threading.RLock()

def func1():
    global counter
    while True:
        with lock:
            counter += 1
            counter -= 1

def func2():
    global counter
    while True:
        with lock:
            counter += 1
            counter -= 1

threading.Thread(target=func1).start()
threading.Thread(target=func2).start()
What's Wrong With Threads 🤦🏻♂️
- Synchronization is required when accessing shared data structures
- Choosing the right locking granularity is hard.
- Risk of deadlocking.
- (Threads have some overhead: memory, context switching.)
-> Instead of using locks, people often use queues (message passing) for inter-thread communication.
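A minimal sketch of that queue-based message passing with `queue.Queue`; the producer/consumer names and the `None` sentinel are illustrative choices, not part of the API:

```python
# Message passing between threads: no shared counter, no locks.
import queue
import threading

q = queue.Queue()

def producer():
    for i in range(3):
        q.put(i)       # queue.Queue is thread-safe: no explicit lock needed
    q.put(None)        # sentinel: tells the consumer to stop

def consumer(results):
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item)

results = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(results,))
t1.start(); t2.start()
t1.join(); t2.join()
print(results)
```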
Same code with asyncio
import asyncio

counter = 0

async def func1():
    global counter
    while True:
        counter += 1
        counter -= 1
        await asyncio.sleep(0)

async def func2():
    global counter
    while True:
        counter += 1
        counter -= 1
        await asyncio.sleep(0)

asyncio.gather(func1(), func2())
asyncio.get_event_loop().run_forever()
Await ♻️
-> A checkpoint where it's safe for asyncio to switch to another coroutine
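For example, `await asyncio.sleep(0)` sleeps for zero seconds but still hands control back to the event loop, so another coroutine can run at that checkpoint. A small sketch (coroutine names are illustrative):

```python
import asyncio

order = []

async def one():
    order.append("one: start")
    await asyncio.sleep(0)   # checkpoint: the loop may run another coroutine
    order.append("one: end")

async def two():
    order.append("two: start")
    await asyncio.sleep(0)
    order.append("two: end")

async def main():
    await asyncio.gather(one(), two())

asyncio.run(main())
print(order)
```

Each coroutine runs until its `await`, then the other gets a turn: the starts interleave with the ends.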
Threads vs Coroutines 🤺
| Threads | Coroutines |
|---|---|
| Pre-emptive: the operating system decides when to context switch to another task. | Co-operative: the tasks themselves decide when to hand over control. |
| Can switch to another thread at any point in time. | Only switches to another coroutine at an "await". |
| You mark where it's impossible to switch to another thread (using locks). | You mark where it's possible to switch to another coroutine (using "await"). |
An abstraction on top of event loops
- Dispatch table:

| I/O completion event | Callback |
|---|---|
| File 1 ready for reading | func1 |
| Stdin ready for reading | func2 |
| Network socket 1 ready for writing | func3 |
| Received mouse event | func4 |
| Received keyboard event | func5 |
| Etc... | |
An abstraction on top of event loops
while True:
    wait_for_any_fd_to_become_ready()
    handle_fd_callback()
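A toy version of this loop can be written with Python's `selectors` module; the `socketpair` and the callback name here are illustrative stand-ins for real network connections:

```python
import selectors
import socket

sel = selectors.DefaultSelector()
received = []

def on_readable(conn):
    # the callback: runs when the fd is ready, and must not block
    received.append(conn.recv(1024))
    sel.unregister(conn)
    conn.close()

# a connected socket pair stands in for a real network connection
a, b = socket.socketpair()
sel.register(b, selectors.EVENT_READ, on_readable)
a.sendall(b"ping")

# the event loop: wait for any fd to become ready, then run its callback
while sel.get_map():
    for key, events in sel.select():
        key.data(key.fileobj)

a.close()
print(received)
```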
Event Loops: The Good Things
- Everything runs in one thread:
  - Only one callback runs at a time.
  - No complicated synchronization (data locking).
  - Little risk of deadlocking.
  - Easy to debug.
- Handle many connections in parallel:
  - Idle connections barely consume anything in an event-driven system.
  - Cheaper than one thread per connection.
Don't mix with blocking I/O
while True:
    wait_for_any_fd_to_become_ready()
    handle_fd_callback()
- Callbacks can't do any kind of blocking I/O, like `read()`, `recv()`, etc.
- Instead, they should do it asynchronously, and register the file descriptor with a callback in the event loop.
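When a blocking call is unavoidable, it can be pushed to a worker thread so the loop stays responsive, for example with `asyncio.to_thread` (Python 3.9+). Here `time.sleep` is a stand-in for a blocking `read()`:

```python
import asyncio
import time

def blocking_read():
    time.sleep(0.1)      # stands in for a blocking read()/recv()
    return "data"

async def main():
    # runs the blocking call in a worker thread; the event loop keeps running
    result = await asyncio.to_thread(blocking_read)
    return result

print(asyncio.run(main()))
```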
Run a coroutine
async def get_users():
    users = await client.do_query("select * from users")
    return users

asyncio.run(get_users())
Run another coroutine
async def get_users():
    users = await client.do_query("select * from users")
    return users

async def main():
    users = await get_users()
    print(users)

asyncio.run(main())
Run two coroutines in parallel
async def get_users():
    users = await client.do_query("select * from users")
    return users

async def main():
    await asyncio.gather(
        get_users(),
        get_users(),
    )

asyncio.run(main())
Start coroutine without waiting
async def get_users():
    users = await client.do_query("select * from users")
    return users

async def main():
    task = asyncio.create_task(get_users())

asyncio.run(main())
Don't turn every call into an async call
- Imagine logging to a remote server.
- Logging can eventually happen in every function.
- So, every function would need to become async.

Instead, use an asynchronous queue:
- On one end, push the messages into the queue (`queue.put_nowait`).
- On the other end, have one coroutine consume the queue and flush to the remote server.
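The queue recommendation above can be sketched with `asyncio.Queue`; `log`, `send_to_server`, and the `None` shutdown sentinel are illustrative names, not a real logging API:

```python
import asyncio

log_queue = asyncio.Queue()
sent = []

def log(message):
    # a regular function: callers don't need to become async
    log_queue.put_nowait(message)

async def send_to_server(message):
    sent.append(message)          # stands in for the real remote flush

async def log_consumer():
    # the one coroutine that drains the queue
    while True:
        message = await log_queue.get()
        if message is None:       # sentinel: shut down the consumer
            break
        await send_to_server(message)

async def main():
    consumer = asyncio.create_task(log_consumer())
    log("start")
    log("done")
    log_queue.put_nowait(None)
    await consumer

asyncio.run(main())
print(sent)
```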
An Actual Example
import asyncio
import httpx

async def main():
    async with httpx.AsyncClient() as client:
        response = await client.get('https://example.com')
        print(response.text)

asyncio.run(main())
Conclusion
- Asyncio is a great concurrency pattern for I/O-heavy applications.
- Not the easiest to begin with, but when things become complex, often easier than threading.
- Important pitfalls:
  - Don't mix with blocking I/O.
  - Don't turn every function into an async function.
About me 😃
- Kanishk Pachauri (@itsKanishkP)
- I speak Python, TypeScript, and Go
- I'm currently exploring CPython
- Upcoming Summer Intern @Google
- GSoC '22 @Python Software Foundation
- Maintainer @Dateparser, and actively contributing to some PSF projects
- Founder and CM @FOSSCU (fosscu.org)
Thanks
❤️
By Kanishk Pachauri