An Introduction to Asynchronous Python

Josh Finnie

  • Started at Indigo in May
  • I am working on the Cerberus Squad
  • Have been coding Python for 10+ years
  • Current obsession is Fountain Pens!

Outline

  • Let's talk about the Global Interpreter Lock
  • Live asynchronous coding!
  • What is Asyncio and why do I recommend it?
  • Asyncio Glossary of terms
  • Drawbacks of Asyncio and async programming

Asynchronous Python

Python - the GIL

What is the GIL

  • The Python Global Interpreter Lock, or GIL, is a mutex that allows only one thread to hold control of the Python interpreter at a time.
  • It's a detail of CPython, the reference implementation, and has been around since the beginning; it's not part of the Python language specification.
  • It protects the internals of the Python interpreter from concurrent access and modification by multiple threads.

https://realpython.com/python-gil/
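A minimal sketch of the GIL in action (assuming standard CPython; function names are mine): a pure-CPU countdown takes roughly as long in two threads as it does serially, because only one thread can execute Python bytecode at a time.

```python
import threading
import time

def count_down(n):
    # Pure CPU work: the GIL lets only one thread run Python bytecode at a time
    while n > 0:
        n -= 1

def serial():
    s = time.perf_counter()
    count_down(1_000_000)
    count_down(1_000_000)
    return time.perf_counter() - s

def threaded():
    s = time.perf_counter()
    threads = [threading.Thread(target=count_down, args=(1_000_000,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - s

if __name__ == '__main__':
    # On CPython the two numbers come out roughly equal: the threads
    # take turns holding the GIL instead of running in parallel
    print(f"serial:   {serial():0.2f}s")
    print(f"threaded: {threaded():0.2f}s")
```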

Does this block us?

  • Yes and no...
  • With Threading and Asyncio, only one thread executes Python bytecode at a time.
  • They just cheat a bit.

So there's hope!

Implementations to
"get around" the GIL

The GIL Busters

  • Threading
  • Asyncio
  • Multiprocessing

LIVE CODING!

The Menu

  • Normal IO Bound Function
  • Async IO Bound Function
  • Threaded IO Bound Function
  • Multi-Processed IO Bound Function
  • Normal CPU Bound Function
  • Threaded CPU Bound Function
  • Multi-Processed CPU Bound Function
  • Asyncio CPU Bound Function

Default

#!/usr/bin/env python3
import time

times = 0

def my_func():
    global times
    while times < 10:
        times += 1
        time.sleep(1)
        print(f'Number of times: {times}')


if __name__ == '__main__':
    s = time.perf_counter()
    my_func()
    elapsed = time.perf_counter() - s
    print(f"{__file__} executed in {elapsed:0.2f} seconds.")

Threading

#!/usr/bin/env python3
import threading
import time

times = 0

def my_func():
    # same body as the default example
    global times
    while times < 10:
        times += 1
        time.sleep(1)
        print(f'Number of times: {times}')


if __name__ == '__main__':
    s = time.perf_counter()
    thread1 = threading.Thread(target=my_func, name='Thread 1')
    thread2 = threading.Thread(target=my_func, name='Thread 2')

    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
    elapsed = time.perf_counter() - s
    print(f"{__file__} executed in {elapsed:0.2f} seconds.")

Multiprocessing

#!/usr/bin/env python3
import multiprocessing as mp
import time

times = 0

def my_func():
    global times
    while times < 10:
        times += 1
        time.sleep(1)
        print(f'Number of times: {times}\n    Added by: {__name__}')
    
    
if __name__ == '__main__':
    s = time.perf_counter()
    mp1 = mp.Process(target=my_func)
    mp2 = mp.Process(target=my_func)
    
    mp1.start()
    mp2.start()
    mp1.join()
    mp2.join()
    elapsed = time.perf_counter() - s
    print(f"{__file__} executed in {elapsed:0.2f} seconds.")

Asyncio

#!/usr/bin/env python3
import asyncio
import time

times = 0

async def my_func():
    global times
    name = __name__
    while times < 10:
        times += 1
        await asyncio.sleep(1)
        print(f'Number of times: {times}\n    Added by: {name}')


async def main():
    await asyncio.gather(my_func(), my_func())

if __name__ == '__main__':
    s = time.perf_counter()
    asyncio.run(main())
    elapsed = time.perf_counter() - s
    print(f"{__file__} executed in {elapsed:0.2f} seconds.")

Results

$ python3 default.py
default.py executed in 10.03 seconds.

$ python3 threading.py
threading.py executed in 5.03 seconds.

$ python3 mp.py
mp.py executed in 10.09 seconds. *

$ python3 async.py
async.py executed in 5.01 seconds.

Note: the multiprocessing version looks like it took just as long as the plain version. That's because each process gets its own copy of the `times` counter (processes don't share memory by default), so each process counted to 10 on its own: it actually did double the work. Sharing state across processes is harder to write correctly, especially for read/write workloads.
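The menu above also lists CPU-bound variants. Here's a minimal sketch of the multiprocessing one (the function name `busy_sum` is mine): for work that burns CPU instead of sleeping, a process pool can genuinely beat a single thread, because each worker process has its own interpreter and its own GIL.

```python
#!/usr/bin/env python3
import multiprocessing as mp
import time

def busy_sum(n):
    # CPU-bound: no sleeping, just arithmetic
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    s = time.perf_counter()
    with mp.Pool(processes=2) as pool:
        # Each worker process runs in its own interpreter with its own GIL
        results = pool.map(busy_sum, [2_000_000, 2_000_000])
    elapsed = time.perf_counter() - s
    print(f"{__file__} executed in {elapsed:0.2f} seconds.")
```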

Asyncio

This is a wonderful tool!

History

  • Asyncio was introduced in Python 3.4
  • Really hit its stride in Python 3.7
  • Introduced the async/await keywords to the language
  • It is the best practice for asynchronous Python

 

"Use asyncio first, threading if you have to!"

 

 

The Event Loop

https://medium.com/@gauravbaliyan/a-good-use-of-asyncio-in-python-and-django-8aa7bc401e5f
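A rough sketch of what the event-loop diagram shows (task names and delays are mine): the loop runs one coroutine until it awaits, then resumes whichever task is ready next, so the waits interleave.

```python
import asyncio

async def worker(name, delay, log):
    # Each await hands control back to the event loop,
    # which resumes whichever task is ready next
    for step in range(2):
        await asyncio.sleep(delay)
        log.append(f"{name}-{step}")

async def main():
    log = []
    # gather schedules both workers on the same loop; their waits overlap
    await asyncio.gather(worker("fast", 0.1, log), worker("slow", 0.25, log))
    return log

if __name__ == '__main__':
    print(asyncio.run(main()))
```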

Reasons to use Asyncio

To Start Async Programming

  • We want to move to asynchronous programming
    • task-based development
    • 3rd party async libraries

Partner Tasks

async def _fetch():
    for _ in range(3):
        try:
            return await PartnerApi.ping()
        except Exception as e:
            logger.warning(f"Error: {e} Retrying...")
            await asyncio.sleep(0.5)
    return Exception("Failed multiple times. See warning logs for reason")


def ingest():
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    response = loop.run_until_complete(_fetch())

    if isinstance(response, Exception):
        raise response

    return "Done"

To Remove Complexity

  • We want to move away from the more complex threading/multiprocessing code.
  • There is less to manage, which leads to fewer errors.
  • Asyncio is lightweight
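"Lightweight" is measurable. A sketch (the task count and names are mine): spawning 10,000 concurrent tasks is cheap with asyncio, while 10,000 OS threads would not be.

```python
import asyncio

async def tick(i):
    # Simulated IO wait; all 10,000 of these overlap on one thread
    await asyncio.sleep(0.01)
    return i

async def main():
    # Tasks are plain Python objects, not OS threads, so this is cheap
    results = await asyncio.gather(*(tick(i) for i in range(10_000)))
    return len(results)

if __name__ == '__main__':
    print(asyncio.run(main()))
```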

async_task

def async_task(func: Callable):
    """
    Decorate an asychronous method to be used with celery.
    Requires the use of `start_background_event_loop`.
    Converts the method to a synchronous function that dispatches
    the asynchronous coroutine in an event loop running in a different thread.
    See: https://docs.python.org/3/library/asyncio-task.html#asyncio.run_coroutine_threadsafe
    Waits on the result and reraise any exceptions.
    """
    @wraps(func)
    def wrapper(*args: Any, **kwargs: Any):
        loop = local.loop
        # Lower level concurrent programming object - concurrent.futures
        # https://docs.python.org/3/library/concurrent.futures.html
        future = asyncio.run_coroutine_threadsafe(
          func(*args, **kwargs),
          loop=loop
        )

        return future.result()

    wrapper._wrapped = func
    return wrapper

To Use Non-Blocking IO

  • We want to stop blocking programming on IO.
  • As we heard in Ben's talk, Internet calls and database calls can (and will) be slow.
  • We don't necessarily have to block our entire program waiting for IO calls.

Partner API

import aiohttp

class PartnerClient:
    async def get(self, path, headers=None):
        async with aiohttp.ClientSession(headers=self._headers()) as session:
            return await session.get(path, headers=headers)


async def ping(cls) -> bool:
    response = await PartnerClient().get(BASE_URL.replace("v1", "ping"))

    if response.status != 200:
        logger.warning(
            f'Partner returned code {response.status}, '
            f'content {await response.content.read()}'
        )
        return False

    return True

https://pypi.org/project/aiohttp/

Asyncio

Glossary

asyncio.run()

import asyncio

async def add(a, b):
  return a + b

print(asyncio.run(add(1, 2)))  # 3

https://docs.python.org/3/library/asyncio-runner.html#asyncio.run

asyncio.get_event_loop()

import asyncio

async def add(a, b):
  return a + b

# Reference to the async loop
# (deprecated in newer Python versions when no loop is running;
# prefer asyncio.run() or asyncio.Runner)
loop = asyncio.get_event_loop()

# Can use it multiple times
result = loop.run_until_complete(add(3, 4))
result2 = loop.run_until_complete(add(5, 6))

https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.get_event_loop

asyncio.Runner()

import asyncio

async def main():
    await asyncio.sleep(1)
    print('hello')

with asyncio.Runner() as runner:
    # Running functions in a context manager gives you more control
    print(runner.get_loop())
    runner.run(main())

https://docs.python.org/3/library/asyncio-runner.html#asyncio.Runner

asyncio.gather()

import asyncio

async def add(a, b):
  return a + b

async def multi_add(a, b, c):
  # Runs these concurrently and collects the results
  return await asyncio.gather(
    add(a, b),
    add(b, c),
    add(a, c),
  )

asyncio.run(multi_add(1, 3, 5))

https://docs.python.org/3/library/asyncio-task.html#asyncio.gather

asyncio.sleep()

import asyncio

async def main():
    print('hello')
    
    # Unlike time.sleep(1), this can be awaited: it yields to the event loop!
    await asyncio.sleep(1)
    
    print('world')

asyncio.run(main())

https://docs.python.org/3/library/asyncio-task.html#asyncio.sleep

asyncio.wait()

import asyncio

async def main():
    # gives us insight into long-running tasks!
    # (asyncio.wait expects tasks, not bare coroutines, in newer Pythons)
    task = asyncio.create_task(long_running_task())
    done, pending = await asyncio.wait({task})

    # create many copies of your task
    tasks = [asyncio.create_task(long_running_task(i)) for i in range(10)]
    # wait for the first task to complete
    done, pending = await asyncio.wait(
        tasks,
        return_when=asyncio.FIRST_COMPLETED
    )
    # wait for the first task to fail
    done, pending = await asyncio.wait(
        tasks,
        return_when=asyncio.FIRST_EXCEPTION
    )

https://docs.python.org/3/library/asyncio-task.html#asyncio.wait

asyncio.timeout()

import asyncio

async def main():
    # will wait at most 10 seconds!
    async with asyncio.timeout(10):
        await long_running_task()

https://docs.python.org/3/library/asyncio-task.html#asyncio.timeout

asyncio.wait_for()

import asyncio

async def main():
    try:
        await asyncio.wait_for(long_running_task(), timeout=10)
    except TimeoutError:
        # Can now do something else!
        print('timeout!')

https://docs.python.org/3/library/asyncio-task.html#asyncio.wait_for

asyncio Gotchas

  • Complicates code
  • Hard to debug when in the event loop
  • Accidentally running blocking code inside the event loop stalls every task
  • Pretty steep learning curve
  • Race conditions can pop up
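One gotcha from the list above is easy to demonstrate (function names below are mine): blocking code like time.sleep inside a coroutine freezes the whole event loop, while await asyncio.sleep lets the waits overlap.

```python
import asyncio
import time

async def blocking_task():
    # BUG: time.sleep blocks the whole event loop; no other task runs meanwhile
    time.sleep(0.2)

async def polite_task():
    # await yields control, so other tasks run during the wait
    await asyncio.sleep(0.2)

async def main():
    s = time.perf_counter()
    await asyncio.gather(*(blocking_task() for _ in range(3)))
    blocked = time.perf_counter() - s  # ~0.6s: the sleeps ran back to back

    s = time.perf_counter()
    await asyncio.gather(*(polite_task() for _ in range(3)))
    polite = time.perf_counter() - s  # ~0.2s: the sleeps overlapped
    return blocked, polite

if __name__ == '__main__':
    blocked, polite = asyncio.run(main())
    print(f"blocking: {blocked:0.2f}s, polite: {polite:0.2f}s")
```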

Resources

  • https://superfastpython.com/python-asyncio/
  • https://realpython.com/async-io-python/
  • https://www.linkedin.com/learning/python-parallel-and-concurrent-programming-part-1/learn-parallel-programming-basics
  • https://www.bmc.com/blogs/asynchronous-programming/