asyncio

in 10 minutes.

John Liu @

johnliu55tw @

What is asynchronous I/O?

  • Do something else while waiting for I/O
    • Because I/O is slow
  • Handle multiple I/O at a time
    • An HTTP server handles multiple clients
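
The two bullets above can be sketched in a few lines. This is a minimal illustration, not code from the talk: `slow_io` and `worker` are hypothetical names, and `asyncio.sleep` stands in for a real slow I/O operation.

```python
import asyncio


async def slow_io(name, delay):
    # asyncio.sleep stands in for a slow I/O operation (hypothetical workload).
    await asyncio.sleep(delay)
    return name


async def demo():
    events = []

    async def worker(name, delay):
        events.append('start ' + name)
        result = await slow_io(name, delay)
        events.append('done ' + result)

    # Both workers wait at the same time instead of one after another.
    await asyncio.gather(worker('a', 0.2), worker('b', 0.1))
    return events


if __name__ == '__main__':
    print(asyncio.run(demo()))
```

Worker `b` finishes first even though it started second, because neither wait blocks the other.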

How can we do that?

  • With multiple threads/processes
  • Programming language abstractions
    • JavaScript, Go, ...
  • OS multiplexers
    • select, poll, epoll, kqueue, ...
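
As a rough sketch of the OS multiplexer approach, Python's standard `selectors` module wraps select/poll/epoll/kqueue behind one interface. The helper `wait_for_readable` and the socket pairs below are assumptions made up for the illustration.

```python
import selectors
import socket


def wait_for_readable(socks, timeout=1.0):
    """Return the sockets from `socks` that have data ready to read."""
    # DefaultSelector picks the best mechanism available on this platform.
    with selectors.DefaultSelector() as sel:
        for s in socks:
            sel.register(s, selectors.EVENT_READ)
        return [key.fileobj for key, _ in sel.select(timeout)]


def demo():
    a_recv, a_send = socket.socketpair()
    b_recv, b_send = socket.socketpair()
    try:
        # Only the second pair receives data, so only b_recv becomes readable.
        b_send.sendall(b'hello')
        ready = wait_for_readable([a_recv, b_recv])
        return [s.recv(16) for s in ready]
    finally:
        for s in (a_recv, a_send, b_recv, b_send):
            s.close()


if __name__ == '__main__':
    print(demo())
```

One call watches several sockets at once and reports only those that are ready, which is the building block an event loop is based on.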

What is asyncio?

  • Standard library module for asynchronous I/O since Python 3.4
  • async/await syntax since Python 3.5
  • Consists of and defines...
    • Event loop
    • Task, Future and Coroutine
    • Task functions (Arranging tasks)
    • Transports and Protocols
    • ...
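
To make the terms above a little more concrete, here is a small sketch (assuming Python 3.7 or later) that touches a coroutine, a Task, a Future and the event loop. The names `add` and `demo` are invented for the example.

```python
import asyncio


async def add(a, b):
    # A coroutine function: calling add(1, 2) only creates a coroutine object;
    # nothing runs until the event loop drives it.
    await asyncio.sleep(0)
    return a + b


async def demo():
    loop = asyncio.get_running_loop()

    # A Task wraps a coroutine and schedules it on the event loop.
    task = asyncio.create_task(add(1, 2))

    # A Future is a placeholder for a result that is set later.
    future = loop.create_future()
    loop.call_soon(future.set_result, 'ready')

    return await task, await future


if __name__ == '__main__':
    print(asyncio.run(demo()))
```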

Huh? What?

  • Multi-threaded/multi-process programming is hard
  • You control the scheduling (sort of)
  • Looks like synchronous code
  • Avoids callback hell
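
The last two points can be illustrated by writing the same two-step wait in both styles. This is only a sketch: `fetch_with_callbacks` and `fetch_with_await` are hypothetical, and the short timers stand in for real I/O.

```python
import asyncio


def fetch_with_callbacks(loop, on_done):
    # Callback style: each step schedules the next one and passes state along.
    def step_one():
        loop.call_later(0.01, step_two, 'connected')

    def step_two(state):
        loop.call_later(0.01, on_done, state + ' -> fetched')

    loop.call_soon(step_one)


async def fetch_with_await():
    # Coroutine style: the same two waits, written top to bottom.
    await asyncio.sleep(0.01)
    state = 'connected'
    await asyncio.sleep(0.01)
    return state + ' -> fetched'


async def demo():
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    fetch_with_callbacks(loop, done.set_result)
    return await done, await fetch_with_await()


if __name__ == '__main__':
    print(asyncio.run(demo()))
```

Both versions produce the same result, but the coroutine reads like ordinary sequential code.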

Let's try something...

Fetching multiple tracks from the KKBOX Open API using several track IDs

  • A text file that contains track IDs, one per line
  • Fetch the corresponding track name and artist for each ID
  • Time the process for 20, 40 and 60 tracks
    • Synchronously and asynchronously

Fetch track information

 - Synchronously

import requests

# TOKEN is assumed to hold a valid KKBOX Open API access token.
def fetch_track(track_id):
    resp = requests.get(
            'https://api.kkbox.com/v1.1/tracks/' + track_id,
            params={'territory': 'TW'},
            headers={'Authorization': 'Bearer ' + TOKEN})
    return resp.json()

Fetch multiple tracks

 - Synchronously

def fetch_tracks_brief(track_ids):
    results = list()
    for track_id in track_ids:
        track_info = fetch_track(track_id)
        results.append((
            track_info['id'],
            track_info['name'],
            track_info['album']['artist']['name']))
    return results

Run and time it

 - Synchronously

import time

def main():
    with open('60_tracks', 'r') as f:
        track_ids = [track_id.strip() for track_id in f]

    start = time.time()
    results = fetch_tracks_brief(track_ids)
    end = time.time()

    for track_brief in results:
        print('{}    {}    {}'.format(*track_brief))
    print('Fetched {} tracks in {:.3f} seconds.'.format(
        len(results), end-start))

Time

 - Synchronously

Number of tracks    Time (sec)
20 tracks           1.97
40 tracks           3.94
60 tracks           6.23

Fetch track information

 - Asynchronously

import aiohttp

async def fetch_track(track_id):
    async with aiohttp.request('GET',
            'https://api.kkbox.com/v1.1/tracks/' + track_id,
            params={'territory': 'TW'},
            headers={'Authorization': 'Bearer ' + TOKEN}) as resp:
        return await resp.json()

 - Synchronously

def fetch_track(track_id):
    resp = requests.get(
            'https://api.kkbox.com/v1.1/tracks/' + track_id,
            params={'territory': 'TW'},
            headers={'Authorization': 'Bearer ' + TOKEN})
    return resp.json()

Fetch multiple tracks

 - Asynchronously

import asyncio

async def fetch_tracks_brief(track_ids):
    results = list()
    # Start all requests at once; results arrive in completion order.
    futures = asyncio.as_completed(
            [fetch_track(track_id) for track_id in track_ids])
    for future in futures:
        track_info = await future
        results.append((
            track_info['id'],
            track_info['name'],
            track_info['album']['artist']['name']))
    return results

 - Synchronously

def fetch_tracks_brief(track_ids):
    results = list()
    for track_id in track_ids:
        track_info = fetch_track(track_id)
        results.append((
            track_info['id'],
            track_info['name'],
            track_info['album']['artist']['name']))
    return results
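
One consequence of `asyncio.as_completed` is that results come back in completion order, not in the order of the track IDs. The sketch below contrasts it with `asyncio.gather`, which keeps input order. `fake_fetch` and its delays are made up and replace the real network call.

```python
import asyncio


async def fake_fetch(track_id, delay):
    # Stand-in for fetch_track(); the delay simulates network latency.
    await asyncio.sleep(delay)
    return track_id


async def demo():
    jobs = [('a', 0.15), ('b', 0.05), ('c', 0.10)]

    # as_completed yields awaitables in the order they finish.
    by_completion = []
    for future in asyncio.as_completed([fake_fetch(t, d) for t, d in jobs]):
        by_completion.append(await future)

    # gather returns results in the order the awaitables were passed in.
    by_input = await asyncio.gather(*(fake_fetch(t, d) for t, d in jobs))
    return by_completion, list(by_input)


if __name__ == '__main__':
    print(asyncio.run(demo()))
```

If the output order has to match the input file, `gather` is probably the simpler choice.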

Run and time it

 - Asynchronously

def main():
    with open('60_tracks', 'r') as f:
        track_ids = [track_id.strip() for track_id in f]
    loop = asyncio.get_event_loop()
    start = time.time()
    results = loop.run_until_complete(
            fetch_tracks_brief(track_ids))
    end = time.time()

    for track_brief in results:
        print('{}    {}    {}'.format(*track_brief))
    print('Fetched {} tracks in {:.3f} seconds.'.format(
        len(results), end-start))

 - Synchronously

def main():
    with open('60_tracks', 'r') as f:
        track_ids = [track_id.strip() for track_id in f]

    start = time.time()
    results = fetch_tracks_brief(track_ids)
    end = time.time()

    for track_brief in results:
        print('{}    {}    {}'.format(*track_brief))
    print('Fetched {} tracks in {:.3f} seconds.'.format(
        len(results), end-start))
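
On Python 3.7 and later, `asyncio.run()` can replace the explicit `get_event_loop()` / `run_until_complete()` pair. The following is a sketch under that assumption. The `fetch_tracks_brief` here is a stand-in that makes no network call, and `main` takes the track IDs as a parameter only to keep the example self-contained.

```python
import asyncio
import time


async def fetch_tracks_brief(track_ids):
    # Stand-in for the real coroutine; no network access here.
    await asyncio.sleep(0.01)
    return [(track_id, 'name', 'artist') for track_id in track_ids]


def main(track_ids):
    start = time.perf_counter()
    # asyncio.run() creates an event loop, runs the coroutine to completion,
    # and closes the loop afterwards.
    results = asyncio.run(fetch_tracks_brief(track_ids))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == '__main__':
    results, elapsed = main(['id1', 'id2'])
    print('Fetched {} tracks in {:.3f} seconds.'.format(len(results), elapsed))
```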

Time

 - Asynchronously

Number of tracks    Time (sec)
20 tracks           0.15
40 tracks           0.20
60 tracks           0.24
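
The shape of these timings can be reproduced without the KKBOX API. In the sketch below, `LATENCY` is a made-up per-request delay and `fake_request` is a placeholder. Sequential awaits add the delays up, while concurrent ones overlap them.

```python
import asyncio
import time

LATENCY = 0.05  # assumed per-request latency in seconds (made up for the demo)


async def fake_request():
    await asyncio.sleep(LATENCY)


async def sequential(n):
    # Each request starts only after the previous one has finished.
    for _ in range(n):
        await fake_request()


async def concurrent(n):
    # All requests are in flight at the same time.
    await asyncio.gather(*(fake_request() for _ in range(n)))


def timed(coro):
    start = time.perf_counter()
    asyncio.run(coro)
    return time.perf_counter() - start


if __name__ == '__main__':
    print('sequential: {:.3f} s'.format(timed(sequential(10))))
    print('concurrent: {:.3f} s'.format(timed(concurrent(10))))
```

The sequential time grows roughly linearly with the number of requests, while the concurrent time stays close to a single request's latency.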

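Comparison

Number of tracks    Sync (sec)    Async (sec)    Speedup
20 tracks           1.97          0.15           about 13x
40 tracks           3.94          0.20           about 20x
60 tracks           6.23          0.24           about 26x

With 3 lines of additional code!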

What we've missed...

  • Error handling
  • Transports and Protocols
  • Executor
  • Testing
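
As a starting point for two of the topics above, the sketch below runs a blocking function in the event loop's default executor and collects errors with `return_exceptions=True`. `blocking_fetch` and the `'bad'` track ID are hypothetical stand-ins for a blocking call such as `requests.get()`.

```python
import asyncio
import time


def blocking_fetch(track_id):
    # Stand-in for a blocking call such as requests.get(); hypothetical.
    time.sleep(0.05)
    if track_id == 'bad':
        raise ValueError('unknown track: ' + track_id)
    return track_id.upper()


async def fetch_all(track_ids):
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    futures = [loop.run_in_executor(None, blocking_fetch, t) for t in track_ids]
    # return_exceptions=True puts exceptions into the result list instead of
    # propagating the first one and losing the other results.
    return await asyncio.gather(*futures, return_exceptions=True)


if __name__ == '__main__':
    for result in asyncio.run(fetch_all(['a', 'bad', 'c'])):
        print(repr(result))
```

The caller can then inspect each entry with `isinstance(result, Exception)` and decide whether to retry, skip, or report it.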

More information

Documents and articles

Videos

@johnliu55tw

Python asyncio in 10 minutes

By Hsin-Wu Liu (John)
