Asynchronous Task Queues

INFO 153B/253B: Backend Web Architecture

Kay Ashaolu

First a step back

  • At this point you have learned most of what you need to build a real backend system that can take requests, process data, and return results
  • However, once we deploy our services to the real world, we run into challenges that do not show up in a safe development environment

What if?

  • We have experience now building a server that listens via HTTP to requests
  • We have the tooling that typically responds to those requests really quickly
  • However, what would happen if we wanted to add actions that either
    • take a long time to process or
    • can fail routinely?

Example: our quote service

  • Very fast since it uses a local dictionary (or a direct connection to a Postgres database) to store and retrieve quotes for each day
  • However, what if we wanted to find quotes from all across the internet?
  • We could imagine that it would take a long time to
    • retrieve all pages for a website,
    • figure out what is a quote, and
    • save it to the database
    • (this process could take minutes or even hours!)

Imagine waiting 1 hr for a page to load

  • This is not terribly far-fetched: we are in the world of big data
  • Data pipeline runs can easily take several hours or even days to complete
  • The HTTP connection from the client to your server will time out before the work is completed
  • What do you do?

Asynchronous Task Queues

  • Provides the ability to execute work asynchronously
  • Application submits a task to a queue for another process to pick up and do the work
  • Application can do other work while task is being completed
  • Application can be notified when task is complete

Quick note: Asynchronous vs Synchronous

  • Most of the code we have been writing so far has been synchronous: our applications run from beginning to end - the next line of code waits for the previous line of code to complete before moving on
  • Asynchronous code does not wait for the previous line to complete: the task is completed in parallel to the main application
  • Typically there is a "callback" or way for the main application to be notified when the task is done (if necessary)
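
The difference can be sketched with Python's standard library. This is only an illustration, not how a task queue is implemented: `slow_task` and `on_done` are hypothetical names, and a thread pool stands in for "somewhere else to run the work".

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_task(n):
    """Hypothetical stand-in for slow work: sleep briefly, then square n."""
    time.sleep(0.1)
    return n * n

events = []

def on_done(future):
    # Callback: invoked once the asynchronous task has finished.
    events.append(f"callback got {future.result()}")

# Synchronous: the next line cannot run until slow_task returns.
events.append(f"sync got {slow_task(3)}")

# Asynchronous: submit() returns a Future immediately.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_task, 4)
    future.add_done_callback(on_done)
    events.append("main keeps working")  # recorded while the task is still running
# Leaving the with-block waits for outstanding tasks to finish.

print(events)
```

On a typical run the "main keeps working" entry is recorded before the callback entry, because the main program did not wait for the task.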

For example: word count web crawler

  • When our server receives a word, it puts a message on the asynchronous task queue asking for the count for that word
  • A worker picks up the task from the task queue and does the work
  • Our server tells the user that their job was scheduled
  • When the task is done, the task writes the result in the database
  • Our application can read that data and present it to the user
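
The flow above can be sketched in a single process. Everything here is an assumption for illustration: an in-memory `queue.Queue` stands in for the task queue, a dictionary stands in for the database, a fixed string stands in for crawled pages, and `schedule`, `worker`, and `count_word` are hypothetical names.

```python
import queue
import threading
import uuid

task_queue = queue.Queue()  # stand-in for the asynchronous task queue
results_db = {}             # stand-in for the results database

def count_word(word, text):
    """Count whole-word, case-insensitive occurrences of word in text."""
    return text.lower().split().count(word.lower())

def worker():
    # Worker: pick up tasks until a None sentinel arrives.
    while True:
        job = task_queue.get()
        if job is None:
            break
        job_id, word, text = job
        results_db[job_id] = count_word(word, text)  # write the result

def schedule(word, text):
    # Server side: enqueue the job and return a job id right away.
    job_id = str(uuid.uuid4())
    task_queue.put((job_id, word, text))
    return job_id

t = threading.Thread(target=worker)
t.start()

job_id = schedule("queue", "A queue feeds the worker and the queue never blocks")
print("scheduled job", job_id)  # what the user is told immediately

task_queue.put(None)  # demo only: ask the worker to stop after draining
t.join()              # demo only: wait so the result can be read below
print("count:", results_db[job_id])
```

In a real deployment the worker runs in a separate process or container, and the application would look up the result later instead of joining the worker.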

Asynchronous Task Queue

Other benefits of asynchronous task queues

  • Isolation of unexpected results from external API calls. When you use an external API, you don't know if it's down or slow. Having your server directly request data from an external API makes it subject to the state of the external system

  • Better handling of errors from external APIs. Because work is being done in a separate isolated process, we can 1) retry the request later, 2) silently discard the task that we were trying to do, or a combination of both, without affecting our API speed and reliability
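
A minimal sketch of "retry later, then discard", assuming a hypothetical `ExternalAPIError` and a hypothetical helper `run_with_retries`. Task frameworks usually provide their own retry mechanisms; this only shows the idea.

```python
import time

class ExternalAPIError(Exception):
    """Hypothetical error raised when the external API is down or slow."""

def run_with_retries(task, max_retries=3, base_delay=0.01):
    """Run task(); retry with exponential backoff, then discard (return None)."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except ExternalAPIError:
            if attempt == max_retries:
                return None  # give up and silently discard the task
            time.sleep(base_delay * 2 ** attempt)  # wait longer each time

calls = {"n": 0}

def flaky_fetch():
    # Fails on the first two calls, succeeds on the third.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ExternalAPIError("service unavailable")
    return "quote of the day"

def always_down():
    raise ExternalAPIError("service unavailable")

first = run_with_retries(flaky_fetch)
second = run_with_retries(always_down, max_retries=2)
print(first, calls["n"], second)
```

Because this loop runs in the worker rather than in the HTTP server, the waiting between retries does not slow down any request.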

  • Better able to handle large numbers of requests. If we had a surge of requests, our HTTP server could go down. However, if we are simply scheduling messages to our asynchronous task queue, then the task queue can absorb the surge and the work is processed outside of our application server. This provides a better user experience and more resilience to errors

  • True decoupling of systems. If two systems communicate through an asynchronous task queue and one goes down, messages simply accumulate in the queue
  • Once the system is back up, it can resume processing those tasks
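
As a toy illustration of messages accumulating while the consuming system is down, the sketch below uses a `deque` as a stand-in for the queue service. The names `producer_send` and `consumer_drain` are hypothetical.

```python
from collections import deque

backlog = deque()  # stand-in for the queue service
processed = []

def producer_send(message):
    # System A only talks to the queue, never to system B directly.
    backlog.append(message)

def consumer_drain():
    # System B processes messages in arrival order once it is back up.
    while backlog:
        processed.append(backlog.popleft().upper())

# System B is down: messages pile up, but system A is unaffected.
for msg in ["job-1", "job-2", "job-3"]:
    producer_send(msg)
pending_while_down = len(backlog)

# System B comes back up and resumes processing.
consumer_drain()
print(pending_while_down, processed, len(backlog))
```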

Which technologies are used?

  • Typically there are two pieces of technology that are necessary to implement an asynchronous task queue:
    • A queue. A service that can save messages in order in different buckets
    • A task runner. A library that enables processes to pick up messages from a queue and execute a predefined piece of code (a function!) using the message as input

Quick note

  • Yet another application for the function!
  • Queue message contains everything necessary to execute a predefined function
    • name of function
    • input parameters
    • how to handle expected output
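
One way to picture such a message is as a small JSON document. The field names (`task`, `args`, `result_key`) and the `REGISTRY` lookup are assumptions made for this sketch, not the wire format of any particular library.

```python
import json

def word_count(text):
    """Predefined function that a worker knows how to run."""
    return len(text.split())

# Workers map function names found in messages to real functions.
REGISTRY = {"word_count": word_count}

def make_message(name, args, result_key):
    # Everything needed to run the task travels in the message.
    return json.dumps({"task": name, "args": args, "result_key": result_key})

def execute(raw_message, results):
    msg = json.loads(raw_message)
    func = REGISTRY[msg["task"]]                      # name of function
    results[msg["result_key"]] = func(*msg["args"])   # input parameters, output handling

results = {}
raw = make_message("word_count", ["to be or not to be"], "job-42")
execute(raw, results)
print(results)  # {'job-42': 6}
```

Only the function's name travels in the message, not its code, so the process that sends the message and the worker that runs it must both know the same function definitions.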

Introducing Celery

  • Celery is a Python framework (just like Flask is)
  • Celery is a framework for implementing Asynchronous Task Queues
  • Celery relies on two technologies
    • A message broker
    • (optional) A database to store results of workers

Word count from URL

version: "3.7"
services:
  job_broker:
    image: redis:latest
    container_name: job_broker

  job_db:
    image: postgres:latest
    container_name: job_db
    environment:
      POSTGRES_USER: dbc
      POSTGRES_PASSWORD: dbc
      POSTGRES_DB: celery
    volumes:
      - postgres-volume:/var/lib/postgresql/data

  job_worker_01:
    build: ./job_worker/.
    container_name: job_worker_01
    volumes:
      - ./job_tasks/job_tasks.py:/app/job_tasks.py
    environment:
      CELERY_BROKER_URL: redis://job_broker:6379
      CELERY_RESULT_BACKEND: db+postgresql://dbc:dbc@job_db:5432/celery
      NLTK_DATA: /nltk_data
    depends_on:
      - job_broker
      - job_db

  job_worker_02:
    build: ./job_worker/.
    container_name: job_worker_02
    volumes:
      - ./job_tasks/job_tasks.py:/app/job_tasks.py
    environment:
      CELERY_BROKER_URL: redis://job_broker:6379
      CELERY_RESULT_BACKEND: db+postgresql://dbc:dbc@job_db:5432/celery
      NLTK_DATA: /nltk_data
    depends_on:
      - job_broker
      - job_db

  job_manager:
    build: ./job_manager/
    image: job-manager-image
    volumes:
      - ./job_tasks/job_tasks.py:/app/job_tasks.py
    environment:
      CELERY_BROKER_URL: redis://job_broker:6379
      CELERY_RESULT_BACKEND: db+postgresql://dbc:dbc@job_db:5432/celery
    ports:
      - "5050:5000"
    depends_on:
      - job_broker
      - job_db
      - job_worker_01
      - job_worker_02

  job_viewer:
    image: mher/flower
    environment:
      - CELERY_BROKER_URL=redis://job_broker:6379
      - FLOWER_PORT=8888
    ports:
      - "8888:8888"
    depends_on:
      - job_broker

volumes:
  postgres-volume: {}

Questions?