Celery Basics

Ivaylo Bachvarov

ivaylo@hacksoft.io

  • Python Library!
  • Used to distribute work across threads or machines.
  • It is simple!
  • But very powerful.
  • Flexible. (with a lot of moving parts)

Typical Architecture

Producer -> Broker with Queue -> Consumer

Example Application

from celery import Celery
import requests

app = Celery('tasks')

@app.task
def get_air_quality(country):
    response = requests.get('https://maps.sensor.community/data/v2/data.dust.min.json')

    values = []
    for row in response.json():
        if row['location']['country'] == country:
            values += [datavalue['value'] for datavalue in row['sensordatavalues']]

    print(values)

$ pip install celery requests
$ celery -A tasks worker --loglevel=INFO --concurrency=4
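
The filtering loop inside the task can be tried without a broker, a worker, or network access by pulling it into a plain function. This is only a minimal sketch: `extract_values` is a hypothetical helper name, and the sample rows are made-up data that merely mimic the fields the task reads.

```python
def extract_values(rows, country):
    """Collect every sensor value reported from the given country."""
    values = []
    for row in rows:
        if row['location']['country'] == country:
            values += [datavalue['value'] for datavalue in row['sensordatavalues']]
    return values


# Made-up rows shaped like the fields the task reads; not real sensor data.
sample_rows = [
    {'location': {'country': 'BG'}, 'sensordatavalues': [{'value': '12.5'}, {'value': '7.1'}]},
    {'location': {'country': 'DE'}, 'sensordatavalues': [{'value': '3.0'}]},
    {'location': {'country': 'BG'}, 'sensordatavalues': [{'value': '9.9'}]},
]

bg_values = extract_values(sample_rows, 'BG')
print(bg_values)
```

Keeping the loop in a plain function like this would let the Celery task stay a thin wrapper around the HTTP request.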

Advanced Architecture

Producer -> Broker with Queue -> Consumer, Consumer, Consumer
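
The shape of this architecture, one producer and several consumers sharing a queue, can be imitated in a single process with the standard library. This is an analogy, not Celery itself: `queue.Queue` stands in for the broker, threads stand in for worker processes, and all names are hypothetical.

```python
import queue
import threading

broker = queue.Queue()          # stands in for the broker's queue
results = []
results_lock = threading.Lock()
STOP = None                     # sentinel telling a consumer to exit


def consumer(name):
    """Pull messages until the stop sentinel arrives."""
    while True:
        message = broker.get()
        if message is STOP:
            break
        with results_lock:
            results.append((name, message.upper()))


consumers = [threading.Thread(target=consumer, args=(f'consumer-{i}',)) for i in range(3)]
for thread in consumers:
    thread.start()

# The producer only puts messages on the queue; it never calls a consumer directly.
for country in ['bg', 'de', 'gb', 'fr']:
    broker.put(country)
for _ in consumers:
    broker.put(STOP)            # one sentinel per consumer

for thread in consumers:
    thread.join()

processed = sorted(message for _, message in results)
print(processed)
```

Which consumer handles which message is not deterministic, which is why the sketch sorts the output before printing it.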

Serialization

Producer -> Broker with Queue -> Consumer, Consumer, Consumer
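
Because producer and consumers are separate processes, every task call has to travel through the broker as bytes. The sketch below shows the idea with the standard `json` module (JSON is Celery's default serializer since version 4). The message layout is a simplified, hypothetical one; Celery's real message protocol carries more fields.

```python
import json

# Simplified, hypothetical message layout; not Celery's actual wire format.
message = {'task': 'tasks.get_air_quality', 'args': ['BG'], 'kwargs': {}}

wire_bytes = json.dumps(message).encode('utf-8')    # what a producer would send
received = json.loads(wire_bytes.decode('utf-8'))   # what a consumer would decode

# JSON has no tuple type, so tuples come back as lists after a round trip.
round_tripped = json.loads(json.dumps({'args': ('BG',)}))
print(received, round_tripped)
```

The practical consequence is that task arguments and return values should be limited to types the chosen serializer can represent.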

Result Backends

Advanced Architecture
with Result Backend

Producer -> Broker with Queue -> Consumer -> Result Backend

from celery import Celery
import requests

app = Celery('tasks', backend='db+sqlite:///db.sqlite3')

@app.task
def get_air_quality(country):
    response = requests.get('https://maps.sensor.community/data/v2/data.dust.min.json')

    values = []
    for row in response.json():
        if row['location']['country'] == country:
            values += [datavalue['value'] for datavalue in row['sensordatavalues']]

    return values

>>> from tasks import get_air_quality
>>> result = get_air_quality.delay('BG')
>>> result
<AsyncResult: a530f615-9df0-42b6-b53c-3fe37d196add>
>>> result.status
'PENDING'
>>> result.status
'SUCCESS'
>>> result.result
[1, 2, 3, 4, 5]

Task signatures

>>> get_air_quality.s()  # This is serializable
tasks.get_air_quality()
>>> get_air_quality.s('BG')  # Supports task arguments
tasks.get_air_quality('BG')
>>> get_air_quality.s('BG').delay()  # Can be delayed
<AsyncResult: b293ed39-04c5-4d95-bdb5-0b66a9187122>

Chain, Chord and Group

from celery import chain

@app.task
def send_result_to_email(result):
    # Placeholder: receives the return value of the previous task in the chain
    pass

# Send the result via email
chain(get_air_quality.s('BG'), send_result_to_email.s()).delay()

Group + chain = chord

from celery import group, chord

# Send tasks in parallel and collect their results
tasks = group(get_air_quality.s('BG'),
              get_air_quality.s('EN'),
              get_air_quality.s('GB'))

ch = chord(tasks, send_result_to_email.s())
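
The data flow of a chain and a chord can be imitated in-process with the standard library. This is an analogy under stated assumptions, not how Celery executes them: `fetch`, `count` and `count_all` are hypothetical stand-ins for the tasks, and a thread pool stands in for the workers.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(country):
    # Stand-in for get_air_quality; returns made-up values instead of calling the API.
    return [len(country), len(country) + 1]


def count(values):
    # Stand-in for a chained callback: receives one task's return value.
    return len(values)


def count_all(results):
    # Stand-in for a chord callback: receives the list of all group results.
    return sum(len(values) for values in results)


# Chain: the first step's return value becomes the next step's first argument.
chained = count(fetch('BG'))

# Chord: run the group in parallel, then call the callback once with
# every result, in the order the group was declared.
with ThreadPoolExecutor(max_workers=3) as pool:
    header_results = list(pool.map(fetch, ['BG', 'EN', 'GB']))
chord_result = count_all(header_results)
print(chained, chord_result)
```

In Celery itself, building a `chain`, `group` or `chord` only creates a signature; nothing runs until it is dispatched, for example with `.delay()` or `.apply_async()`, and a chord needs a result backend to collect the group's results.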

Retry with celery

from celery import Celery
import requests

app = Celery('tasks', backend='db+sqlite:///db.sqlite3')

@app.task(bind=True)
def get_air_quality(self, country):
    try:
        response = requests.get('https://maps.sensor.community/data/v2/data.dust.min.json')
    except requests.exceptions.RequestException as exc:
        raise self.retry(exc=exc, countdown=60)

    values = []
    for row in response.json():
        if row['location']['country'] == country:
            values += [datavalue['value'] for datavalue in row['sensordatavalues']]

    return values
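
The retry behaviour can be pictured with a small in-process loop. This sketch is not Celery's implementation: Celery re-queues the task through the broker instead of sleeping inside the worker. `run_with_retries` and `flaky_fetch` are hypothetical names, and the limit of three retries mirrors Celery's default `max_retries`.

```python
def run_with_retries(func, max_retries=3, countdown=60, sleep=lambda seconds: None):
    """Call func; on failure wait `countdown` seconds and retry, up to max_retries times."""
    attempt = 0
    while True:
        try:
            return func()
        except ConnectionError:
            if attempt >= max_retries:
                raise                 # give up and surface the last error
            attempt += 1
            sleep(countdown)


calls = {'count': 0}


def flaky_fetch():
    # Hypothetical stand-in for the HTTP request: fails twice, then succeeds.
    calls['count'] += 1
    if calls['count'] < 3:
        raise ConnectionError('temporary failure')
    return 'ok'


waits = []
# Record the waits instead of actually sleeping so the sketch runs instantly.
outcome = run_with_retries(flaky_fetch, sleep=waits.append)
print(outcome, calls['count'], waits)
```

Note that `requests.get` raises only on connection-level problems; an HTTP error status would need `response.raise_for_status()` inside the `try` block to trigger a retry.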

When do we use it?

  • With Django 💓
  • To offload heavy work from the web workers.
  • To communicate with 3rd party services.
  • To send emails!
  • To do I/O-heavy operations.

Priorities

>>> get_air_quality.apply_async(priority=2, args=['BG'])
>>> get_air_quality.apply_async(priority=4, args=['BG'])
>>> get_air_quality.apply_async(priority=2, args=['BG'])
>>> get_air_quality.apply_async(priority=10, args=['BG'])
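
How these four calls would be ordered can be sketched with a priority queue from the standard library. Which number wins depends on the broker (RabbitMQ delivers larger numbers first, while Celery's Redis transport treats 0 as most urgent), so the "larger number first" rule below is an assumption, as are the `submit` helper and the labels.

```python
import heapq
import itertools

counter = itertools.count()   # tie-breaker so equal priorities keep submission order
pending = []


def submit(priority, args):
    # heapq is a min-heap, so negate the priority to pop the largest number first.
    heapq.heappush(pending, (-priority, next(counter), args))


submit(2, ['first'])
submit(4, ['second'])
submit(2, ['third'])
submit(10, ['fourth'])

delivery_order = [heapq.heappop(pending)[2][0] for _ in range(len(pending))]
print(delivery_order)
```

Priorities only reorder messages that are still waiting in the queue; a task already picked up by a worker is not interrupted.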

Celery Beat

from celery import Celery
from celery.schedules import crontab
import requests

app = Celery('tasks', backend='db+sqlite:///db.sqlite3')

@app.task(bind=True)
def get_air_quality(self, country):
    try:
        response = requests.get('https://maps.sensor.community/data/v2/data.dust.min.json')
    except requests.exceptions.RequestException as exc:
        raise self.retry(exc=exc, countdown=60)

    values = []
    for row in response.json():
        if row['location']['country'] == country:
            values += [datavalue['value'] for datavalue in row['sensordatavalues']]

    return values

app.conf.beat_schedule = {
    # Executes every Monday at 7:30
    'get_air_quality': {
        'task': 'tasks.get_air_quality',
        'schedule': crontab(hour=7, minute=30, day_of_week=1),
        'args': ('BG',),  # one-element tuple; ('BG') would be just the string 'BG'
    },
}
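
What `crontab(hour=7, minute=30, day_of_week=1)` selects can be checked with a few lines of standard-library code. This is a sketch of the matching rule only, not of Celery's scheduler; `matches_schedule` is a hypothetical helper.

```python
from datetime import datetime


def matches_schedule(moment, hour=7, minute=30, day_of_week=1):
    """True when `moment` falls on the scheduled weekday, hour and minute."""
    # crontab counts Sunday as 0 and Monday as 1;
    # isoweekday() counts Monday as 1 and Sunday as 7.
    cron_weekday = moment.isoweekday() % 7
    return (cron_weekday, moment.hour, moment.minute) == (day_of_week, hour, minute)


monday = matches_schedule(datetime(2024, 1, 1, 7, 30))    # 1 January 2024 was a Monday
tuesday = matches_schedule(datetime(2024, 1, 2, 7, 30))
late = matches_schedule(datetime(2024, 1, 1, 7, 31))
print(monday, tuesday, late)
```

Celery evaluates the schedule in its configured timezone, which is UTC unless the `timezone` setting says otherwise.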

Celery Beat

$ celery -A tasks beat

Advanced Architecture
with Celery Beat

Producer, Beat -> Broker with Queue -> Consumer

QnA

By Hack Bulgaria

Just a deck for a dev.bg talk.
