Fun With Celery

Django Bulgaria Meetup

10.11.2016 @ Co-Share

Radoslav Georgiev

HackSoft, HackBulgaria

github.com/RadoRado

@Rado_g

Celery

  • Async task queue & task runner

  • Distributed message passing

  • Implemented in Python

  • Plays well with Django

  • Pluggable brokers & storages

Message Broker

  • Used for sending and receiving messages

  • A separate service, usually RabbitMQ

  • Also supports Redis, Amazon SQS, Zookeeper

  • The minimum thing you need to run Celery

Result Backend

  • Used for keeping task states & results

  • Can be Redis, an SQL database, the Django ORM & others.

  • Having a result backend is important if we want to do more complicated things (inspecting task states, chords).

Django (Celery client) -- some_task.delay() --> Broker
Broker <-- msg passing --> Celery workers
Celery workers -- task states & results --> Result backend (psql, via the Django ORM!)

When do we need Celery? (Django context)

When we don't want to block the HTTP response & make the user wait.

def some_view(request, *args, **kwargs):
    # Some view-related stuff
    send_notification_email(request.user)  # blocks until the email is sent
    # Some view-related stuff
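The difference can be sketched in pure Python — a background thread stands in for the Celery worker, and `send_notification_email` here is a stub that just sleeps (illustrative only, not Django/Celery code):

```python
import threading
import time

def send_notification_email(user):
    time.sleep(0.2)  # stand-in for a slow SMTP / API call

def blocking_view(user):
    send_notification_email(user)  # the HTTP response waits the full 0.2s
    return "200 OK"

def non_blocking_view(user):
    # With Celery this line becomes: send_notification_email.delay(user)
    threading.Thread(target=send_notification_email, args=(user,)).start()
    return "200 OK"  # returns immediately; the "worker" keeps going
```

`non_blocking_view` returns right away; the slow work finishes in the background, which is exactly the role a Celery worker plays.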
Setup time.

$ pip install django
$ pip install celery
$ sudo apt-get install rabbitmq-server
$ django-admin startproject funwithcelery

.
├── db.sqlite3
├── funwithcelery
│   ├── celery.py
│   ├── __init__.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
├── manage.py
├── requirements.txt
└── tasks
    ├── apps.py
    ├── __init__.py
    └── tasks.py

celery.py

from __future__ import absolute_import
import os
from celery import Celery


os.environ.setdefault('DJANGO_SETTINGS_MODULE',
                      'funwithcelery.settings')

app = Celery('funwithcelery')

app.config_from_object('django.conf:settings',
                       namespace='CELERY')
app.autodiscover_tasks()

funwithcelery/__init__.py

from __future__ import absolute_import

from .celery import app as celery_app

__all__ = ['celery_app']

funwithcelery/settings.py

# Rest of settings

CELERY_BROKER_URL = 'amqp://guest@localhost//'
CELERY_RESULT_BACKEND = 'django-db'
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_ACCEPT_CONTENT = ['json']

Django for result backend

pip install django-celery

The old & somewhat frustrating way.

# djcelery/models.py
# ALL_STATES is set.
TASK_STATE_CHOICES = zip(states.ALL_STATES,
                         states.ALL_STATES)
# ...
@python_2_unicode_compatible
class TaskMeta(models.Model):
    # ...
    status = models.CharField(
        _('state'),
        max_length=50,
        default=states.PENDING,
        choices=TASK_STATE_CHOICES,
    )
    # ...

djcelery

django-celery-results

pip install django-celery-results
INSTALLED_APPS += ['django_celery_results']
python manage.py migrate

Starting everything

$ celery -A funwithcelery worker --loglevel=info

# tasks/tasks.py

from celery import shared_task

@shared_task
def start_here(*args, **kwargs):
    print(args)
    print(kwargs)

    return 42

Starting everything

$ python manage.py shell
>>> from tasks.tasks import start_here
>>> result = start_here.delay()
>>> result
<AsyncResult: 3f20a834-c51d-46cd-9f2e-e60e2df46de9>
>>> result.status
'SUCCESS'
>>> result.result
42

Demos for calling tasks

Subtasks

Where the fun begins.

@shared_task
def sum(a, b):
    return a + b

>>> sum.delay(1, 2).get()
3
>>> sum.s()
tasks.tasks.sum()
>>> type(sum.s())
<class 'celery.canvas.Signature'>
>>> sum.s().delay(1, 2).get()
3

Subtasks

Where the fun begins.

sum.s()  # Is serializable
sum.s(1)  # Supports partial application
# Supports setting execution options
sum.s(1).set(countdown=1)
# Full example
sum.s(1).set(countdown=1).delay(2).get() == 3
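The partial-application behaviour of signatures is the same trick as `functools.partial` — a pure-Python analogy, not Celery API:

```python
from functools import partial

def add(a, b):
    return a + b

# sum.s(1) binds the first argument, just like partial:
bound = partial(add, 1)

print(bound(2))  # -> 3, as sum.s(1).delay(2).get() == 3
```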

Call t2 after t1 is finished

from celery import chain
from tasks.tasks import start_here, sum

tasks = chain(start_here.s(), sum.s(1, 2))
tasks()

chains

Call t2 after t1 is finished

TypeError: sum() takes 2 positional arguments but 3 were given

chains

Chain passes the result of t1 as the first argument to t2, the result of t2 to t3, and so on ...
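That semantics can be sketched eagerly in plain Python (`chain_eager` below is an illustration, not a Celery API): each link's result becomes the first positional argument of the next call.

```python
def add(a, b):
    return a + b

def chain_eager(*signatures):
    """Run (func, args) pairs in order; each result is prepended
    to the next signature's arguments, like celery.chain does."""
    func, args = signatures[0]
    result = func(*args)
    for func, args in signatures[1:]:
        result = func(result, *args)
    return result

# chain(sum.s(1, 2), sum.s(3), sum.s(4)) -> ((1 + 2) + 3) + 4
print(chain_eager((add, (1, 2)), (add, (3,)), (add, (4,))))  # -> 10
```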

Call t2 after t1 is finished

>>> tasks = chain(start_here.s(1, 2, 3, a=5), start_here.s())
>>> tasks()

chains

[INFO/MainProcess] Received task: tasks.tasks.start_here[2d2af763-1c5a-4e12-b3f8-4cb8b8159a4b]  
[WARNING/PoolWorker-2] (1, 2, 3)
[WARNING/PoolWorker-2] {'a': 5}
[INFO/MainProcess] Received task: tasks.tasks.start_here[c5c5140b-966e-47d6-920d-b13fe94943c6]  
[INFO/PoolWorker-2] Task tasks.tasks.start_here[2d2af763-1c5a-4e12-b3f8-4cb8b8159a4b] succeeded in 0.16101670899661258s: 42
[WARNING/PoolWorker-3] (42,)
[WARNING/PoolWorker-3] {}

Immutable signatures

>>> tasks = chain(start_here.s(1, 2, 3), sum.si(1, 2))
>>> tasks()
>>> tasks = chain(sum.s(1, 2), sum.s(3), sum.s(4), start_here.s())
>>> tasks()
>>> tasks = chain(sum.s(1, 2), sum.s(3), sum.s(4), start_here.si())
>>> tasks()
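What `.si()` changes can be mimicked in plain Python: an immutable signature does not receive the parent task's result (`run_link` is a hypothetical helper for illustration, not Celery code):

```python
def add(a, b):
    return a + b

def run_link(prev_result, func, args, immutable=False):
    """One chain link: a mutable signature (.s) gets the parent's
    result as its first argument; an immutable one (.si) ignores it."""
    if immutable or prev_result is None:
        return func(*args)
    return func(prev_result, *args)

parent = 42                                      # pretend start_here returned 42
print(run_link(parent, add, (1, 2), immutable=True))   # -> 3, the 42 is discarded
print(run_link(parent, add, (1,), immutable=False))    # -> 43, the 42 is passed in
```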

Calling tasks in parallel

groups

>>> from celery import group
>>> tasks = group(sleepy_task.s(1), 
                  sleepy_task.s(2), 
                  sleepy_task.s(3), 
                  sum.s(1, 2))
>>> res = tasks().get()
>>> res
[42, 42, 42, 3]
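Running independent tasks in parallel is what a thread pool gives you locally — a rough stand-in for a pool of workers (`sleepy_task` here is a stub that sleeps instead of doing work):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def sleepy_task(seconds):
    time.sleep(seconds / 10)  # scaled down for the demo
    return 42

def add(a, b):
    return a + b

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(sleepy_task, 1),
               pool.submit(sleepy_task, 2),
               pool.submit(sleepy_task, 3),
               pool.submit(add, 1, 2)]
    results = [f.result() for f in futures]

print(results)  # -> [42, 42, 42, 3], same order as submission
```

Like `group(...)().get()`, the results come back in submission order, not completion order.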

Group + chain = chord

chord(group, callback)

from celery import chord, group

tasks = group(sleepy_task.s(1),
              sleepy_task.s(2),
              sleepy_task.s(3),
              sum.s(1, 2))
ch = chord(tasks, start_here.s())
ch()
"""
[WARNING/PoolWorker-1] ([42, 42, 42, 3],)
[WARNING/PoolWorker-1] {}
"""

Group + chain = chord

>>> group_tasks = group(sum.s(1, 2), sum.s(3, 4), sum.s(11, 12))
>>> chain_tasks = chain(start_here.s(), start_here.s())
>>> ch = chord(group_tasks, chain_tasks)
>>> ch()
"""
[WARNING/PoolWorker-1] ([3, 7, 23],)
[WARNING/PoolWorker-1] {}
...
[WARNING/PoolWorker-4] (42,)
[WARNING/PoolWorker-4] {}
"""
  • chain

  • group

  • chord

  • map

  • starmap

  • chunks

Django Testing

"""
* No need for running RabbitMQ
* Don't do async (always eager)
* Propagate exceptions
"""
CELERY_BROKER_BACKEND = 'memory'
CELERY_ALWAYS_EAGER = True
CELERY_EAGER_PROPAGATES_EXCEPTIONS = True

Celery topics not covered

Examples

@shared_task(max_retries=settings.CELERY_TASK_MAX_RETRIES)
def prepare_for_grading(run_id):
    pending_task = get_pending_task(run_id)

    if pending_task is None:
        return "No tasks to run right now."

    preparator = PreparatorFactory.get(pending_task)
    test_runs = preparator.prepare()

    for data in test_runs:
        grade = grade_pending_run.s(**data).set(countdown=1)
        clean = clean_up_after_run.s()
        ch = chain(grade, clean)
        ch()

Examples

@shared_task(bind=True, max_retries=settings.CELERY_TASK_MAX_RETRIES)
def send_template_mail(self, 
                       template_name,
                       recipients,
                       context,
                       **kwargs):
    api_key = get_mandrill_api_key()
    client = mandrill.Mandrill(api_key)

    message = build_message(recipients, context)

    try:
        result = client.messages.send_template(template_name, 
                                               [], 
                                               message)
        return result
    except SoftTimeLimitExceeded as e:
        self.retry(exc=e)
    except mandrill.Error as e:
        self.retry(exc=e)
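What `self.retry(exc=e)` buys you can be sketched as a plain retry loop (illustrative only — real Celery re-queues the task with a countdown instead of looping in-process):

```python
class TransientError(Exception):
    """Stand-in for SoftTimeLimitExceeded / mandrill.Error."""

def retry_call(func, max_retries=3):
    # Re-run on known-transient errors; after max_retries
    # retries, re-raise -- roughly what self.retry gives you.
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_retries:
                raise

calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError
    return "sent"

print(retry_call(flaky))  # -> "sent" on the third attempt
```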

10x!
