Celery

Get your task together!

By Sep Ehr

https://github.com/seperman

http://zepworks.com

Why Distributed Task Queues?

  • Offload long jobs to background processes
    example: video conversion
  • Offload large numbers of [small] jobs to background processes
    example: commenting system
  • Keep track of jobs, monitor, auto-restart
  • Schedule jobs
    Instead of cron jobs

Message Queue vs. Task Queue

A message queue provides the basic functionality of passing, holding, and delivering messages.

Example: Redis, RabbitMQ

A task queue manages work to be done and is considered a type of message queue.

Example: Celery
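
To make the distinction concrete, here is a minimal sketch using only the Python standard library. `queue.Queue` stands in for the broker (the role Redis or RabbitMQ plays), and the hypothetical `TaskQueue` class is the kind of layer a tool like Celery adds on top: it turns a function call into a message, and a worker turns the message back into a function call. None of these names come from Celery or RQ.

```python
import queue
import threading
import uuid

# Message queue: only passes, holds, and delivers opaque messages.
broker = queue.Queue()


class TaskQueue:
    """Hypothetical task layer on top of a message queue."""

    def __init__(self, broker):
        self.broker = broker
        self.registry = {}   # task name -> function
        self.results = {}    # job id -> return value

    def task(self, func):
        # Register the function so a worker can look it up by name.
        self.registry[func.__name__] = func
        return func

    def enqueue(self, func, *args):
        # A job is just a message describing the work to be done.
        job_id = str(uuid.uuid4())
        self.broker.put({"id": job_id, "task": func.__name__, "args": args})
        return job_id

    def work(self):
        # Worker loop: take a message, run the task, store the result.
        while True:
            message = self.broker.get()
            if message is None:   # sentinel: stop the worker
                break
            func = self.registry[message["task"]]
            self.results[message["id"]] = func(*message["args"])


tasks = TaskQueue(broker)


@tasks.task
def add(x, y):
    return x + y


worker = threading.Thread(target=tasks.work)
worker.start()
job_id = tasks.enqueue(add, 4, 4)
broker.put(None)   # ask the worker to stop after the queued job
worker.join()
print(tasks.results[job_id])   # 8
```

The broker never knows what a "task" is; only the registry and the worker loop give the messages meaning.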

Distributed Task Queues

in Python

Distributed Task Queues in Python

Solution              Stars on GitHub   Downloads/mo
Celery                4600              400,000
RQ (Redis Queue)      2600              40,000
  ⤿ Django RQ         428               13,000
Huey                  824               3,000
MrQ (Mr. Q)           340               5,000
Taskmaster            346               1,000

Popularity

Distributed Task Queues in Python

Solution              Used by
Celery                Instagram, Mozilla, Truecar
RQ (Redis Queue)      ?
  ⤿ Django RQ         ?
Huey                  ?
MrQ (Mr. Q)           Pricing Assistant (creator)
Taskmaster            Disqus (creator)

Who uses them? Everyone.

Celery Architecture

Image from parallel programming in Python book

RQ Architecture

Client ⥂ Redis ⥄ Worker

Celery - Example

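import time
from celery import Celery

# result.ready() needs a result backend; 'rpc://' returns results via AMQP
app = Celery('tasks', broker='amqp://guest@localhost//', backend='rpc://')

@app.task
def add(x, y):
    time.sleep(5 * 60)  # simulate a long-running job
    return x + y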

------------------

>>> from tasks import add
>>> result = add.delay(4, 4)
>>> result.ready()
False
# 5 minutes later:
>>> result.ready()
True

RQ - Example

import time

from redis import Redis
from rq import Queue
from somewhere import count_words_at_url

# Tell RQ what Redis connection to use
redis_conn = Redis()
# no args implies the default queue
q = Queue(connection=redis_conn)

# Delay execution of count_words_at_url('http://nvie.com')
job = q.enqueue(count_words_at_url, 'http://nvie.com')
print(job.result)   # => None

# Now, wait a while, until the worker is finished
time.sleep(2)
print(job.result)   # => 889
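
A fixed `time.sleep(2)` only works if the job happens to finish within two seconds. A more robust pattern is to poll with a timeout. The sketch below uses a hypothetical `wait_for` helper and a `FakeJob` stand-in so that it runs without Redis; with a real RQ job the predicate would check `job.result`, and with Celery it would call `result.ready()`.

```python
import threading
import time


def wait_for(is_done, timeout=10.0, interval=0.05):
    """Poll is_done() until it returns True or the timeout expires.

    Hypothetical helper, not part of RQ or Celery.
    Returns True if the job finished in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_done():
            return True
        time.sleep(interval)
    return is_done()   # one last check at the deadline


class FakeJob:
    """Stand-in for a queued job: result stays None until the work is done."""

    def __init__(self):
        self.result = None

    def run(self):
        time.sleep(0.2)      # pretend to do some work
        self.result = 889


job = FakeJob()
threading.Thread(target=job.run).start()

finished = wait_for(lambda: job.result is not None, timeout=5.0)
print(finished, job.result)
```

Polling keeps the caller responsive to early completion and gives it a clear failure path when the worker is slow or down.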

Monitoring

  • Celery - Flower
  • RQ - Dashboard

Celery Monitoring - Flower

RQ Monitoring - RQ Dashboard

RQ Dashboard

MRQ Monitoring - MRQ Dashboard

MRQ dashboard

Celery Example

import logging
from time import sleep

from celery import current_task, shared_task
from django.utils.safestring import mark_safe

# celery_progressbar_stat is the author's own context manager (not shown in
# these slides); it must be imported from the project's own code.

logger = logging.getLogger(__name__)


@shared_task
def test_progressbar(user_id=1):
    with celery_progressbar_stat(current_task, user_id) as c_stat:
        for i in range(0, 101):
            if c_stat.is_killed:
                c_stat.report("Terminating task", e="test_err3", fatal=True)

            sleep(.3)
            if i == 6:
                logger.info("test progress bar at 6%")
                c_stat.report(
                    "Error: This error should show up", e="test_err",
                    sticky_msg=mark_safe("<p>TEST STICKY ERROR.</p>"
                                         "<img src='https://someimage.jpeg'>"))

            if i == 16:
                logger.info("test progress bar at 16%")
                c_stat.report("Error again: This error should show up too",
                              e="test_err2")

            if i == 22:
                # Same error code as at 16%, so it should not be shown again
                logger.info("test progress bar at 22%")
                c_stat.report("Error: This error should NOT show up since "
                              "it is raised before", e="test_err2")

            c_stat.percent = i
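
`celery_progressbar_stat` is the speaker's own helper and its source is not shown in the slides. As a rough idea of how such a helper could work, here is a simplified, hypothetical sketch (not the author's implementation). It assumes only that the task object has an `update_state(state=..., meta=...)` method, which Celery tasks provide; `FakeTask` stands in for `current_task` so the sketch runs without a broker. Following the behaviour the example describes, an error code passed as `e` is reported only once.

```python
from contextlib import contextmanager


class ProgressStat:
    """Hypothetical progress reporter; all names here are illustrative."""

    def __init__(self, task, user_id):
        self.task = task
        self.user_id = user_id
        self.is_killed = False
        self.messages = []
        self._seen_errors = set()
        self._percent = 0

    @property
    def percent(self):
        return self._percent

    @percent.setter
    def percent(self, value):
        # Push every progress change to the result backend.
        self._percent = value
        self._publish()

    def report(self, msg, e=None):
        # Report each error code only once.
        if e is not None:
            if e in self._seen_errors:
                return
            self._seen_errors.add(e)
        self.messages.append(msg)
        self._publish()

    def _publish(self):
        self.task.update_state(
            state="PROGRESS",
            meta={"user_id": self.user_id, "percent": self._percent,
                  "messages": list(self.messages)},
        )


@contextmanager
def progressbar_stat(task, user_id):
    stat = ProgressStat(task, user_id)
    yield stat
    stat.percent = 100   # mark the task complete on a clean exit


class FakeTask:
    """Stand-in for celery.current_task; records the last published state."""

    def update_state(self, state=None, meta=None):
        self.state, self.meta = state, meta


task = FakeTask()
with progressbar_stat(task, user_id=1) as c_stat:
    c_stat.report("first error", e="test_err2")
    c_stat.report("same code again, suppressed", e="test_err2")
    c_stat.percent = 22

print(task.meta["percent"], task.meta["messages"])
```

A front end can then read the task's state and `meta` from the result backend to draw the progress bar and show the messages.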

Celery vs. RQ - overview

                     Celery                                                RQ
Complexity of code   Very complicated                                      Easy to understand
Documentation        Takes a while to read                                 Simple
Monitoring           Flower                                                RQ Dashboard
Message brokers      RabbitMQ, Redis, MongoDB                              Redis
Result backends      RabbitMQ, Redis, Memcached, MongoDB, Cassandra, ...   Redis
Concurrency          Master-slave processes                                supervisord + fork
Scheduler            celery beat                                           3rd party
Language             Can send tasks from one language to another           Only Python
Subtasks             Can create tasks within tasks                         Nope
Django support       Built-in                                              django-rq
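
On the Scheduler row: Celery's scheduler (celery beat) reads periodic tasks from a configuration dictionary, which is what lets it replace cron entries. A minimal sketch of such a schedule follows. The task name `tasks.add` assumes the `add` task from the earlier Celery example lives in `tasks.py`, and the entry name is made up. In Celery 4 and later the setting is called `beat_schedule`; Celery 3.x used `CELERYBEAT_SCHEDULE`.

```python
from datetime import timedelta

# Each entry names a registered task, how often to run it, and its arguments.
beat_schedule = {
    "add-every-30-seconds": {              # entry name: illustrative
        "task": "tasks.add",               # assumes add() is defined in tasks.py
        "schedule": timedelta(seconds=30),
        "args": (4, 4),
    },
}

# With a Celery app object this would be applied as:
#     app.conf.beat_schedule = beat_schedule
# and the scheduler started with: celery -A tasks beat
```

The beat process only sends the task messages on schedule; regular workers still execute them.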

Celery vs. RQ - why RQ

Why RQ?

It all comes down to simplicity.

and...

Memory Leak

  • Celery has memory leak issues.
  • Some memory leaks come from older broker libraries, e.g. librabbitmq.
  • The Celery monitor (Flower) has a huge memory leak.
  • RQ offers less, but its memory leaks should be much smaller than Celery's (I have not verified this myself).

Why Celery?

Use Celery when:

  • You need a message broker or result backend other than Redis. RQ limits you to Redis for both.
  • Dropped messages bother you. Redis can drop messages, although they are picked up again later.
  • You need more features and flexibility. Celery is far more feature rich and flexible than RQ.
  • You never really need to know the magic behind the scenes in Celery.

How complex?

RQ graph

Celery graph

And for your pleasure...

Django graph

The End

Thank You

https://www.linkedin.com/in/sepehr

Task Queue

More info

https://www.djangopackages.com/grids/g/workers-queues-tasks/

http://stackoverflow.com/questions/13440875/pros-and-cons-to-use-celery-vs-rq

 
