Django/Celery Quickstart (or, how I learned to stop using cron and love celery)

Websites often need tasks that run periodically, behind the scenes. Examples include sending email reminders, aggregating denormalized data and permanently deleting archived records. Very often the simplest solution is to set up a cron job to hit a URL on the site that performs the task.

Cron has the advantage of simplicity, but it's not ideal for the job. You have to take steps to ensure that regular users of the site cannot hit those URLs directly. It also forces you to manage an external configuration. What if you forget to perform the configuration on the QA or production servers? It would be safer and easier if the configuration were in the code for the site.

For Django sites, Celery seems to be the solution of choice. Celery is really focused on being a distributed task queue, but it can also be a great scheduler. Its documentation is excellent, but I found that it lacks a quickstart guide for using Celery with Django just to replace cron.

Note: Celery typically runs with RabbitMQ as its message broker. For just task scheduling, this may be overkill. This guide starts out using kombu's Django transport, which keeps the message queue in the database Django is already using.

  1. Install django-celery, ghettoq
    sudo pip install django-celery
  2. Edit, and add the celery config info
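    import djcelery
    djcelery.setup_loader()  # let Celery load its configuration from the Django settings
    BROKER_URL = "django://" # tell kombu to use the Django database as the message queue
    INSTALLED_APPS += ("djcelery", "kombu.transport.django")  # assumes INSTALLED_APPS is a tuple defined above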
  3. Add the new tables to the Django database
    ./ syncdb
  4. Create a file, in your project (same level as
    from celery.task.schedules import crontab
    from celery.decorators import periodic_task
    # this will run every minute, see
    @periodic_task(run_every=crontab(hour="*", minute="*", day_of_week="*"))
    def test():
        print("firing test task")
  5. Start the celery daemon in "beat" mode, which is required for scheduling
    sudo ./ celeryd -v 2 -B -s celery -E -l INFO

At this point, you should see your Celery tasks listed in the console output, and the test task firing every minute.

[2012-03-02 09:34:49,170: WARNING/MainProcess]

 -------------- celery@chase-VirtualBox v2.5.1
---- **** -----
--- * ***  * -- [Configuration]
-- * - **** ---   . broker:      django://localhost//
- ** ----------   . loader:      djcelery.loaders.DjangoLoader
- ** ----------   . logfile:     [stderr]@INFO
- ** ----------   . concurrency: 1
- ** ----------   . events:      ON
- *** --- * ---   . beat:        ON
-- ******* ----
--- ***** ----- [Queues]
 --------------   . celery:      exchange:celery (direct) binding:celery

  . myapp.tasks.test

[2012-03-02 09:34:49,236: INFO/PoolWorker-2] child process calling
[2012-03-02 09:34:49,239: WARNING/MainProcess] celery@chase-VirtualBox has started.
[2012-03-02 09:34:49,245: INFO/Beat] child process calling
[2012-03-02 09:34:49,249: INFO/Beat] Celerybeat: Starting...
[2012-03-02 09:34:49,283: INFO/Beat] Scheduler: Sending due task myapp.tasks.test
[2012-03-02 09:34:54,654: INFO/MainProcess] Got task from broker: myapp.tasks.test[39d57f82-fdd2-406a-ad5f-50b0e30a6492]
[2012-03-02 09:34:54,666: WARNING/PoolWorker-2] firing test task
[2012-03-02 09:34:54,667: INFO/MainProcess] Task myapp.tasks.test[39d57f82-fdd2-406a-ad5f-50b0e30a6492] succeeded in 0.00423407554626s: None
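The task fires every minute because every field in its crontab expression is a wildcard. As a rough illustration of that matching logic, here is a minimal standalone sketch. It is not Celery's implementation: field_matches and is_due are hypothetical helpers, and the sketch only understands "*", "*/n" and comma-separated numbers.

```python
from datetime import datetime


def field_matches(spec, value):
    """Check one cron field; handles "*", "*/n", and comma-separated integers."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        return value % int(spec[2:]) == 0
    return value in {int(part) for part in spec.split(",")}


def is_due(now, minute="*", hour="*", day_of_week="*"):
    """Return True if `now` satisfies all three cron fields."""
    # datetime.weekday() counts Monday as 0; cron-style fields count Sunday as 0.
    cron_weekday = (now.weekday() + 1) % 7
    return (
        field_matches(minute, now.minute)
        and field_matches(hour, now.hour)
        and field_matches(day_of_week, cron_weekday)
    )


if __name__ == "__main__":
    friday_morning = datetime(2012, 3, 2, 9, 34)
    print(is_due(friday_morning))  # all wildcards, so any minute qualifies
    print(is_due(friday_morning, minute="*/15"))  # 34 is not a multiple of 15
```

Narrowing any one field narrows the schedule. For a nightly cleanup job, for example, you would pin both the minute and the hour instead of leaving them as wildcards.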

If you want, you can upgrade to RabbitMQ. Just make sure to update your, as well.
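As a sketch of what that change might look like, the broker setting would switch from the django:// URL to an amqp:// one. The values below are assumptions (RabbitMQ on localhost with its default guest account, port and virtual host), and describe_broker is a hypothetical helper that just splits the URL so you can double-check each part.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2, as used in this post

# Assumed values: RabbitMQ on localhost with its default guest account,
# default port 5672, and default "/" virtual host.
BROKER_URL = "amqp://guest:guest@localhost:5672//"


def describe_broker(url):
    """Split a broker URL into the parts worth double-checking."""
    parts = urlparse(url)
    return {
        "transport": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        # Everything after the first slash names the virtual host.
        "vhost": parts.path[1:],
    }


if __name__ == "__main__":
    print(describe_broker(BROKER_URL))
```

The doubled slash at the end is easy to miss: the first slash separates the host from the path, and the second one is the name of the default virtual host.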

You may also want to run celeryd as a service.

Update 3/1/2012: updated the instructions to use Kombu. Tested on Python 2.7.2 and Django 1.3.0 in a clean environment.

I'm currently working at NerdWallet, a startup in San Francisco trying to bring clarity to all of life's financial decisions. We're hiring like crazy. Hit me up on Twitter, I would love to talk.

Follow @chase_seibert on Twitter