Background task processing and deferred execution in Django

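This article explains how to use Celery and RabbitMQ for background task processing in Python projects. It walks through installing and configuring Celery and integrating it with a Django project, using concrete settings and sample code.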

Or, Celery + RabbitMQ = Django awesomeness!

As you know, Django is synchronous, or blocking. This means a response is not returned until all processing for the request (e.g., in a view) is complete. It's the expected behavior and usually required in web applications, but there are times when you need tasks to run in the background (immediately, deferred, or periodically) without blocking.

Some common use cases:

* Give the impression of a really snappy web application by finishing a request as soon as possible, even though a task is running in the background, then update the page incrementally using AJAX.
* Executing tasks asynchronously and using retries to make sure they are completed successfully.
* Scheduling periodic tasks.
* Parallel execution (to some degree).

There have been multiple requests to add asynchronous support to Django, namely via the Python threading module, and even the multiprocessing module released in Python 2.6, but I doubt it will happen any time soon. In fact, I doubt it will ever happen.

This is a common problem for many, and after scouring many forum posts the following proposed solution keeps popping up, which reminds me of the saying "when all you have is a hammer, everything looks like a nail".

* Create a table in the database to store tasks.
* Set up a cron job to trigger processing of said tasks.
* Bonus: Create an API for task management and monitoring.
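To make that concrete, here is a minimal sketch of the table-plus-cron approach using only the standard library. The table layout and the names `enqueue` and `process_pending` are my own assumptions for illustration, and an in-memory SQLite database stands in for the project's real one.

```python
import sqlite3

# In-memory database stands in for the project's real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    "id INTEGER PRIMARY KEY, name TEXT, arg TEXT, done INTEGER DEFAULT 0)"
)


def enqueue(name, arg):
    """Called from the web request: just record the work to be done."""
    conn.execute("INSERT INTO tasks (name, arg) VALUES (?, ?)", (name, arg))
    conn.commit()


def process_pending():
    """Called by the cron job: run every task not yet marked done."""
    rows = conn.execute(
        "SELECT id, name, arg FROM tasks WHERE done = 0 ORDER BY id"
    ).fetchall()
    processed = []
    for task_id, name, arg in rows:
        processed.append("%s(%s)" % (name, arg))  # placeholder for real work
        conn.execute("UPDATE tasks SET done = 1 WHERE id = ?", (task_id,))
    conn.commit()
    return processed


enqueue("send_email", "alice")
enqueue("resize_image", "photo.png")
first_run = process_pending()
second_run = process_pending()  # nothing left to do
print(first_run, second_run)
```

Even in this toy version, the web code, the storage schema, and the processing loop all know about each other, and tasks only run as often as cron fires.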

Well, you can do it like that, but it usually leads to ugly, coupled code that becomes very complex over time, is not very flexible, and doesn't scale well. It is generally a bad idea.

In my opinion, it ultimately comes down to separation of concerns. I recently fell in love with the message queuing world (AMQP), in particular RabbitMQ, which can be used as an integral part of a really elegant solution for this issue, especially when coupled with Celery.

* Define a task.
* Send it to a processing queue.
* Let other code handle the processing.
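The three steps above can be sketched with nothing but the standard library. This is only an in-process illustration of the pattern, not how Celery or RabbitMQ are implemented: `queue.Queue` stands in for the broker, a thread stands in for the worker, and `add`, `worker`, and `task_queue` are hypothetical names.

```python
import queue
import threading

task_queue = queue.Queue()  # in-process stand-in for a broker queue
results = []


def add(x, y):
    """1. Define a task: an ordinary function."""
    return x + y


def worker():
    """3. Other code handles the processing, off the caller's thread."""
    while True:
        item = task_queue.get()
        if item is None:  # sentinel: stop the worker
            task_queue.task_done()
            break
        func, args = item
        results.append(func(*args))
        task_queue.task_done()


thread = threading.Thread(target=worker)
thread.start()

# 2. Send tasks to the processing queue; put() returns immediately.
task_queue.put((add, (2, 3)))
task_queue.put((add, (10, 20)))
task_queue.put(None)

task_queue.join()  # only for the demo: wait until the worker is finished
thread.join()
print(results)
```

The caller never runs `add` itself; it only describes the work and hands it off. With a real broker, the worker can live in another process or on another machine.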


What is Celery

Celery is a task queue system based on distributed message passing. Originally developed for Django, it can now be used in any Python project.

It's focused on real-time operation, but supports scheduling as well. The execution units, called tasks, are executed concurrently on one or more worker servers. Tasks can execute asynchronously (in the background) or synchronously (wait until ready).
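To illustrate the difference between asynchronous and synchronous execution, here is a small sketch using the standard library's `concurrent.futures` as a stand-in. It is not Celery's API; the `square` function is a hypothetical task.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    """Hypothetical task."""
    return n * n


with ThreadPoolExecutor(max_workers=2) as pool:
    # Asynchronous: submit() returns at once with a handle to the pending result.
    future = pool.submit(square, 7)
    # ... the caller is free to do other work here ...

    # Synchronous: result() blocks until the task has finished.
    value = future.result()

    # Several tasks can run concurrently; map() yields results in input order.
    squares = list(pool.map(square, [1, 2, 3]))

print(value, squares)
```

The same idea applies with a task queue: sending the task is cheap and returns right away, and waiting for the result is a separate, optional step.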

Celery provides a powerful and flexible interface for defining, executing, managing and monitoring tasks. If you have a use-case, chances are you can do it with Celery.

Installation and configuration


Install Celery

One of Celery's dependencies is the multiprocessing module released in Python 2.6. If you have an earlier version, such as Python 2.5, you're in luck as the module has been backported.

The backported module needs to be compiled when it is installed, so let's install the required build support.

apt-get install gcc python-dev

Now we are ready to install Celery. Let's install a few more dependencies and let easy_install take care of the rest.

apt-get install python-setuptools python-simplejson
easy_install celery


Install RabbitMQ

Celery's recommended message broker is RabbitMQ.

RabbitMQ is a complete and highly reliable enterprise messaging system based on the emerging AMQP standard. It is built on a proven platform and offers exceptionally high reliability, availability and scalability.

In the example below, I will download and install the latest release (at time of writing), but you should check their download page for newer versions and/or support for your platform.

Note: The dpkg step will fail if there are missing dependencies. Because of this, we follow it with apt-get --fix-broken install, which pulls in the missing packages and completes the installation.

wget http://www.rabbitmq.com/releases/rabbitmq-server/v1.7.2/rabbitmq-server_...
dpkg -i rabbitmq-server_1.7.2-1_all.deb
apt-get --fix-broken install

The default installation includes a guest user with the password guest. Don't be fooled by the name of the account: guest has full permissions on the default virtual host, called /.

We will use the default configuration below, but you are encouraged to tweak your setup.

Configure Django project to use Celery/RabbitMQ

Add the following to settings.py

BROKER_HOST = "127.0.0.1"
BROKER_PORT = 5672
BROKER_VHOST = "/"
BROKER_USER = "guest"
BROKER_PASSWORD = "guest"

INSTALLED_APPS = (
    ...
    'celery',
)

Synchronize the database

python manage.py syncdb


Sample code

Now that everything is installed and configured, here is some sample code to get you started. I also recommend taking a look at the Celery documentation to get acquainted with its power and flexibility.

fooapp/tasks.py

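from celery.task import Task
from celery.registry import tasks


class MyTask(Task):
    # run() holds the work the worker performs for each task message.
    def run(self, some_arg, **kwargs):
        logger = self.get_logger(**kwargs)
        ...
        logger.info("Did something: %s" % some_arg)


# Register the task class so workers can look it up by name.
tasks.register(MyTask)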

fooapp/views.py

from fooapp.tasks import MyTask

def foo(request):
    # Queue the task and carry on without waiting for it to run.
    MyTask.delay(some_arg="foo")
    ...

Now start the daemon and test your code.

python manage.py celeryd -l INFO


For convenience, there is a shortcut decorator @task which makes simple tasks that much cleaner.
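To give a feel for why a decorator makes simple tasks cleaner, here is a toy sketch of the idea. It is emphatically not Celery's implementation or its import path: the `task` decorator, the `pending` list (standing in for the broker), and `run_pending` (standing in for a worker) are all hypothetical and exist only in this example.

```python
pending = []  # hypothetical stand-in for the broker queue


def task(func):
    """Toy decorator: give a plain function a delay() method."""

    def delay(*args, **kwargs):
        # Record the call instead of running it; a worker picks it up later.
        pending.append((func, args, kwargs))

    func.delay = delay
    return func


@task
def greet(name):
    return "Hello, %s" % name


def run_pending():
    """What a worker would do: execute every queued call."""
    results = [func(*args, **kwargs) for func, args, kwargs in pending]
    del pending[:]
    return results


greet.delay("foo")  # returns immediately, nothing has run yet
queued = len(pending)
results = run_pending()
print(queued, results)
```

Compared with the class-based MyTask, the task body is just a function, and the decorator supplies the plumbing. Check the Celery documentation for the real decorator and where to import it from in your version.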

A note on state: Since Celery is a distributed system, you can't know in which process, or even on what machine, the task will run. So you shouldn't pass Django model objects as arguments to tasks. It's almost always better to pass the primary key and re-fetch the object from the database inside the task, as there are possible race conditions involved.
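The race condition is easy to reproduce in miniature. In this sketch a plain dictionary stands in for the database and a copied dict stands in for a serialized model object; `task_with_object` and `task_with_pk` are hypothetical task bodies, not Celery or Django code.

```python
# Hypothetical in-memory "database" keyed by primary key.
database = {1: {"name": "old name"}}


def task_with_object(obj):
    """Anti-pattern: works on the copy captured when the task was queued."""
    return obj["name"]


def task_with_pk(pk):
    """Preferred: re-fetch the current row when the task actually runs."""
    return database[pk]["name"]


# The view queues the task: a snapshot of the object vs. just its key.
snapshot = dict(database[1])
pk = 1

# Before the worker gets to the task, another request updates the row.
database[1]["name"] = "new name"

stale = task_with_object(snapshot)
fresh = task_with_pk(pk)
print(stale, fresh)
```

The task that received the object sees data that was already out of date by the time it ran, while the one that received only the key sees the current state.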