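I'm looking for a way to do asynchronous data processing with a daemon that uses the Django ORM. However, the ORM isn't thread safe: retrieving or modifying Django objects from within threads is not safe. So I'm wondering what the correct way to achieve asynchrony is.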

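Basically, what I need to accomplish is to take a list of users from the database, query a third-party API, and then update the user-profile rows for those users, as a daemon or background process. Doing this serially for each user is easy, but it takes far too long to be scalable. If the daemon retrieves and updates users through the ORM, how do I process 10-20 users at a time? I would use a standard threading/queue system for this, but you can't have threads doing things like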

models.User.objects.get(id=foo) ...

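Django itself is an asynchronous processing system that makes asynchronous ORM calls (?) for each request, so there should be a way to do this? I haven't found anything in the documentation so far.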

Cheers

2 Answers

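Take a look at Celery. I think it will solve your problem. It uses the multiprocessing module. It requires a (very) small amount of setup, but it helps a lot with scaling.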

answered 2010-05-11T07:20:05.507

If your asynchronous processing is done in its own process, then thread safety is not an issue: separate processes do not share an address space, so they can't interfere with each other. Each one has its own copy of the model objects, and concurrency is controlled by the database through transactions. So you're fine.
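A minimal sketch of that process-per-worker idea, using only the standard library. `fetch_remote_profile` and `update_profile` are hypothetical stand-ins for the third-party API request and the ORM update; neither comes from the question. A real Django daemon would most likely also need each worker process to open its own database connection, which is not shown here.

```python
from multiprocessing import Pool


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fetch_remote_profile(user_id):
    # Hypothetical stand-in for the third-party API request.
    return {"user_id": user_id, "score": user_id * 10}


def update_profile(user_id, data):
    # Hypothetical stand-in for the ORM work: load the user row,
    # set the new profile fields, save.
    return (user_id, data["score"])


def process_batch(user_ids):
    """Runs inside one worker process; no objects are shared with other workers."""
    return [update_profile(uid, fetch_remote_profile(uid)) for uid in user_ids]


def run(user_ids, workers=4, batch_size=10):
    """Hand each batch to a separate process and flatten the results."""
    with Pool(processes=workers) as pool:
        batches = pool.map(process_batch, chunked(user_ids, batch_size))
    return [row for batch in batches for row in batch]
```

Calling `run(list_of_ids)` from a script's `if __name__ == "__main__":` block would process ten users per batch across four processes.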

If you're going to spawn a thread inside one of the web server's processes to do your asynchronous work, then you need to lock all API calls that are not thread safe.

from threading import Lock
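One way the lock could be used, as a rough sketch rather than a definitive implementation. The `profiles` dict and `fetch_remote_profile` are hypothetical stand-ins for the user-profile table and the third-party API; in real code the body of the `with orm_lock:` block would hold the ORM calls instead.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

orm_lock = Lock()   # guards every call that is not thread safe
profiles = {}       # hypothetical stand-in for the user-profile table


def fetch_remote_profile(user_id):
    # Hypothetical stand-in for the slow third-party API request.
    # It runs outside the lock, so requests from different threads can overlap.
    return {"score": user_id * 10}


def update_user(user_id):
    data = fetch_remote_profile(user_id)
    # Only one thread at a time may touch the shared store.
    with orm_lock:
        profiles[user_id] = data["score"]
    return user_id


def run(user_ids, workers=10):
    """Process up to `workers` users concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(update_user, user_ids))
```

Keeping the lock around only the shared-state update means the network waits, which are usually the slow part, still happen in parallel.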

Apache uses multiple processes via the fork() system call to handle concurrent web requests. This is why Django's ORM APIs don't need to be thread safe. I believe Apache may be able to use threads instead of processes, but I think that feature has to be disabled in order to use Django.

http://groups.google.com/group/django-developers/browse_thread/thread/905f79e350525c95

Btw, do you understand the difference between a thread and a process? It's kind of important.

answered 2010-05-11T06:32:49.460