django - Problems with the celery daemon

Question

Our celery daemon is very unstable. Whenever we push changes, we use a Fabric deploy script to restart the daemon, but for some reason this causes a lot of problems.

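Whenever the deploy script runs, the celery processes end up in a kind of pseudo-dead state. They will (unfortunately) still consume tasks from rabbitmq, but they won't actually do anything. Confusingly, a quick check suggests that everything is "fine" in this state: celeryctl status shows one node online, and ps aux | grep celery shows 2 running processes.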

However, trying to run /etc/init.d/celeryd stop manually results in the following error:

start-stop-daemon: warning: failed to kill 30360: No such process

Running celeryd start in this state appears to work, but in fact does nothing. The only way to fix the problem is to manually kill the running celery processes and then restart them.

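Any ideas? We have not fully confirmed this yet, but we think the problem also develops on its own after a few days without any deployment (there is currently no activity; this is a test server).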
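The "No such process" warning above indicates that the PID handed to start-stop-daemon (30360) no longer belonged to a live process, which is consistent with a stale pidfile. A minimal sketch of such a check on a Unix-like system, using only the Python standard library; the function names are hypothetical, and the pidfile path is whatever the init script is configured to use (not given in the question):

```python
import errno
import os


def pid_is_alive(pid):
    """Return True if a process with this PID currently exists (Unix only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is delivered
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return False  # no such process
        if exc.errno == errno.EPERM:
            return True  # process exists but is owned by another user
        raise
    return True


def pidfile_is_stale(pidfile_path):
    """Return True if the pidfile is missing, malformed, or names a dead PID."""
    try:
        with open(pidfile_path) as handle:
            pid = int(handle.read().strip())
    except (IOError, OSError, ValueError):
        return True
    return not pid_is_alive(pid)
```

If this reports a stale pidfile while ps still lists celery processes, the init script is tracking a PID that differs from the workers actually running, which would match the symptoms described.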

score 5 · Accepted Answer

I can't say that I know what's ailing your setup, but I've always used supervisord to run celery -- maybe the issue has to do with upstart? Regardless, I've never experienced this with celery running on top of supervisord.

For good measure, here's a sample supervisor config for celery:

[program:celeryd]
directory=/path/to/project/
command=/path/to/project/venv/bin/python manage.py celeryd -l INFO
user=nobody
autostart=true
autorestart=true
startsecs=10
numprocs=1
stdout_logfile=/var/log/sites/foo/celeryd_stdout.log
stderr_logfile=/var/log/sites/foo/celeryd_stderr.log

; Need to wait for currently executing tasks to finish at shutdown.
; Increase this if you have very long running tasks.
stopwaitsecs = 600

Restarting celeryd in my fab script is then as simple as issuing a sudo supervisorctl restart celeryd.
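For illustration, the restart step could be wrapped in a small helper like the one below. This is only a sketch using the standard library's subprocess module rather than Fabric itself; the helper names are hypothetical, and it assumes sudo and supervisorctl are on the PATH of the machine where it runs. In a Fabric script, the same command string would be passed to Fabric's own sudo() helper instead.

```python
import subprocess


def supervisorctl_command(action, program="celeryd", use_sudo=True):
    """Build the argv list for a supervisorctl call such as 'restart celeryd'."""
    argv = ["supervisorctl", action, program]
    return ["sudo"] + argv if use_sudo else argv


def restart_celeryd(program="celeryd"):
    """Run 'sudo supervisorctl restart <program>'.

    check_call raises CalledProcessError if the command exits non-zero.
    """
    subprocess.check_call(supervisorctl_command("restart", program))
```

The program name must match the [program:celeryd] section name in the supervisor config above.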
