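We are seeing strange behavior from Celery when the MainProcess loses its connection to the broker. Celery logs the error below and then starts using 100% CPU. The workers keep working normally. I can see that RabbitMQ considers the connection to have timed out. Since upgrading to Celery 3 we have been hitting these errors frequently.
I have a feeling it is related to the non-blocking message consumption, but I have not made any real progress in understanding the code.
Is there any way to detect these disconnects earlier, or to prevent Celery from using 100% CPU?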
- Celery: 3.0.4
- AMQP: 1.0.10
- RabbitMQ: 2.8.4
The timestamps are 2 hours apart because RabbitMQ reports GMT and Celery reports local time.
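To double-check that the two log entries really describe the same moment, the timestamps can be normalized to GMT. This is only a minimal sketch: the 2-hour offset is an assumption taken from the gap described above, and the variable names are made up for illustration.

```python
from datetime import datetime, timedelta

# Timestamps copied verbatim from the two log excerpts in this post.
celery_local = datetime.strptime("2013-05-09 18:20:20,204", "%Y-%m-%d %H:%M:%S,%f")
rabbit_gmt = datetime.strptime("9-May-2013::16:20:20", "%d-%b-%Y::%H:%M:%S")

# Assumption: the Celery host's local time is GMT+2, as implied by the 2-hour gap.
local_offset = timedelta(hours=2)
celery_gmt = celery_local - local_offset

# Remaining difference between the two events once both are expressed in GMT.
skew = abs(celery_gmt - rabbit_gmt)
print("Celery (GMT):  ", celery_gmt)
print("RabbitMQ (GMT):", rabbit_gmt)
print("Difference:    ", skew)
```

If the difference is well under a second, the Celery traceback and the RabbitMQ timeout report belong to the same disconnect.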
Celery error
[2013-05-09 18:20:20,204: ERROR/MainProcess] Consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 369, in start
    self.consume_messages()
  File "/usr/local/lib/python2.7/site-packages/celery/worker/consumer.py", line 450, in consume_messages
    readers[fileno](fileno, event)
  File "/usr/local/lib/python2.7/site-packages/kombu/connection.py", line 290, in drain_nowait
    self.drain_events(timeout=0)
  File "/usr/local/lib/python2.7/site-packages/kombu/connection.py", line 279, in drain_events
    return self.transport.drain_events(self.connection, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 91, in drain_events
    return connection.drain_events(**kwargs)
  File "/usr/local/lib/python2.7/site-packages/amqp/connection.py", line 266, in drain_events
    chanmap, None, timeout=timeout,
  File "/usr/local/lib/python2.7/site-packages/amqp/connection.py", line 328, in _wait_multiple
    channel, method_sig, args, content = read_timeout(timeout)
  File "/usr/local/lib/python2.7/site-packages/amqp/connection.py", line 299, in read_timeout
    return self.method_reader.read_method()
  File "/usr/local/lib/python2.7/site-packages/amqp/method_framing.py", line 187, in read_method
    raise m
IOError: Socket closed
RabbitMQ error
=ERROR REPORT==== 9-May-2013::16:20:20 ===
closing AMQP connection <0.1813.0> (192.168.201.104:12809 -> 192.168.201.104:5672):
{timeout,running}