
This is a server running Python 2.6.6 and Django 1.2.3 on Linux (CentOS, I believe).

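The Python process running Django suddenly starts using 100% CPU and stays there until it is restarted. This has only happened twice, both times recently; the first occurrence was less than a month ago. I have not made any major changes to the code in about 7 months.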

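Looking at the console output, there was no extreme usage. There were only about 10 queries in the roughly 10 minutes before I believe it started using 100% CPU. The only errors printed are broken pipe errors, which I think came from someone who was using the site when it slowed down and then closed the connection.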

I re-ran all of the queries made around the time it slowed down, and they all ran fine without any problems.

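The server itself still works in a sense, but it is extremely slow. I run a suite of tests every day that normally takes about 7 minutes, but when the server is this slow it can take 2-3 hours.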

If anyone has any ideas, I would greatly appreciate it.

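Also, as you may have noticed, I am still a novice when it comes to these kinds of problems, so I would welcome any recommendations for good practices on monitoring this sort of activity.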

Thanks for your time!

Below is the output I mentioned. The process started using 100% CPU at around 4 pm.

[23/Jul/2012 15:49:55] "GET /CFXsearch/?n=&v=all&e=2012Week30&c=all&r=all&p=all&run=all&pl=all&m=all&o=all&d1=&d2=&submitOption=comparing&compareBy=plat&sort=sortName&Soft=CFX HTTP/1.1" 200 67228
[23/Jul/2012 15:50:00] "GET /CFXsearch/?n=&v=all&e=2012Week30&c=all&r=all&p=all&run=all&pl=RH5&m=all&o=all&d1=&d2=&submitOption=comparing&compareBy=plat&sort=sortName&Soft=CFX HTTP/1.1" 200 33346
[23/Jul/2012 15:50:05] "GET /CFXsearch/?n=&v=all&e=2012Week30&c=all&r=all&p=all&run=all&pl=SLES10&m=all&o=all&d1=&d2=&submitOption=comparing&compareBy=plat&sort=sortName&Soft=CFX HTTP/1.1" 200 33394
[23/Jul/2012 15:54:48] "GET /CFXsearch/?n=&v=all&e=2012Week30&c=all&r=all&p=all&run=all&pl=SLES11&m=all&o=all&d1=&d2=&submitOption=comparing&compareBy=plat&sort=sortName&Soft=CFX HTTP/1.1" 200 33394
[23/Jul/2012 15:54:53] "GET /results/?n=&e=2012Week30&c=TGTest&pl=SLES11&p=single&p=defined&p=double&run=default&run=hpmpi&run=mpich&run=mpich2&run=Platform&run=pvm%20parallel&run=serial&sort=sortResult&d=&y=&submitOption=latestsearch&Soft=CFX HTTP/1.1" 200 19350
[23/Jul/2012 15:54:57] "GET /results/?n=&e=2012Week30&c=turboexamples&pl=SLES11&p=single&p=defined&p=double&run=default&run=hpmpi&run=mpich&run=mpich2&run=Platform&run=pvm%20parallel&run=serial&sort=sortResult&d=&y=&submitOption=latestsearch&Soft=CFX HTTP/1.1" 200 36729
[23/Jul/2012 15:59:40] "GET / HTTP/1.1" 200 11111
[23/Jul/2012 15:59:40] "GET /site_media/style.css HTTP/1.1" 304 0
[23/Jul/2012 15:59:45] "GET /CFXsearch/ HTTP/1.1" 200 25637
[23/Jul/2012 15:59:45] "GET /site_media/jquery-1.2.6.min.js HTTP/1.1" 304 0
[23/Jul/2012 15:59:45] "GET /site_media/sorttable.js HTTP/1.1" 304 0
[23/Jul/2012 16:00:04] "GET /CFXsearch/?n=&v=14.5&e=all&c=solver54&r=all&p=all&run=all&pl=all&m=all&o=all&d1=&d2=&submitOption=comparing&compareBy=plat&sort=sortName&Soft=CFX HTTP/1.1" 200 402737
[23/Jul/2012 16:00:19] "GET /results/?n=&e=2012Week29&c=solver54&pl=SLES11&p=single&p=defined&p=double&run=default&run=hpmpi&run=mpich&run=mpich2&run=Platform&run=pvm%20parallel&run=serial&sort=sortResult&d=&y=&submitOption=latestsearch&Soft=CFX HTTP/1.1" 200 1557488
[23/Jul/2012 16:02:48] "GET /CFXsearch/?n=&v=14.5&e=all&c=solver54&r=all&p=all&run=all&pl=all&m=all&o=all&d1=&d2=&submitOption=comparing&compareBy=ex&sort=sortName&Soft=CFX HTTP/1.1" 200 408388
[23/Jul/2012 16:03:01] "GET /CFXsearch/?n=&v=14.5&e=all&c=solver54&r=all&p=all&run=all&pl=all&m=all&o=all&d1=&d2=&submitOption=comparing&compareBy=plat&sort=sortName&Soft=CFX HTTP/1.1" 200 402737
Traceback (most recent call last):
  File "/home/install2/testingDatabase/lib/python2.6/site-packages/django/core/servers/basehttp.py", line 281, in run
    self.finish_response()
  File "/home/install2/testingDatabase/lib/python2.6/site-packages/django/core/servers/basehttp.py", line 321, in finish_response
    self.write(data)
  File "/home/install2/testingDatabase/lib/python2.6/site-packages/django/core/servers/basehttp.py", line 400, in write
    self.send_headers()
  File "/home/install2/testingDatabase/lib/python2.6/site-packages/django/core/servers/basehttp.py", line 465, in send_headers
    self._write(str(self.headers))
  File "/home/install2/testingDatabase/Python-2.6.6/Lib/socket.py", line 318, in write
    self.flush()
  File "/home/install2/testingDatabase/Python-2.6.6/Lib/socket.py", line 297, in flush
    self._sock.sendall(buffer(data, write_offset, buffer_size))
error: [Errno 32] Broken pipe
Traceback (most recent call last):
  File "/home/install2/testingDatabase/Python-2.6.6/Lib/SocketServer.py", line 560, in process_request_thread
    self.finish_request(request, client_address)
  File "/home/install2/testingDatabase/Python-2.6.6/Lib/SocketServer.py", line 322, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/home/install2/testingDatabase/lib/python2.6/site-packages/django/core/servers/basehttp.py", line 562, in __init__
    BaseHTTPRequestHandler.__init__(self, *args, **kwargs)
  File "/home/install2/testingDatabase/Python-2.6.6/Lib/SocketServer.py", line 618, in __init__
    self.finish()
  File "/home/install2/testingDatabase/Python-2.6.6/Lib/SocketServer.py", line 661, in finish
    self.wfile.flush()
  File "/home/install2/testingDatabase/Python-2.6.6/Lib/socket.py", line 297, in flush
    self._sock.sendall(buffer(data, write_offset, buffer_size))
error: [Errno 32] Broken pipe
[23/Jul/2012 16:09:59] "GET /PolyflowSummary/ HTTP/1.1" 200 7561
[23/Jul/2012 16:17:42] "GET /?soft=CFX HTTP/1.1" 200 11112
[23/Jul/2012 16:17:44] "GET /?soft=CFX HTTP/1.1" 200 11112
[23/Jul/2012 16:18:06] "GET /site_media/style.css HTTP/1.1" 200 432
[23/Jul/2012 16:18:23] "GET /site_media/style.css HTTP/1.1" 200 432
[23/Jul/2012 16:18:23] "GET /site_media/favicon.ico HTTP/1.1" 200 1718

1 Answer


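You can connect to the running process and attach a debugger. I have done this before and it was very useful. My full notes are here, but the short version is: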

  • Install what is needed so that gdb "understands" Python

  • Connect with gdb -p PID (get the PID from ps or similar)

  • Generate a stack trace in gdb, and you will see exactly where the CPU is being consumed.

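Original credit: Showing the stack trace from a running Python application. (In fact, after typing all this, maybe this is a dupe of the linked question? I guess the question is different, even if the answer is the same...)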
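As a complement to attaching gdb, here is a minimal sketch of a different, pure-Python way to get a stack trace from a running process: a signal handler that prints the stack of every thread. It is not taken from the notes above. The names format_all_stacks, dump_stacks and install_stack_dumper are hypothetical, the choice of SIGUSR1 is an assumption, and the handler has to be installed from the main thread before the problem occurs. It uses only standard-library calls that should exist in Python 2.6 as well as newer versions.

```python
import signal
import sys
import threading
import traceback


def format_all_stacks():
    """Return a string with the current stack of every thread in this process."""
    # Map thread ids to names so the dump is easier to read.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):\n" % (thread_id, names.get(thread_id, "unknown")))
        lines.extend(traceback.format_stack(frame))
    return "".join(lines)


def dump_stacks(signum, frame):
    """Signal handler (hypothetical name): write all thread stacks to stderr."""
    sys.stderr.write(format_all_stacks())
    sys.stderr.flush()


def install_stack_dumper(signum=signal.SIGUSR1):
    """Register the handler. SIGUSR1 is an assumed choice; call from the main thread."""
    signal.signal(signum, dump_stacks)
```

With the handler installed, running kill -USR1 PID from a shell should print the stacks to the console where the server is running. One limitation: Python only runs signal handlers between bytecode instructions in the main thread, so if the process is spinning inside C code the handler may never fire, and gdb remains the fallback.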

answered 2012-07-25T14:03:53.400