
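I've run into a problem and narrowed it down to Apache. I'm running Django with Tastypie on mod_wsgi (worker MPM, daemon mode) over SSL. I call the API through an htaccess proxy on another server to avoid AJAX cross-domain access errors.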

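Everything works well. However, I get a seemingly random delay when calling the API, triggered by clicking particular items in our user interface. The web server seems to have a very consistent inconsistency: the delay is always 7 seconds, and it occurs randomly every 5-15 minutes.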

Here are my Apache settings:

<IfModule mpm_worker_module>
    StartServers         25
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadLimit          64
    ThreadsPerChild      25
    MaxClients          150
    MaxRequestsPerChild   0
    MaxMemFree         1024
</IfModule>

In my virtual host:

    WSGIDaemonProcess www.domain.com processes=4 threads=1
    WSGIProcessGroup www.domain.com
    WSGIScriptAlias / /var/www/domain/wsgi.py process-group=www.domain.com application-group=%{GLOBAL}
    WSGIPassAuthorization On

All requests served through Django are JSON (pure API).

Any help would be appreciated.

Thanks, Mark

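Update: I suspect this is most likely an Apache issue rather than a DNS issue. It looks like Apache is creating additional processes to serve the request before it actually responds to anything.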

172.31.4.91 - - [03/Aug/2013:19:01:29 -0700] "GET /api/v1/clock/?limit=1 HTTP/1.1" 200 5159 "https://www.domain.com/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:20.0) Gecko/20100101 Firefox/20.0"
172.31.4.91 - - [03/Aug/2013:19:01:29 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:29 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:30 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:29 -0700] "PATCH /api/v1/check/546d48e9-f15f-4dee-8742-864d1fc5e0f7/ HTTP/1.1" 202 7334 "https://www.domain.com/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:20.0) Gecko/20100101 Firefox/20.0"
172.31.4.91 - - [03/Aug/2013:19:01:32 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:32 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:33 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:36 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:36 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:36 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:36 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:36 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:36 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:37 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:37 -0700] "-" 408 142 "-" "-"
172.31.4.91 - - [03/Aug/2013:19:01:38 -0700] "POST /api/v1/check/ HTTP/1.1" 201 1492 "https://www.domain.com/" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:20.0) Gecko/20100101 Firefox/20.0"

Note the 8-second delay between the PATCH and the POST. What do the blank entries mean?

172.31.4.91 - - [03/Aug/2013:19:01:36 -0700] "-" 408 142 "-" "-"

Update: here is a bit of the "info" log from when I got the 7-second delay...

[Sun Aug 04 13:14:15 2013] [info] [client 172.31.28.237] (70007)The timeout specified has expired: SSL input filter read failed.
[Sun Aug 04 13:14:15 2013] [info] [client 172.31.28.237] Connection closed to child 81 with standard shutdown (server api.chanj.com:443)
[Sun Aug 04 13:14:16 2013] [info] Initial (No.1) HTTPS request received for child 136 (server api.chanj.com:443)
[Sun Aug 04 13:14:16 2013] [info] [client 172.31.28.237] Connection to child 147 established (server api.chanj.com:443)
[Sun Aug 04 13:14:16 2013] [info] Seeding PRNG with 656 bytes of entropy
[Sun Aug 04 13:14:16 2013] [info] [client 172.31.28.237] Connection closed to child 136 with standard shutdown (server api.chanj.com:443)
[Sun Aug 04 13:14:16 2013] [info] [client 172.31.28.237] Connection to child 79 established (server api.chanj.com:443)
[Sun Aug 04 13:14:16 2013] [info] Seeding PRNG with 656 bytes of entropy
[Sun Aug 04 13:14:16 2013] [info] Initial (No.1) HTTPS request received for child 79 (server api.chanj.com:443)
[Sun Aug 04 13:14:16 2013] [info] [client 172.31.28.237] Connection to child 17 established (server api.chanj.com:443)
[Sun Aug 04 13:14:16 2013] [info] Seeding PRNG with 656 bytes of entropy
[Sun Aug 04 13:14:17 2013] [info] [client 172.31.28.237] Connection closed to child 79 with standard shutdown (server api.chanj.com:443)
[Sun Aug 04 13:14:17 2013] [info] [client 172.31.28.237] Connection to child 67 established (server api.chanj.com:443)
[Sun Aug 04 13:14:17 2013] [info] Seeding PRNG with 656 bytes of entropy
[Sun Aug 04 13:14:21 2013] [info] Initial (No.1) HTTPS request received for child 67 (server api.chanj.com:443)
[Sun Aug 04 13:14:21 2013] [info] [client 172.31.28.237] Connection to child 140 established (server api.chanj.com:443)
[Sun Aug 04 13:14:21 2013] [info] Seeding PRNG with 656 bytes of entropy
[Sun Aug 04 13:14:21 2013] [info] [client 172.31.28.237] Connection closed to child 67 with standard shutdown (server api.chanj.com:443)
[Sun Aug 04 13:14:21 2013] [info] [client 172.31.28.237] Connection to child 78 established (server api.chanj.com:443)
[Sun Aug 04 13:14:21 2013] [info] Seeding PRNG with 656 bytes of entropy
[Sun Aug 04 13:14:21 2013] [info] [client 172.31.28.237] (70007)The timeout specified has expired: SSL input filter read failed.
[Sun Aug 04 13:14:21 2013] [info] [client 172.31.28.237] Connection closed to child 144 with standard shutdown (server api.chanj.com:443)
[Sun Aug 04 13:14:26 2013] [info] Initial (No.1) HTTPS request received for child 78 (server api.chanj.com:443)

2 Answers


As mentioned in:

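You have done nothing to address the fact that your WSGI application has too few threads handling requests (4 in total), while Apache can attempt to proxy up to 150 at once. So I would not rule out that your WSGI application lacks the capacity to handle many concurrent long-running requests. In other words, if you occasionally get some long-running requests, they could cause a backlog and delay other requests.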
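
A minimal sketch of the capacity point above, assuming the application code is thread-safe: raise `threads` on the daemon process group from the question. The value 15 is a hypothetical placeholder chosen only for illustration (4 processes × 15 threads = 60 concurrent requests), not a figure from this thread; the right number depends on measured load and memory.

```apache
# Sketch only: 15 is an assumed value, not a recommendation.
# 4 processes x 15 threads = 60 requests handled concurrently,
# compared with 4 x 1 = 4 in the original configuration.
WSGIDaemonProcess www.domain.com processes=4 threads=15
WSGIProcessGroup www.domain.com
```

With `threads=1`, a single slow request occupies a whole daemon process, so four slow requests are enough to make every other proxied request wait.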

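Another potential cause of delay is the load time of the WSGI application when a process restarts. The delay at that point might come from preloading database information or compiling templates.

Technically, your configuration will not restart processes because of maximum requests, inactivity and so on, but a process can still restart if the WSGI script file is touched, or if you are running an automatic code reloader and other code is touched.

To determine whether the delay is related to process restarts, make sure LogLevel is set to "info" in Apache and watch the Apache error log for mod_wsgi messages about process restarts and WSGI script file loading. Also watch for daemon process crashes, as that could be a cause too. Especially since you are not using: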

WSGIApplicationGroup %{GLOBAL}
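
The logging suggestion above could be sketched as follows, assuming the directive goes in the main server configuration or in the virtual host that serves the API. It raises the verbosity of the whole Apache error log, so it may be worth reverting once the restarts or crashes have been ruled in or out.

```apache
# Sketch only: placement in the server config or the API virtual host is assumed.
# At this level mod_wsgi reports daemon process starts, restarts, and
# WSGI script file loads in the Apache error log.
LogLevel info
```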

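Beyond that, it is really hard to speculate. I suggest you consider using a monitoring tool. Options here are production-capable systems such as New Relic or, in a development environment, the Django Debug Toolbar. They may help narrow down where the delay occurs and for which web requests.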

answered 2013-08-04T13:49:32.893

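So it turned out that the network router at the site where the problem occurred had its UDP timeout set to 200 seconds. That was it. It had nothing to do with Apache or anything else. I did set the threads higher and it helped, but that was not the cause of the lag...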

answered 2013-08-05T03:00:54.037