linux - Nginx and php-fpm: cannot get rid of 502 and 504 errors

Question

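I have an Ubuntu server and a fairly heavily loaded website. The server:

is dedicated to nginx and uses php-fpm (no Apache); MySQL runs on a different machine
has 8 GB of RAM
receives about 2000 requests per second.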

Each php-fpm process consumes about 65MB of RAM, according to the top command:

[screenshot: top command output]

Free memory:

admin@myserver:~$ free -m
             total       used       free     shared    buffers     cached
Mem:          7910       7156        753          0        284       2502
-/+ buffers/cache:       4369       3540
Swap:         8099          0       8099

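The problem

Lately I have been experiencing big performance problems. Response times are very long, there are a great many Gateway Timeouts, and in the evenings, when the load gets high, 90% of users just see "Server not found" instead of the website (I cannot seem to reproduce this).

Logs

My Nginx error log is full of the following messages: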

2012/07/18 20:36:48 [error] 3451#0: *241904 upstream prematurely closed connection while reading response header from upstream, client: 178.49.30.245, server: example.net, request: request: "GET /readarticle/121430 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9001", host: "example.net", referrer: "http://example.net/articles"

I tried switching to unix sockets, but I still get these errors:

2012/07/18 19:27:30 [crit] 2275#0: *12334 connect() to unix:/tmp/fastcgi.sock failed (2: No such file or directory) while connecting to upstream, client: 84.237.189.45, server: example.net, request: "GET /readarticle/121430 HTTP/1.1", upstream: "fastcgi://unix:/tmp/fastcgi.sock:", host: "example.net", referrer: "http://example.net/articles"

The php-fpm log is full of these:

[18-Jul-2012 19:23:34] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there  are 0 idle, and 75 total children

I tried increasing the given parameters up to 100, but it still does not seem to be enough.

Configs

Here is my current configuration

php-fpm

listen = 127.0.0.1:9001
listen.backlog = 4096
pm = dynamic
pm.max_children = 130
pm.start_servers = 40
pm.min_spare_servers = 10
pm.max_spare_servers = 40
pm.max_requests = 100

nginx

worker_processes  4;
worker_rlimit_nofile 8192;
worker_priority 0;
worker_cpu_affinity 0001 0010 0100 1000;

error_log  /var/log/nginx_errors.log;

events {
    multi_accept off;
    worker_connections  4096;
}


http {
    include       mime.types;
    default_type  application/octet-stream;

    access_log off;
    sendfile        on;
    keepalive_timeout  65;
    gzip  on;

    # fastcgi parameters
    fastcgi_connect_timeout 120;
    fastcgi_send_timeout 180;
    fastcgi_read_timeout 1000;
    fastcgi_buffer_size 128k;
    fastcgi_buffers 4 256k;
    fastcgi_busy_buffers_size 256k;
    fastcgi_temp_file_write_size 256k;
    fastcgi_intercept_errors on;

    client_max_body_size 128M;

    server {
        server_name example.net;
        root /var/www/example/httpdocs;
        index index.php;
        charset utf-8;
        error_log /var/www/example/nginx_error.log;

        error_page 502 504 = /gateway_timeout.html;

        # rewrite rule
        location / {
            if (!-e $request_filename) {
                rewrite ^(.*)$ /index.php?path=$1 last;
            }
        }
        location ~* \.php {
            fastcgi_pass 127.0.0.1:9001;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_param PATH_INFO $fastcgi_script_name;
            include fastcgi_params;
        }
    }
}

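I would be very grateful for any advice on how to identify the problem and which parameters I can adjust to fix it. Or maybe 8GB of RAM is simply not enough for this kind of load?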

score 1 · Accepted Answer

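A few issues. They are still worth fixing on such a busy site. MySQL may be the root cause for now, but in the longer term you need to do more work.

Caching

I see that one of your error messages shows a GET request going to the PHP upstream. That does not look good for such a high-traffic site (2000 r/s, as you mentioned). This page (/readarticle/121430) looks like a perfectly cacheable page. For one, you can use nginx to cache such pages. Check out fastcgi cache.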

GET /readarticle/121430
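
As a rough idea of what that could look like, here is a minimal sketch built on the location block from the question. The cache path, the zone name "articles", the sizes, the 10 minute lifetime and the "sessionid" cookie name are all assumptions for illustration, not values taken from your setup:

```nginx
# http {} context: declare an on-disk cache.
# Path, zone name and sizes are assumptions; adjust to your disk and RAM.
fastcgi_cache_path /var/cache/nginx/fastcgi levels=1:2 keys_zone=articles:64m
                   max_size=1g inactive=10m;

# server {} context: same PHP location as in the question, with caching added.
location ~* \.php {
    fastcgi_pass 127.0.0.1:9001;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param PATH_INFO $fastcgi_script_name;
    include fastcgi_params;

    fastcgi_cache articles;
    # $request_uri is the original URI, so /readarticle/121430 gets its own entry
    fastcgi_cache_key "$scheme$request_method$host$request_uri";
    fastcgi_cache_valid 200 10m;    # assumed lifetime for successful responses
    # serve a stale copy while PHP is failing or the entry is being refreshed
    fastcgi_cache_use_stale error timeout updating http_500;

    # Hypothetical session cookie: skip the cache for logged-in users.
    fastcgi_cache_bypass $cookie_sessionid;
    fastcgi_no_cache $cookie_sessionid;
}
```

Keep in mind that nginx will not cache a response that carries Set-Cookie or a Cache-Control value such as private or no-cache, so a PHP application that starts a session on every request may need changes before the cache has any effect.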

php-fpm

pm.max_requests = 100

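This value means that a process is killed by the php-fpm master after it has served 100 requests. php-fpm uses this value to guard against memory leaks in third-party code. Your site is very busy, at 2000 r/s. You have at most 130 child processes, and each one can serve at most 100 requests. That means all of them will have been recycled after 13000/2000 = 6.5 seconds. That is far too often (20 processes killed per second). You should start with a value of at least 1000 and keep increasing it as long as you do not see memory leaks. Some people use 10,000 in production.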

nginx.conf

Issue 1:

    if (!-e $request_filename) {
        rewrite ^(.*)$ /index.php?path=$1 last;
    }

should be replaced with the more efficient try_files:

    try_files $uri /index.php?path=$uri;

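This saves you an extra if block inside the location and a regex rewrite rule match.

Issue 2: using a unix socket saves more time than using an IP address (around 10-20% in my experience). That is why php-fpm uses it by default.
Issue 3: you may be interested in setting up keepalive connections between nginx and php-fpm. An example is given on the official nginx site.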
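
For issue 3, a minimal sketch of what the keepalive setup might look like. The upstream name "php_backend" and the pool size of 16 are assumptions; the address is the one from the question:

```nginx
# http {} context: name the php-fpm backend so connections can be pooled.
# "php_backend" is a hypothetical name.
upstream php_backend {
    server 127.0.0.1:9001;
    keepalive 16;    # idle connections kept open per worker (assumed value)
}

# server {} context
location ~* \.php {
    fastcgi_pass php_backend;
    fastcgi_keep_conn on;    # needed for keepalive to work with FastCGI servers
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_param PATH_INFO $fastcgi_script_name;
    include fastcgi_params;
}
```

Without fastcgi_keep_conn, nginx asks the FastCGI server to close the connection after each response, so the keepalive pool in the upstream block would never be used.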

score 1 · Accepted Answer

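I would need to see your php.ini settings. I do not think this is related to MySQL, since you are getting what look like socket errors. Also, does this start happening after the server has been running for a while, or immediately after a restart?

Try restarting the php5-fpm daemon and see what happens while tailing your error log.

Check your php.ini file and also all of your fastcgi_params, which are typically located in /etc/nginx/fastcgi_params. There are plenty of examples out there for what you are trying to do.

Also, do you have the APC PHP caching extension enabled?

If you are on a LAMP stack, it will look like this in your php.ini file: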

extension=apc.so
....
apc.enabled=0

It probably would not hurt to run some MySQL connection load tests from the command line and look at the results.

score 1 · Accepted Answer

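Setting up nginx microcaching would help as well. It serves the same cached response for a few seconds.

http://seravo.fi/2013/optimizing-web-server-performance-with-nginx-and-php has some good information on nginx performance. I followed it myself and am quite happy with the result.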
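
A minimal microcaching sketch, assuming a one second lifetime and a hypothetical cache zone named "micro" (path and sizes are made up for illustration):

```nginx
# http {} context: a small cache zone used only for very short-lived entries.
fastcgi_cache_path /var/cache/nginx/micro levels=1:2 keys_zone=micro:16m
                   max_size=256m inactive=1m;

# Inside the location that passes requests to php-fpm:
fastcgi_cache micro;
fastcgi_cache_key "$scheme$request_method$host$request_uri";
fastcgi_cache_valid 200 1s;          # each page is regenerated at most once per second
fastcgi_cache_lock on;               # only one request per key populates a missing entry
fastcgi_cache_use_stale updating;    # others get the stale copy while it refreshes
```

With a one second lifetime, a page requested hundreds of times per second should reach PHP roughly once per second, while visitors still see content that is at most about a second old.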

score 0 · Accepted Answer

So that this question has an answer:

You should check your MySQL server. Probably it's overloaded or it limits count of parallel MySQL connections. You should find the bottleneck. And according to your top screenshot it doesn't look like either RAM or CPU, then it's most likely I/O.-@VBrat

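Things you might want to do in the future:

1- Increase your RAM size.

2- Use caching. See this article on how caching can speed up your site

3- Reduce the number of queries that are executed.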

score 0 · Accepted Answer

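Set up the APC extension for PHP (check/configure)
MySQL - check the configuration, indexes, slow queries
Install and configure Varnish. It can cache page requests and is very useful for reducing the number of PHP requests and MySQL queries you need to make. It can be tricky with cookies/SSL, but otherwise it is not too difficult and is well worth getting running.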
