php - PHP FPM 7.1 socket leak causing NGINX 504 Gateway Timeout

Question

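I use Laravel Forge to spin up my EC2 environments, which builds a LEMP stack for me. I recently started getting 504 timeouts on requests.

I'm not a system admin (hence the Forge subscription), but I looked through the logs and narrowed the problem down to the following two recurring entries: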

In /var/log/nginx/default-error.log:

2017/09/15 09:32:17 [error] 2308#2308: *1 upstream timed out (110: Connection timed out) while sending request to upstream, client: x.x.x.x, server: xxxx.com, request: "POST /upload HTTP/2.0", upstream: "fastcgi://unix:/var/run/php/php7.1-fpm.sock", host: "xxxx.com", referrer: "https://xxxx.com/rest/of/the/path"

In /var/log/php7.1-fpm-log:

[15-Sep-2017 09:35:09] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 14 total children

It seems that fpm opens connections that are never closed, and from my RDS load logs I can see that RAM is constantly maxed out.
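To watch how often the pool runs out of idle workers, the php-fpm warning quoted above can be parsed for its counters. The following is a minimal sketch, assuming the log line keeps the exact wording shown in this question; the function name `parse_busy_warning` and the regular expression are illustrative and not part of php-fpm.

```python
import re

# Matches the "seems busy" pool warning in the wording quoted above.
# Assumption: other php-fpm versions may phrase this message differently.
BUSY_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] seems busy .*?"
    r"spawning (?P<spawning>\d+) children, "
    r"there are (?P<idle>\d+) idle, "
    r"and (?P<total>\d+) total children"
)


def parse_busy_warning(line):
    """Return the pool name and worker counters, or None if the line does not match."""
    match = BUSY_RE.search(line)
    if match is None:
        return None
    return {
        "pool": match.group("pool"),
        "spawning": int(match.group("spawning")),
        "idle": int(match.group("idle")),
        "total": int(match.group("total")),
    }


# The warning from this question, used as sample input.
SAMPLE = (
    "[15-Sep-2017 09:35:09] WARNING: [pool www] seems busy "
    "(you may need to increase pm.start_servers, or pm.min/max_spare_servers), "
    "spawning 8 children, there are 0 idle, and 14 total children"
)

if __name__ == "__main__":
    print(parse_busy_warning(SAMPLE))
```

A run of such warnings with `idle` at 0 suggests every worker is stuck waiting on something, which fits requests blocking on slow database queries.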

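What I have tried:

Rolling back to a known stable version of my application (from 2 months ago)
Reinstalling my EC2 instance with 5.6, 7.0, and 7.1 (and their respective fpm)
Doing all of the above on 14.04 and 16.04
Creating a larger RDS instance

The only thing that works right now is a powerful RDS instance (8 GB RAM) plus killing the fpm pool connections every 300 requests. But throwing resources at the problem is obviously not a solution.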

Here is my configuration in /etc/php/7.1/fpm/pool.d/www.conf:

user = forge
group = forge
listen = /run/php/php7.1-fpm.sock
listen.owner = www-data
listen.group = www-data
listen.mode = 0666
pm = dynamic
pm.max_children = 30
pm.start_servers = 7
pm.min_spare_servers = 6
pm.max_spare_servers = 10
pm.process_idle_timeout = 7s;
pm.max_requests = 300

Here is my nginx.conf:

listen 80;
listen [::]:80;
listen 443 ssl http2;
listen [::]:443 ssl http2;
server_name xxxx.com;
root /home/forge/xxxx.com/public;

# FORGE SSL (DO NOT REMOVE!)
ssl_certificate /etc/nginx/ssl/xxxx.com/111111/server.crt;
ssl_certificate_key /etc/nginx/ssl/xxxx.com/111111/server.key;

ssl_protocols xxxx;
ssl_ciphers ...;
ssl_prefer_server_ciphers on;
ssl_dhparam /etc/nginx/dhparams.pem;

add_header X-Frame-Options "SAMEORIGIN";
add_header X-XSS-Protection "1; mode=block";
add_header X-Content-Type-Options "nosniff";

index index.html index.htm index.php;

charset utf-8;

# FORGE CONFIG (DOT NOT REMOVE!)
include forge-conf/xxxx.com/server/*;

location / {
    try_files $uri $uri/ /index.php?$query_string;
}

location = /favicon.ico { access_log off; log_not_found off; }
location = /robots.txt  { access_log off; log_not_found off; }

access_log /var/log/nginx/xxxx.com-access.log;
error_log  /var/log/nginx/xxxx.com-error.log error;

error_page 404 /index.php;

location ~ \.php$ {
    fastcgi_split_path_info ^(.+\.php)(/.+)$;
    fastcgi_pass unix:/var/run/php/php7.1-fpm.sock;
    fastcgi_index index.php;
    fastcgi_read_timeout 60;
    include fastcgi_params;
}

location ~ /\.(?!well-known).* {
    deny all;
}

location ~* \.(?:ico|css|js|gif|jpe?g|png)$ {
    expires 30d;
    add_header Pragma public;
    add_header Cache-Control "public";
}

score 2 · Accepted Answer

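OK, after a lot of debugging and testing, I found the following causes.

The main cause in my case: the AWS RDS MySQL instance I was using had 500 MB of memory. Looking back, all of these problems started when the database grew past 400 MB.
- Solution: make sure you always have RAM equal to at least twice your database size. Otherwise the whole B+Tree does not fit in memory, so it has to swap constantly. This can push your query times past 15 seconds.
The main cause of this kind of problem in general: unoptimized SQL queries.
- Solution: keep a dataset on your localhost that is similar in size to the data on your server, so that slow queries show up during development.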
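The 2x rule of thumb from the first solution can be checked with simple arithmetic. The sketch below is illustrative only: the function name `check_ram_headroom` and the `factor` parameter are assumptions made for this example, and the rule itself is this answer's heuristic, not an official MySQL or AWS sizing formula.

```python
def check_ram_headroom(db_size_mb, ram_mb, factor=2.0):
    """Compare instance RAM against `factor` times the database size.

    The default factor of 2.0 is the rule of thumb from this answer,
    not a vendor recommendation.
    """
    required_mb = db_size_mb * factor
    return {
        "required_mb": required_mb,
        "sufficient": ram_mb >= required_mb,
        "shortfall_mb": max(0.0, required_mb - ram_mb),
    }


if __name__ == "__main__":
    # Figures from this answer: a 400 MB database on a 500 MB instance.
    print(check_ram_headroom(db_size_mb=400, ram_mb=500))
```

With the figures from this answer, a 400 MB database calls for 800 MB of RAM under this rule, so the 500 MB instance falls 300 MB short.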
