I've run into a critical problem.
My application architecture is as follows:
nginx -> web app (express/nodejs) -> api (jetty/java) -> mysql
The API application is already well optimized, so there is no need to discuss its performance here (about 200 ms/request, 100 requests/sec).
My web app:
While profiling, I noticed that HTML rendering with the Swig template engine blocks the event loop for too long, which significantly increases the wait time of the other pending requests.
Rendering a 1 MB text/html response with a Swig template takes roughly 250 ms.
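The render call that shows up as the blocking span in the profile looks like the handler below; the timing wrapper is only an illustration of how that ~250 ms could be measured, not the actual profiling code:

// Illustrative only: timing a Swig render via Express's res.render callback.
// 'list' and `model` mirror the handler shown further down; this wrapper is not in the real app.
function renderTimed(req, res, next, model) {
  var start = process.hrtime();
  res.render('list', model, function(err, html) {
    if (err) return next(err);
    var diff = process.hrtime(start);
    var ms = diff[0] * 1e3 + diff[1] / 1e6;
    console.log('Swig render took ' + ms.toFixed(2) + 'ms'); // ~250 ms for a 1 MB page
    res.send(html);
  });
}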
Here is the output of my stress test:
$ node stress.js 20
Receive response [0] - 200 - 431.682654ms
Receive response [1] - 200 - 419.248099ms
Receive response [2] - 200 - 670.558033ms
Receive response [4] - 200 - 920.763105ms
Receive response [3] - 200 - 986.20115ms
Receive response [7] - 200 - 1521.330763ms
Receive response [5] - 200 - 1622.569327ms
Receive response [9] - 200 - 1424.500137ms
Receive response [13] - 200 - 1643.676996ms
Receive response [14] - 200 - 1595.958319ms
Receive response [10] - 200 - 1798.043086ms
Receive response [15] - 200 - 1551.028243ms
Receive response [8] - 200 - 1944.247382ms
Receive response [6] - 200 - 2044.866157ms
Receive response [11] - 200 - 2162.960215ms
Receive response [17] - 200 - 1941.155794ms
Receive response [16] - 200 - 1992.213563ms
Receive response [12] - 200 - 2315.330372ms
Receive response [18] - 200 - 2571.841722ms
Receive response [19] - 200 - 2523.899486ms
AVG: 1604.10ms
As you can see, the later a request arrives, the longer it has to wait.
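For reference, stress.js simply fires N concurrent GET requests and logs each response time; the sketch below shows what it does (the URL, port, and path are assumptions, and the real script may differ in details):

// Sketch of stress.js: fire N concurrent GETs and report per-request time plus the average.
var http = require('http');

var n = parseInt(process.argv[2], 10) || 20;
var times = [];

for (var i = 0; i < n; i++) {
  (function (idx) {
    var start = process.hrtime();
    http.get('http://127.0.0.1:3000/list', function (res) { // URL/port are assumptions
      res.resume(); // drain the body so 'end' fires
      res.on('end', function () {
        var diff = process.hrtime(start);
        var ms = diff[0] * 1e3 + diff[1] / 1e6;
        times.push(ms);
        console.log('Receive response [' + idx + '] - ' + res.statusCode + ' - ' + ms + 'ms');
        if (times.length === n) {
          var avg = times.reduce(function (a, b) { return a + b; }, 0) / n;
          console.log('AVG: ' + avg.toFixed(2) + 'ms');
        }
      });
    });
  })(i);
}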
When I return just a response code instead of rendering the HTML, by modifying the code slightly:
function render(req, res, next, model) {
  return res.status(200).end(); // added: respond immediately, skipping template rendering
  res.render('list', model);    // now unreachable
}
the stress test output becomes:
$ node stress.js 20
Receive response [0] - 200 - 147.738725ms
Receive response [1] - 200 - 204.656645ms
Receive response [2] - 200 - 176.583635ms
Receive response [3] - 200 - 218.785931ms
Receive response [4] - 200 - 194.479036ms
Receive response [6] - 200 - 191.531871ms
Receive response [5] - 200 - 265.371646ms
Receive response [7] - 200 - 294.373466ms
Receive response [8] - 200 - 262.097708ms
Receive response [10] - 200 - 282.183757ms
Receive response [11] - 200 - 249.842496ms
Receive response [9] - 200 - 371.228602ms
Receive response [14] - 200 - 236.945983ms
Receive response [13] - 200 - 304.847457ms
Receive response [12] - 200 - 377.766879ms
Receive response [15] - 200 - 332.011981ms
Receive response [16] - 200 - 306.347012ms
Receive response [17] - 200 - 284.942474ms
Receive response [19] - 200 - 249.047099ms
Receive response [18] - 200 - 315.11977ms
AVG: 263.30ms
I have tried a few solutions before, but none of them reduced the response time:
Using Node cluster (2 workers on my server)
if (conf.cluster) {
  // cluster setup
  var cluster = require('cluster');
  var numCPUs = require('os').cpus().length;

  if (cluster.isMaster) {
    for (var i = 0; i < numCPUs; i++) {
      cluster.fork();
    }
    cluster.on('exit', function(worker, code, signal) {
      console.log('Worker ' + worker.process.pid + ' died');
      // create new worker
      cluster.fork();
    });
  } else {
    rek('server').listen(conf.port, function() {
      console.log('Application started at port ' + conf.port + ' [PID: ' + process.pid + ']');
    });
  }
} else {
  rek('server').listen(conf.port, function() {
    console.log('Application started at port ' + conf.port + ' [PID: ' + process.pid + ']');
  });
}
Using JXCore with 16 threads (the maximum thread count)
jx mt-keep:16 app.js
Using NGINX load balancing
Start 4 Node processes:
$ PORT=3000 forever start app.js
$ PORT=3001 forever start app.js
$ PORT=3002 forever start app.js
$ PORT=3003 forever start app.js
nginx.conf
upstream webapp {
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    server 127.0.0.1:3003;
}

server {
    listen 80;

    location / {
        proxy_pass http://webapp;
    }

    [...]
}
I assumed that all of the solutions above would provide multiple processes/threads, so that a heavy task like HTML rendering in one request would not block the others. But the result was not what I expected: the wait times did not go down, even though the logs show that requests really are being served by multiple processes/threads.
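(The per-request log mentioned above is just a tiny middleware that prints process.pid; something along these lines, assuming `app` is the Express instance exported by the server module:)

// Illustrative middleware: log which worker PID handles each request.
app.use(function (req, res, next) {
  console.log('PID ' + process.pid + ' handling ' + req.method + ' ' + req.url);
  next();
});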
Am I missing something here?
Or can you suggest another way to reduce the wait time?