
I am working on a web service implemented on top of nginx + gunicorn + Django. The clients are smartphone applications. The application needs to make some long-running calls to external APIs (Facebook, Amazon S3...), so the server simply queues the job on a job server (using Celery over Redis).

Whenever possible, once the server has queued the job, it returns right away, and the HTTP connection is closed. This works fine and allows the server to sustain very high load.

client                   server                 job server
  .                        |                        |
  .                        |                        |
  |------HTTP request----->|                        |
  |                        |--------queue job------>|
  |<--------close----------|                        |
  .                        |                        |
  .                        |                        |

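The fire-and-forget flow above can be sketched with the standard library standing in for Celery/Redis (a real Django view would call `task.delay(...)` instead; all names here are illustrative):

```python
import queue
import threading
import uuid

job_queue = queue.Queue()   # stands in for the Redis broker
results = {}                # stands in for the Celery result backend

def worker():
    """Job-server side: pull jobs off the queue and run them."""
    while True:
        job_id, payload = job_queue.get()
        results[job_id] = payload.upper()  # pretend this is a slow API call
        job_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

def view(payload):
    """Server side: queue the job, then return immediately and close."""
    job_id = str(uuid.uuid4())
    job_queue.put((job_id, payload))
    return {"status": "queued", "job_id": job_id}

response = view("hello")
job_queue.join()  # the demo waits only so we can inspect the result
print(response["status"], results[response["job_id"]])
```

The view never blocks on the job itself; the connection can be closed as soon as the `queued` response is sent.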
But in some cases, the client needs to get the result as soon as the job is finished. Unfortunately, there's no way the server can contact the client once the HTTP connection is closed. One solution would be to have the client application poll the server every few seconds until the job is completed. I would like to avoid this solution if possible, mostly because it would hinder the responsiveness of the service, and also because it would load the server with many unnecessary poll requests.

In short, I would like to keep the HTTP connection up and running, doing nothing (except perhaps sending some whitespace every once in a while to keep the TCP connection alive, just like Amazon S3 does), until the job is done and the server returns the result.

client                   server                 job server
  .                        |                        |
  .                        |                        |
  |------HTTP request----->|                        |
  |                        |--------queue job------>|
  |<------keep-alive-------|                        |
  |         [...]          |                        |
  |<------keep-alive-------|                        |
  |                        |<--------result---------|
  |<----result + close-----|                        |
  .                        |                        |
  .                        |                        |
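The keep-alive pattern above can be sketched as a generator suitable for a streaming response body (in Django this would be wrapped in a `StreamingHttpResponse`; `poll_result` and the poll limit are illustrative stand-ins):

```python
import itertools

def stream_result(poll_result, max_polls=50):
    """Yield whitespace keep-alives until poll_result() returns a value,
    then yield the result and stop (which closes the connection)."""
    for _ in range(max_polls):
        result = poll_result()
        if result is not None:
            yield result
            return
        yield " "  # keep-alive byte, as in the S3 trick above
    yield '{"error": "timeout"}'

# Toy poll function: the job is "done" after three polls.
attempts = itertools.count()
body = "".join(
    stream_result(lambda: '{"ok": true}' if next(attempts) >= 3 else None)
)
print(repr(body))
```

In a real deployment each iteration would sleep between polls; with an async worker that sleep yields to other requests instead of blocking a process.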

How can I implement long-running HTTP connections in an efficient way, assuming the server is under very high load (it is not the case yet, but the goal is to be able to sustain the highest possible load, with hundreds or thousands of requests per second)?

Offloading the actual jobs to other servers should keep CPU usage low on the server, but how can I avoid processes piling up and using all the server's RAM, or incoming requests being dropped because of too many open connections?

This is probably mostly a matter of configuring nginx and gunicorn properly. I have read a bit about async workers based on greenlets in gunicorn: the documentation says that async workers are used by "Applications making long blocking calls (Ie, external web services)", which sounds perfect. It also says that "In general, an application should be able to make use of these worker classes with no changes". This sounds great. Any feedback on this?

Thanks for your advice.


1 Answer


I am answering my own question; maybe someone has a better solution.

Reading gunicorn's documentation a bit further, and reading some more about eventlet and gevent, I think gunicorn answers my question perfectly. Gunicorn has a master process that manages a pool of workers. Each worker can be either synchronous (single-threaded, handling one request at a time) or asynchronous (each worker actually handles multiple requests almost at the same time).

Synchronous workers are very simple to understand and to debug, and if a worker fails, only one request is lost. But if a worker is stuck in a long-running external API call, it is basically sleeping. So under high load, all workers may end up sleeping while waiting for results, and requests will end up being dropped.

So the solution is to change the default worker type from synchronous to asynchronous (choosing either eventlet or gevent; there is a comparison between the two). Now each worker runs multiple green threads, each of which is extremely lightweight. Whenever a thread has to wait for some I/O, another green thread resumes execution. This is called cooperative multitasking. It is very fast and very lightweight (a single worker can handle thousands of concurrent requests, if they are waiting on I/O). Exactly what I need.
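Gevent and eventlet may not be installed everywhere, but the stdlib `asyncio` module illustrates the same cooperative idea: one thread, many concurrent waiters, with control switching whenever a request waits on I/O (the 0.1 s sleep stands in for a slow external API call):

```python
import asyncio
import time

async def handle_request(i):
    # While this "request" sleeps on simulated I/O, the event loop
    # resumes the other requests instead of blocking the whole worker.
    await asyncio.sleep(0.1)
    return i

async def main():
    # 100 concurrent "requests" served by a single thread.
    return await asyncio.gather(*(handle_request(i) for i in range(100)))

start = time.monotonic()
results = asyncio.run(main())
elapsed = time.monotonic() - start
print(len(results), "requests in", round(elapsed, 2), "s")
```

If the 100 sleeps ran sequentially this would take about 10 seconds; cooperatively, it finishes in roughly the time of one sleep.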

I was wondering how I should change my existing code, but apparently the standard Python modules are monkey-patched by gunicorn at startup (actually by eventlet or gevent), so all existing code can run without change and still behave nicely with other threads.
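Monkey-patching here simply means replacing the blocking functions in the standard modules with cooperative versions at startup (gevent does this via `gevent.monkey.patch_all()`). A toy illustration of the mechanism, using only the standard library:

```python
import time

calls = []
_original_sleep = time.sleep

def cooperative_sleep(seconds):
    # A real patch would yield control to other green threads here
    # instead of blocking; this demo just records the call.
    calls.append(seconds)
    _original_sleep(0)

time.sleep = cooperative_sleep  # the "patch": existing code is unchanged...

def existing_code():
    time.sleep(3)  # ...but now goes through the cooperative version
    return "done"

status = existing_code()
time.sleep = _original_sleep    # restore the original
print(status, calls)
```

This is why the existing blocking code keeps working unmodified: it still calls `time.sleep`, `socket.recv`, and friends, but those names now point at cooperative implementations.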

There are many parameters that can be tuned in gunicorn: the maximum number of simultaneous clients using the worker_connections parameter, the maximum number of pending connections using the backlog parameter, etc.
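These settings go in a gunicorn configuration file; the values below are illustrative starting points, not tuned recommendations:

```python
# gunicorn.conf.py
worker_class = "gevent"     # async workers (or "eventlet")
workers = 4                 # master process manages this pool
worker_connections = 1000   # max simultaneous clients per worker
backlog = 2048              # max pending connections in the listen queue
timeout = 30                # workers silent this long are killed and restarted
```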

Great, I'll start testing right away!

Answered 2012-08-09T16:16:43.783