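After a lot of investigation, I found a memory leak that shows up after serving a few hundred thousand HTTP POST requests. Strangely, the leak occurs only when running under PyPy.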

Here is a sample server:

from twisted.internet import reactor
import tornado.ioloop

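do_tornado = False  # set to True to serve with Tornado instead of cyclone (Twisted)
port = 8888         # must match the port in the client's bid_url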

if do_tornado:
    from tornado.web import RequestHandler, Application
else:
    from cyclone.web import RequestHandler, Application

class MainHandler(RequestHandler):
    def get(self):
        self.write("Hello, world")

    def post(self):
        self.write("Hello, world")

if __name__ == "__main__":
    routes = [(r"/", MainHandler)]
    application = Application(routes)

    print(port)
    if do_tornado:
        application.listen(port)
        tornado.ioloop.IOLoop.instance().start()
    else:
        reactor.listenTCP(port, application)
        reactor.run()

Here is the test code I use to generate requests:

from twisted.internet import reactor, defer
from twisted.internet.task import LoopingCall

from twisted.web.client import Agent, HTTPConnectionPool
from twisted.web.iweb import IBodyProducer

from zope.interface import implements

pool = HTTPConnectionPool(reactor, persistent=True)
pool.retryAutomatically = False
pool.maxPersistentPerHost = 10
agent = Agent(reactor, pool=pool)

bid_url = 'http://localhost:8888'

class StringProducer(object):
    implements(IBodyProducer)

    def __init__(self, body):
        self.body = body
        self.length = len(body)

    def startProducing(self, consumer):
        consumer.write(self.body)
        return defer.succeed(None)

    def pauseProducing(self):
        pass

    def stopProducing(self):
        pass


def callback(a):
    pass

def error_callback(error):
    pass

def loop():
    d = agent.request('POST', bid_url, None, StringProducer("Hello, world"))
    #d = agent.request('GET', bid_url)
    d.addCallback(callback).addErrback(error_callback)


def main():
    exchange = LoopingCall(loop)
    exchange.start(0.02)

    #log.startLogging(sys.stdout)
    reactor.run()

main()

Note that this code does not leak on CPython, nor with Tornado on PyPy! It leaks only when Twisted and PyPy are used together, and only with POST requests.

To see the leak, you have to send hundreds of thousands of requests.

Note that when PYPY_GC_MAX is set, the process eventually crashes.
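
One way to watch the growth described here without waiting for a crash is to log the server's peak resident set size at intervals. The following is only a minimal sketch, assuming a Unix-like system where the standard `resource` module is available; the helper name `peak_rss_kb` is hypothetical and not part of the original code.

```python
import resource
import sys


def peak_rss_kb():
    """Return this process's peak resident set size in kilobytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes, macOS reports it in bytes.
    if sys.platform == "darwin":
        peak //= 1024
    return peak
```

In the server above, this could be called periodically (for example from a `LoopingCall` on the Twisted side) and the values printed; a figure that keeps climbing under a steady POST load would be consistent with the leak.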

What is going on?

1 Answer

It turns out the leak was caused by BytesIO (from the io module).

Here is how to reproduce the leak on PyPy:

from io import BytesIO
while True: a = BytesIO()
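
The loop above never terminates, so it can only be judged by watching the process from outside. A bounded variant that reports how much the peak memory grew might look like the sketch below. It assumes a Unix-like system with the standard `resource` module; the function name `rss_growth_after` and the iteration count are hypothetical choices for illustration.

```python
import resource
from io import BytesIO


def rss_growth_after(iterations):
    """Create and drop `iterations` BytesIO objects; return peak-RSS growth.

    The result is in the platform's ru_maxrss units
    (kilobytes on Linux, bytes on macOS).
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(iterations):
        a = BytesIO()  # the previous object becomes garbage on rebinding
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before
```

Calling `rss_growth_after(1000000)` on an interpreter without the leak should report little or no growth, while an affected PyPy build would be expected to report a figure that scales with the iteration count.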

Here is the fix: https://bitbucket.org/pypy/pypy/commits/40fa4f3a0740e3aac77862fe8a853259c07cb00b

Answered 2014-01-30T08:26:03.453