python - NDB not clearing memory during a long request

Question

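I am currently offloading a long-running job to a TaskQueue to calculate connections between NDB entities in the Datastore.

Basically, this queue handles several lists of entity keys that are to be related to another query by the node_in_connected_nodes function in the GetConnectedNodes class: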

from collections import deque

from google.appengine.ext import ndb


class GetConnectedNodes(object):
    """Class for getting the connected nodes from a list of nodes in a paged way"""

    def __init__(self, list, query):
        # super(GetConnectedNodes, self).__init__()
        # Build one Node key per ID in the input list.
        self.nodes = [ndb.model.Key('Node', '%s' % x) for x in list]
        self.cursor = 0
        self.MAX_QUERY = 100
        # logging.info('Max query - %d' % self.MAX_QUERY)
        self.max_connections = len(list)
        self.connections = deque()
        self.query = query

    def node_in_connected_nodes(self):
        """Checks if a node exists in the connected nodes of the next node in the
           node list.
           Will return False if it doesn't, or the list of evidences for the connection
           if it does.
           """
        while self.cursor < self.max_connections:
            if len(self.connections) == 0:
                # Queue is empty: fetch the next batch of at most MAX_QUERY nodes.
                end = self.MAX_QUERY
                if self.max_connections - self.cursor < self.MAX_QUERY:
                    end = self.max_connections - self.cursor
                self.connections.clear()
                self.connections = deque(ndb.model.get_multi_async(self.nodes[self.cursor:self.cursor+end]))

            connection = self.connections.popleft()
            connection_nodes = connection.get_result().connections

            if self.query in connection_nodes:
                connection_sources = connection.get_result().sources
                # yields (current node index in the list, sources)
                yield (self.cursor, connection_sources[connection_nodes.index(self.query)])
            self.cursor += 1

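Here a Node has a repeated property, connections, that contains an array of other Node key IDs, and a matching sources array for each of those connections.

The yielded results are stored in a blobstore.

The problem I am having is that memory is somehow not cleared after an iteration of the connection function. The following log shows the memory used by AppEngine just before a new GetConnectedNodes instance is created: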

I 2012-08-23 16:58:01.643 Prioritizing HGNC:4839 - mem 32
I 2012-08-23 16:59:21.819 Prioritizing HGNC:3003 - mem 380
I 2012-08-23 17:00:00.918 Prioritizing HGNC:8932 - mem 468
I 2012-08-23 17:00:01.424 Prioritizing HGNC:24771 - mem 435
I 2012-08-23 17:00:20.334 Prioritizing HGNC:9300 - mem 417
I 2012-08-23 17:00:48.476 Prioritizing HGNC:10545 - mem 447
I 2012-08-23 17:01:01.489 Prioritizing HGNC:12775 - mem 485
I 2012-08-23 17:01:46.084 Prioritizing HGNC:2001 - mem 564
C 2012-08-23 17:02:18.028 Exceeded soft private memory limit with 628.609 MB after servicing 1 requests total

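Apart from some fluctuation, memory just keeps increasing, even though none of the previous values are accessed. I have found it quite hard to debug this or to work out whether I have a memory leak somewhere, but I seem to have traced it back to that class. Any help would be appreciated.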

score 10 · Accepted Answer

We had similar issues (with long-running requests). We solved them by turning off the default ndb cache. You can read more about it here
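The answer does not show how the cache was turned off, so the following is only a minimal sketch of one common way to do it, not necessarily what the answerer did. The helper name `disable_in_context_cache` is hypothetical; `set_cache_policy` and `clear_cache` are methods of the NDB context object returned by `ndb.get_context()`.

```python
def disable_in_context_cache(ctx):
    """Turn off NDB's in-context cache on the given context.

    `ctx` is assumed to be the object returned by ndb.get_context().
    """
    # A policy function that returns False for every key disables
    # in-context caching for entities loaded from now on.
    ctx.set_cache_policy(lambda key: False)
    # Drop whatever the context has already cached during this request.
    ctx.clear_cache()
    return ctx


# Hypothetical usage at the top of a task handler (needs the App Engine SDK):
#
#     from google.appengine.ext import ndb
#     disable_in_context_cache(ndb.get_context())
#
# Per-call alternative that leaves the context policy untouched:
#
#     futures = ndb.get_multi_async(keys, use_cache=False)
```

With the cache left on, every entity fetched during a request stays referenced by the context until the request ends, which would be consistent with the steadily growing memory figures in the question's log.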

score 1 · Answer

In our case, this was caused by having AppEngine Appstats enabled.

After disabling it, memory consumption returned to normal.

score -3 · Answer

You can call gc.collect() at the start of each request.

answered 2012-08-23T21:44:24.793
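As a rough sketch of this suggestion, the collection can be wrapped in a decorator so that it runs before every handler call. The names `collect_garbage_first` and `process_task` are hypothetical and only illustrate the pattern.

```python
import functools
import gc


def collect_garbage_first(handler):
    """Decorator that runs a full garbage collection before calling `handler`."""
    @functools.wraps(handler)
    def wrapper(*args, **kwargs):
        # Reclaim unreachable objects (including reference cycles) left over
        # from earlier work before starting the new request.
        gc.collect()
        return handler(*args, **kwargs)
    return wrapper


@collect_garbage_first
def process_task(payload):
    # Placeholder body standing in for the real request handler.
    return len(payload)
```

Note that `gc.collect()` only frees objects that are no longer reachable. Anything still referenced, for example entities held in a cache, stays in memory, so this may not help with the growth described in the question.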
