python - How do I close file objects when downloading files over FTP with Twisted?

Question

I have the following code:

for f in fileListProtocol.files:
    if f['filetype'] == '-':
        filename = os.path.join(directory['filename'], f['filename'])
        print 'Downloading %s...' % (filename)
        newFile = open(filename, 'w+')
        d = ftpClient.retrieveFile(filename, FileConsumer(newFile))
        d.addCallback(closeFile, newFile)

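Unfortunately, after downloading a couple hundred of the 1000+ files in the directory in question, I get an IOError about too many open files. Why does this happen when I should be closing each file after it has been downloaded? If there's a more idiomatic way to approach the whole task of downloading lots of files, I'd love to hear it too. Thanks.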

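Update: Jean-Paul's DeferredSemaphore example plus Matt's FTPFile did the trick. For some reason, using a Cooperator instead of DeferredSemaphore would download a few files and then fail because the FTP connection had died.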

score 1 · Accepted Answer

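You're opening every file in fileListProtocol.files up front, downloading the contents into them, and closing each one only when its download completes. So you have len(fileListProtocol.files) files open at the start of the process. If that list is long enough, you'll hit the limit on open files.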

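You probably want to limit yourself to some fairly small number of parallel downloads at a time (if FTP even supports parallel downloads, which I'm not entirely sure is the case).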

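http://jcalderone.livejournal.com/24285.html and the question "Queue remote calls to a Python Twisted perspective broker?" might be of some help in figuring out how to limit the number of downloads you start in parallel.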

score 1 · Accepted Answer

Assuming you're using FTPClient from twisted.protocols.ftp... and I'd definitely hesitate before contradicting JP.

It seems that the FileConsumer you're passing to retrieveFile gets adapted to IProtocol by twisted.internet.protocol.ConsumerToProtocolAdapter, which never calls unregisterProducer, so FileConsumer doesn't close the file object.

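I've knocked up a quick protocol that you can use to receive files. I think it should only open the file when appropriate. It's totally untested; you would use it in place of FileConsumer in your code above, and you wouldn't need the addCallback.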

from twisted.python import log
from twisted.internet import interfaces
from zope.interface import implementer

@implementer(interfaces.IProtocol)
class FTPFile(object):
    """
    A consumer for FTP input that writes data to a file.

    @ivar filename: a filename to be opened for writing.
    """

    def __init__(self, filename):
        self.fObj = None
        self.filename = filename

    def makeConnection(self, transport):
        # Open the file only once the data connection exists.
        self.fObj = open(self.filename, 'wb')
        log.msg('Opened %s for writing' % self.filename)

    def connectionLost(self, reason):
        # The transfer is over (or has failed), so release the file.
        self.fObj.close()
        log.msg('Closed %s' % self.filename)

    def dataReceived(self, data):
        self.fObj.write(data)
