
I have a Python script that searches a directory for files and runs indefinitely for as long as the computer is up. Here is the code:

import fnmatch
import os
import shutil
import datetime
import time
import gc
# This is a python script that removes the "conflicted" copies of
# files that dropbox creates when one computer has the same copy of
# a file as another computer. 
# Written by Alexander Alvonellos
# 05/10/2012

class cleanUpConflicts:
    rootPath = r'D:\Dropbox'
    destDir = r'D:\Conflicted'
    def __init__(self):
        self.removeConflicted()
        return

    def cerr(self, message):
        f = open('./LOG.txt', 'a')
        date = str(datetime.datetime.now())
        s = ''
        s += date[0:19] #strip floating point
        s += ' : '
        s += str(message)
        s += '\n'
        f.write(s)
        f.close()
        del f
        del s
        del date
        return

    def removeConflicted(self):
        matches = []
        for root, dirnames, filenames in os.walk(self.rootPath):
            for filename in fnmatch.filter(filenames, '*conflicted*.*'):
                matches.append(os.path.join(root, filename))
                self.cerr(os.path.join(root, filename))
                shutil.move(os.path.join(root, filename), os.path.join(self.destDir, filename))
        del matches
        return

def main():
    while True:
        conf = cleanUpConflicts()
        gc.collect()
        del conf
        reload(os)
        reload(fnmatch)
        reload(shutil)
        time.sleep(10)
    return

main()

Anyway, there is a memory leak: it grows by nearly a megabyte every ten seconds or so. I don't understand why the memory isn't being freed. Left alone, the script just keeps eating memory without even trying. It's frustrating. Does anyone have any tips? I think I've tried everything.

Here is an updated version after making some of the changes suggested here:

import fnmatch
import os
import shutil
import datetime
import time
import gc
import re
# This is a python script that removes the "conflicted" copies of 
# files that dropbox creates when one computer has the same copy of
# a file as another computer. 
# Written by Alexander Alvonellos
# 05/10/2012

rootPath = r'D:\Dropbox'
destDir = r'D:\Conflicted'

def cerr(message):
    f = open('./LOG.txt', 'a')
    date = str(datetime.datetime.now())
    s = ''
    s += date[0:19] #strip floating point
    s += ' : '
    s += str(message)
    s += '\n'
    f.write(s)
    f.close()
    return


def removeConflicted():
    for root, dirnames, filenames in os.walk(rootPath):
        for filename in fnmatch.filter(filenames, '*conflicted*.*'):
            cerr(os.path.join(root, filename))
            shutil.move(os.path.join(root, filename), os.path.join(destDir, filename))
    return


def main():
    #while True:
    for i in xrange(0,2):
        #time.sleep(1)
        removeConflicted()
        re.purge()
        gc.collect()
    return
main()

I've done some research on the problem, and it seems there may be a bug in fnmatch: it has a regular-expression engine that doesn't clear its cache after use. That's why I'm calling re.purge(). I've been tinkering with this for hours.
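For what it's worth, fnmatch builds its regexes through re: it translates the shell glob into regex source text, compiles it, and keeps the compiled object in a bounded internal cache keyed by the pattern. Since this script uses a single fixed pattern, that cache holds one entry and shouldn't grow. A quick check (the sample filenames are made up):

```python
import fnmatch
import re

# fnmatch.translate() converts a shell glob into regex source text;
# fnmatch compiles and caches the result, so a fixed pattern reuses
# one compiled regex instead of accumulating new ones.
src = fnmatch.translate('*conflicted*.*')
regex = re.compile(src)
assert regex.match('report (conflicted copy 2012-05-10).txt')
assert not regex.match('report.txt')
```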

I've also found that doing this:

print gc.collect()

returns 0 on every iteration.
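That return value is meaningful: gc.collect() reports how many unreachable objects it found, and it only ever finds reference cycles; everything else is freed immediately by reference counting. So a steady 0 suggests the leak is not cyclic Python garbage. A small demonstration (Node is just an illustrative class):

```python
import gc

class Node(object):
    pass

a = Node()
b = Node()
a.other = b
b.other = a          # reference cycle: refcounts never drop to zero
del a
del b
found = gc.collect() # only the cycle collector can reclaim these
assert found > 0     # nonzero: it found the cycle

# Objects without cycles are freed immediately by reference counting,
# so collect() returning 0 every pass means no cyclic garbage existed.
```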

Whoever downvoted me is clearly mistaken. I genuinely need some help. Here's the link I'm talking about: Why am I leaking memory with this python loop?


2 Answers


At a guess, something is holding a reference to the instance created on each iteration of main.

Suggestions:

  1. Drop the class and make it one or two functions
  2. Drop matches; it isn't used
  3. Look at inotify (Linux) or its Windows equivalent; it can watch a directory and act only when something changes, with no continuous scanning
answered 2012-05-10T21:58:55.223

Your code can be shortened to:

import fnmatch
import os
import shutil
import datetime
import time

ROOT_PATH = r'D:/Dropbox'
DEST_DIR = r'D:/Conflicted'

def cerr(message, log):
    date = str(datetime.datetime.now())
    msg = "%s : %s\n" % (date[0:19], message)
    log.write(msg)

def removeConflicted(log):
    for root, dirnames, filenames in os.walk(ROOT_PATH):
        for filename in fnmatch.filter(filenames, '*conflicted*.*'):
            # 1: comment out this line and check for leak
            cerr(os.path.join(root, filename), log)
            # 2: then comment out this line instead and check
            shutil.move(
                os.path.join(root, filename), 
                os.path.join(DEST_DIR, filename))


def main():
    with open('./LOG.txt', 'a') as log:
        while True:
            print "loop"
            removeConflicted(log)
            time.sleep(10)

if __name__ == "__main__":
    main()

See whether the memory leak occurs when there are no files to process; that is, point it at an empty directory and determine whether the leak still happens when nothing is being moved.
You don't need re.purge() or any fiddling with the gc module.

answered 2012-05-10T22:44:00.320