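I need to walk through a folder containing roughly ten thousand files. My old VBScript is slow at this. Since I have started using Ruby and Python in the meantime, I ran a benchmark across the three scripting languages to see which one is best suited to the job.
The results of the tests below, on a subset of 4500 files on a network share, are: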
Python: 106 seconds
Ruby: 5 seconds
Vbscript: 124 seconds
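It is no surprise that VBScript is the slowest, but I cannot explain the difference between Ruby and Python. Is my Python test not optimal? Is there a faster way to do this in Python?
The check for thumbs.db is only there for the benchmark; in reality there are more tests to do.
I needed something that checks every file on the path and does not produce so much output that it disturbs the timing. The results differ slightly from run to run, but not by much.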
#python2.7.0
import os

def recurse(path):
    for (path, dirs, files) in os.walk(path):
        for file in files:
            if file.lower() == "thumbs.db":
                print (path+'/'+file)

if __name__ == '__main__':
    import timeit
    path = '//server/share/folder/'
    print(timeit.timeit('recurse("'+path+'")', setup="from __main__ import recurse", number=1))
'vbscript5.7
set oFso = CreateObject("Scripting.FileSystemObject")
const path = "\\server\share\folder"
start = Timer
myLCfilename="thumbs.db"

sub recurse(folder)
  for each file in folder.Files
    if lCase(file.name) = myLCfilename then
      wscript.echo file
    end if
  next
  for each subfolder in folder.SubFolders
    call Recurse(subfolder)
  next
end Sub

set folder = oFso.getFolder(path)
recurse(folder)
wscript.echo Timer-start
#ruby1.9.3
require 'benchmark'

def recursive(path, bench)
  bench.report(path) do
    Dir["#{path}/**/**"].each{|file| puts file if File.basename(file).downcase == "thumbs.db"}
  end
end

path = '//server/share/folder/'
Benchmark.bm {|bench| recursive(path, bench)}
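EDIT: Because I suspected that printing caused the delay, I tested the scripts both printing all 4500 files and printing nothing. The difference remains: R:5 P:107 in the first case and R:4.5 P:107 in the latter.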
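One way to take printing out of the measurement entirely, as a rough sketch only: collect the matches in a list inside the timed call and print them afterwards. The helper names `find_matches` and `timed` are made up for illustration, and the share path is the same placeholder as in the scripts above.

```python
import os
import time

def find_matches(root, target="thumbs.db"):
    """Return paths under root whose file name equals target, ignoring case."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower() == target:
                matches.append(os.path.join(dirpath, name))
    return matches

def timed(func, *args):
    """Run func(*args) once and return (result, elapsed seconds)."""
    start = time.time()
    result = func(*args)
    return result, time.time() - start

if __name__ == '__main__':
    matches, seconds = timed(find_matches, '//server/share/folder/')
    print(seconds)          # traversal time only
    for match in matches:   # output happens outside the timed region
        print(match)
```

If the timing stays the same with this variant, the time is being spent in the traversal itself rather than in the output.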
EDIT2: Based on the answers and comments here, a Python version that in some cases can run faster by skipping folders:
import os

def recurse(path):
    for (path, dirs, files) in os.walk(path):
        for file in files:
            if file.lower() == "thumbs.db":
                print (path+'/'+file)

def recurse2(path):
    for (path, dirs, files) in os.walk(path):
        # Prune in place so os.walk does not descend into the skipped folders.
        # ('comics',) is a tuple; ('comics') is just a string and would match substrings.
        # Removing items from dirs while looping over it can skip entries, so rebuild it instead.
        dirs[:] = [d for d in dirs if d not in ('comics',)]
        for file in files:
            if file.lower() == "thumbs.db":
                print (path+'/'+file)

if __name__ == '__main__':
    import timeit
    path = 'f:/'
    print(timeit.timeit('recurse("'+path+'")', setup="from __main__ import recurse", number=1))
    #6.20102692
    print(timeit.timeit('recurse2("'+path+'")', setup="from __main__ import recurse2", number=1))
    #2.73848228
#ruby 5.7
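The gain in recurse2 comes from how os.walk works in its default top-down mode: names removed in place from the directory list it hands back are never visited. A small sketch of that idea on its own, where the helper name `walk_skipping` and the default skip list are assumptions for illustration:

```python
import os

def walk_skipping(root, skip=('comics',)):
    """Yield file paths under root, never descending into folders named in skip."""
    for dirpath, dirnames, filenames in os.walk(root):  # topdown=True is the default
        # Slice assignment edits the very list os.walk keeps using,
        # so the skipped folders are not traversed at all.
        dirnames[:] = [d for d in dirnames if d not in skip]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Rebinding the name (`dirnames = [...]`) would not prune anything, because os.walk would still hold the original list; the slice assignment is what makes the skip take effect.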