I have a set of Excel worksheets, each laid out as follows:
ID | imageName
--------------
1 abc.jpg
2 def.bmp
3 abc.jpg
4 xyz123.jpg
Each worksheet corresponds to a folder whose contents look like this:
abc.pdf
ghijkl.pdf
def.pdf
def.xls
x-abc.pdf
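I am trying to generate a report that pairs the lowest ID for each distinct imageName with every PDF that matches it, and that also identifies imageName values in the worksheet with no matching PDF and PDFs in the folder with no matching imageName. A file name with an "x-" prefix is equivalent to the same name without the prefix, so the report for this data set would be: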
ID  imageName   filename
------------------------
1   abc.jpg     abc.pdf
1   abc.jpg     x-abc.pdf
2   def.bmp     def.pdf
4   xyz123.jpg
                ghijkl.pdf
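For reference, the matching rule described above can be sketched as a single normalizing function. This is only an illustration: `BaseKey` is a hypothetical name, and treating names as case-insensitive and applying the "x-" rule to both image names and file names are assumptions, not something the data above confirms.

```vba
' Hypothetical helper (not part of the original code): reduces a file or
' image name to a comparison key by lowercasing it, dropping the extension,
' and dropping a leading "x-" prefix.
' Assumption: names are compared case-insensitively.
Function BaseKey(ByVal fileName As String) As String
    Dim dotPos As Long
    Dim result As String

    result = LCase$(fileName)

    ' Strip the extension, if there is one after at least one character.
    dotPos = InStrRev(result, ".")
    If dotPos > 1 Then result = Left$(result, dotPos - 1)

    ' Treat "x-abc" as equivalent to "abc".
    If Left$(result, 2) = "x-" Then result = Mid$(result, 3)

    BaseKey = result
End Function
```

With this rule, abc.jpg, abc.pdf and x-abc.pdf all reduce to the same key, while xyz123.jpg and ghijkl.pdf reduce to keys that nothing else shares.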
My current solution is as follows:
'sheetObj is the imageName set, folderName is the path to the file folder
'requires a reference to Microsoft Scripting Runtime (FileSystemObject, Dictionary)
Sub makeReport(sheetObj As Worksheet, folderName As String)
    Dim fso As New FileSystemObject
    Dim imageDict As Dictionary
    Dim fileArray As Variant
    Dim ctr As Long
    Dim file As Object
    Dim key As Variant
    Dim fileFound As Boolean

    'initializes fileArray for storing filename/imageName pairs
    '(row 0 holds the file path, row 1 holds the matched imageName)
    ReDim fileArray(1, 0)

    'returns a Dictionary where key is imageName and value is lowest ID for that imageName
    Set imageDict = lowestDict(sheetObj)

    'checks all files in folder and populates fileArray with their imageName matches
    For Each file In fso.GetFolder(folderName).Files
        fileFound = False
        'gets extension and checks if it's ".pdf"
        If isPDF(file.Name) Then
            For Each key In imageDict.Keys
                'checks to see if base names are equal, accounting for "x-" prefix
                If equalNames(file.Name, key) Then
                    'adds a record to fileArray mapping filename to imageName
                    addToFileArray fileArray, file.Path, key
                    fileFound = True
                End If
            Next
            'checks to see if filename did not match any dictionary entries
            If fileFound = False Then
                addToFileArray fileArray, file.Path, ""
            End If
        End If
    Next

    'outputs report of imageDict entries and their matches (if any)
    For Each key In imageDict.Keys
        fileFound = False
        'checks for all fileArray matches to this imageName
        For ctr = 0 To UBound(fileArray, 2)
            'compares against row 1 (imageName); row 0 is the file path
            If fileArray(1, ctr) = key Then
                fileFound = True
                'writes the data for this match to the worksheet
                outputToExcel sheetObj, key, imageDict(key), fileArray(0, ctr)
            End If
        Next
        'checks to see if no fileArray match was found
        If fileFound = False Then
            outputToExcel sheetObj, key, imageDict(key), ""
        End If
    Next

    'outputs unmatched fileArray entries
    For ctr = 0 To UBound(fileArray, 2)
        If fileArray(1, ctr) = "" Then
            outputToExcel sheetObj, "", "", fileArray(0, ctr)
        End If
    Next
End Sub
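The helper `lowestDict` is not shown above. As a rough sketch of what it does, the function below builds the same kind of dictionary from a two-column array rather than from the worksheet itself. The name `LowestIdByName`, the array input, and the late-bound `Scripting.Dictionary` are all assumptions made so the example stands alone; the real helper takes the worksheet directly.

```vba
' Hypothetical stand-in for lowestDict. data is a 2-D array such as the one
' returned by Range.Value, with ID in its first column and imageName in its
' second column (header row excluded).
Function LowestIdByName(ByVal data As Variant) As Object
    Dim dict As Object
    Dim r As Long
    Dim idCol As Long
    Dim nameCol As Long
    Dim imageName As String

    ' Late binding, so no reference to Microsoft Scripting Runtime is needed.
    Set dict = CreateObject("Scripting.Dictionary")
    idCol = LBound(data, 2)
    nameCol = idCol + 1

    For r = LBound(data, 1) To UBound(data, 1)
        imageName = CStr(data(r, nameCol))
        If Not dict.Exists(imageName) Then
            ' First time this imageName is seen.
            dict.Add imageName, data(r, idCol)
        ElseIf data(r, idCol) < dict.Item(imageName) Then
            ' A lower ID for an imageName already recorded.
            dict.Item(imageName) = data(r, idCol)
        End If
    Next r

    Set LowestIdByName = dict
End Function
```

This step is a single pass over the rows, so it is not where the time goes; the cost is in the nested loops that follow it.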
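This routine produces the report correctly, but it is slow. Because of the nested For loops, every file is compared against every imageName entry, so the processing time grows with the product of the number of imageName entries and the number of files.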
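Is there a better way to check for matches between these sets? Turning fileArray into a Dictionary might be faster, but a Dictionary cannot have duplicate keys, and this data structure needs duplicate entries in both of its fields, because one file name may match several image names and vice versa.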
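To make the duplicate-key concern concrete, here is a sketch of one shape such a dictionary could take: each key maps to a Collection, so several entries can share a key. The names `AddToGroup` and `DemoGroups` are hypothetical, and using a comparison key (for example the base name without its extension and "x-" prefix) is an assumption about how the names would be grouped.

```vba
' Hypothetical helper: appends item to the Collection stored under key,
' creating that Collection the first time the key is seen.
Sub AddToGroup(ByVal groups As Object, ByVal key As String, ByVal item As String)
    If Not groups.Exists(key) Then groups.Add key, New Collection
    groups.Item(key).Add item
End Sub

' Small demonstration using the file names from the example folder.
Sub DemoGroups()
    Dim filesByKey As Object
    Dim path As Variant

    Set filesByKey = CreateObject("Scripting.Dictionary")
    AddToGroup filesByKey, "abc", "abc.pdf"
    AddToGroup filesByKey, "abc", "x-abc.pdf"
    AddToGroup filesByKey, "def", "def.pdf"

    ' Both PDFs that share the key "abc" are kept under that one key.
    For Each path In filesByKey.Item("abc")
        Debug.Print path
    Next path
End Sub
```

If something like this works, each file would be keyed once and each imageName would need only one `Exists` lookup instead of a scan over every file, though I have not measured whether that is actually faster on my data.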