
I have read many questions on Stack Overflow about comparing directories in Python. However, my question is a bit different.
I have two directories that contain the contents of two different release versions of a package. I want to compare them to make sure the contents are the same. However, a few files have the version name embedded in their names. What is the best way to compare the two directories and conclude that, apart from the version difference, all files match?


For example:
Version V1R1C1 contains the directory structure below:

pmt> find . -name "*"
.
./c1
./c1/c2
./c1/c1_V1R1C1.cfg
./a1
./a1/a1_V1R1C1.cfg
./a1/a2
./a1/a2/a1a2_V1R1C1.cfg
./b1/a_best_file.txt
./b1/b2/a_test_file.txt
./b1/b2/b1b2_V1R1C1.cfg
./a_V1R1C1.cfg

Version V2R3C1 may contain the structure below:

pmt> find . -name "*"
.
./c1
./c1/c2
./c1/c1_V2R3C1.cfg
./a1
./a1/a1_V2R3C1.cfg
./a1/a2
./a1/a2/a1a2_V2R3C1.cfg
./b1/a_best_file.txt
./b1/b2/a_test_file.txt
./b1/b2/b1b2_V2R3C1.cfg
./a_V2R3C1.cfg

In the above case, the program must flag the two structures as equivalent.

I can think of a few solutions. For example: read both directory structures recursively into a cache (dict), strip the version information, and compare. But that does not look like a very efficient mechanism, for two reasons: 1. It does not use the built-in directory comparison. 2. The repeated read/strip/compare passes are bound to be costly (especially with a huge directory tree).

I am looking for ideas that are simpler and more efficient than the one above.


PS:
1. If there is any difference other than the version (unlike the example above), I would like to use left/right etc. to get the list of differences.
2. We can assume the version name of each directory is known beforehand (V1R1C1 in the first case and V2R3C1 in the second).


2 Answers


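The glob module has an iterator function (as opposed to one that builds a list). You could probably use it in a lightweight for loop to iterate over each file entry, then kick the differences out to a separate list/dict.

That way you are not generating a huge list of file names and then picking through it.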

http://docs.python.org/py3k/library/glob.html#module-glob
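
A minimal sketch of the iterator approach, assuming Python 3.5+ (for `recursive=True` in `glob.iglob`). The function name `missing_in_other` and its parameters are hypothetical, chosen only for illustration. It walks one tree lazily, swaps the known version string in each relative path for the other tree's version, and collects any entry that has no counterpart. It checks names only, not file contents, and by default `**` skips entries whose names start with a dot.

```python
import glob
import os


def missing_in_other(root, version, other_root, other_version):
    """Return relative paths under root that have no counterpart under
    other_root once `version` is replaced by `other_version`.

    Hypothetical helper: compares names only, never file contents.
    """
    missing = []
    # glob.escape keeps special characters in the root from acting as wildcards.
    pattern = os.path.join(glob.escape(root), "**")
    # iglob yields one path at a time instead of building the full list.
    for path in glob.iglob(pattern, recursive=True):
        rel = os.path.relpath(path, root)
        if rel == ".":
            continue  # the pattern also matches the root directory itself
        expected = os.path.join(other_root, rel.replace(version, other_version))
        if not os.path.lexists(expected):
            missing.append(rel)
    return sorted(missing)
```

Calling it once in each direction, for example `missing_in_other("v1", "V1R1C1", "v2", "V2R3C1")` and then with the arguments swapped, gives a left-only list and a right-only list; the two trees are equivalent in the question's sense when both lists are empty.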

answered 2012-10-23T18:06:33.193

How about using a set comparison?

set(remove_version(filepath) for filepath in iter_file(dic1)) == set(remove_version(filepath) for filepath in iter_file(dic2))
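
The one-liner leaves `remove_version` and `iter_file` undefined. One possible way to fill them in is sketched below; the bodies are assumptions, not the answerer's code. In particular, `remove_version` here takes the version string as a second argument (the question says it is known beforehand), and `compare_trees` is a hypothetical wrapper that also returns the left-only and right-only names. Only names are compared, not file contents.

```python
import os


def iter_file(root):
    """Yield every file and directory under root as a path relative to root."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            yield os.path.relpath(os.path.join(dirpath, name), root)


def remove_version(filepath, version):
    """Drop the known version string from a relative path (assumed helper)."""
    return filepath.replace(version, "")


def compare_trees(dir1, version1, dir2, version2):
    """Return (equal, only_in_dir1, only_in_dir2) after stripping versions."""
    names1 = {remove_version(p, version1) for p in iter_file(dir1)}
    names2 = {remove_version(p, version2) for p in iter_file(dir2)}
    return names1 == names2, sorted(names1 - names2), sorted(names2 - names1)
```

The set differences give the diffed lists asked for in the PS: anything in the second element exists only in the first tree, anything in the third only in the second.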
answered 2012-10-23T18:08:42.693