I have a program that recursively goes through 2 directories and puts filename:sha256hash pairs into 2 dicts, one per folder (files1 and files2).
What I want to do is compare the hashes and, if the hashes match but the key is different, put the key into a new list called "renamed". I have the logic in place to account for deleted files, new files, and files where the key is the same but the value (hash) is different (a modified file), but I can't for the life of me get my head around doing the opposite.
# Put filename:hash into 2 dictionaries from the folders to compare
for root, dirs, files in os.walk(folder_1):
    for file in files:
        files1[file] = get_hash(os.path.join(root, file))
for root, dirs, files in os.walk(folder_2):
    for file in files:
        files2[file] = get_hash(os.path.join(root, file))
# Set up the operations to do for the comparison
set_files2, set_files1 = set(files2.keys()), set(files1.keys())
intersect = set_files2.intersection(set_files1)
# Compare and add to list for display
created.extend(set_files2 - intersect)
deleted.extend(set_files1 - intersect)
modified.extend(set(k for k in intersect if files1[k] != files2[k]))
unchanged.extend(set(k for k in intersect if files1[k] == files2[k]))
The issue with this is twofold: (1) it doesn't account for renamed files, and (2) it puts renamed files into created, so once I have the renamed files I have to filter them out of created (something like created = created - renamed) to leave only the actual new files.
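For reference, here is a rough sketch of how the rename check could work: build a hash-to-name lookup for the names that exist only in folder 1, then look up each name that exists only in folder 2. The function name classify and the (old_name, new_name) tuple format are made up for illustration, and it assumes files1 and files2 are the filename-to-hash dicts from above. If several files share the same content, only one of them is paired as a rename and the others fall through to created or deleted.

```python
def classify(files1, files2):
    """Hypothetical helper: files1 and files2 map filename -> hash."""
    names1, names2 = set(files1), set(files2)
    common = names1 & names2
    only_old = names1 - names2   # candidates for deleted or renamed
    only_new = names2 - names1   # candidates for created or renamed

    # Reverse lookup (hash -> old name) for names missing from folder 2.
    # Assumption: if two old files share a hash, one of them wins here.
    old_by_hash = {files1[name]: name for name in sorted(only_old)}

    renamed = []  # list of (old_name, new_name) pairs
    created = []
    for name in sorted(only_new):
        # pop() so that each old file is matched to at most one new name
        old_name = old_by_hash.pop(files2[name], None)
        if old_name is not None:
            renamed.append((old_name, name))
        else:
            created.append(name)

    # Anything that was only in folder 1 and not paired up is deleted
    renamed_old = {old for old, _ in renamed}
    deleted = sorted(only_old - renamed_old)

    modified = sorted(k for k in common if files1[k] != files2[k])
    unchanged = sorted(k for k in common if files1[k] == files2[k])
    return created, deleted, modified, unchanged, renamed
```

Because renamed files are pulled out while walking the folder-2-only names, created never contains them and no separate subtraction step is needed.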
Any/all help is appreciated. I've come this far but for some reason my mind is on strike.