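I'm fairly new to programming, and I'm trying to write a Python program that compares two .csv files on specific columns and checks for additions, deletions, and modifications. Both .csv files are in the following format, contain the same number of columns, and use BillingNumber as the key: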
BillingNumber,CustomerName,IsActive,IsCreditHold,IsPayScan,City,State
"2","CHARLIE RYAN","Yes","No","Yes","Reading","PA"
"3","INSURANCE BILLS","","","","",""
"4","AAA","","","","",""
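I only need to compare columns 0, 1, 2, and 4. I've tried many different approaches, but I haven't had any luck. I know I can load the files into dictionaries using csv.DictReader or csv.reader, but after that I get stuck. Once they are loaded into memory, I'm not sure where to go from there or how to begin.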
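To make the loading step concrete, here is a minimal sketch of how I imagine it could look with csv.DictReader, keeping only the four columns I care about and keying each row by BillingNumber. The names load_rows and COMPARE_FIELDS are placeholders I made up, and I'm assuming the column names match the header shown above.

```python
import csv

# Columns 0, 1, 2 and 4 of the header shown above (assumed names).
COMPARE_FIELDS = ("BillingNumber", "CustomerName", "IsActive", "IsPayScan")


def load_rows(path, key="BillingNumber", fields=COMPARE_FIELDS):
    """Return {key value: tuple of the compared field values} for one CSV file.

    load_rows is a hypothetical helper, not part of the csv module.
    """
    rows = {}
    with open(path, newline="", encoding="utf-8") as handle:
        for record in csv.DictReader(handle):
            # Keep only the compared columns; other columns are ignored.
            rows[record[key]] = tuple(record[name] for name in fields)
    return rows
```

With something like this, load_rows(r'Old/file1.csv') and load_rows(r'New/file2.csv') would each give me a dictionary I can look up by BillingNumber, but that is as far as I get.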
I've tried this before:
import time

# Load every line of the old file into a set for fast membership checks.
with open(r'Old/file1.csv', 'r') as file_old:
    old_lines = set(line.strip() for line in file_old)
file_new = open(r'New/file2.csv', 'r')

choice = int(input('\nPlease choose your result format.\nEnter 1 for .txt, 2 for .csv or 3 for .json\n'))

# Cosmetic progress dots.
time.sleep(1)
print(".")
time.sleep(1)
print("..")
time.sleep(1)
print("...")
time.sleep(1)
print("....")
time.sleep(1)
print('Done! Check "Different" folder for results.\n')

# Pick the output file based on the chosen format.
if choice == 1:
    file_diff = open(r'Different/diff.txt', 'w')
elif choice == 2:
    file_diff = open(r'Different/diff.csv', 'w')
elif choice == 3:
    file_diff = open(r'Different/diff.json', 'w')
else:
    print("You MUST enter 1, 2 or 3")
    exit()

# Report every line of the new file that has no exact match in the old file.
for line in file_new:
    if line.strip() not in old_lines:
        file_diff.write("** ERROR! Entry " + line + "** Does not match previous file\n\n")

file_new.close()
file_diff.close()
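It doesn't work properly: if there is an extra line, or a line is missing, it logs everything after that line as different. It also compares entire lines, which is not what I want to do. This was basically just a starting point, and although it sort of works, it isn't specific enough for my needs. I'm really just looking for a good place to start. Thanks!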
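To show the kind of result I'm after, here is a rough sketch of a comparison keyed by BillingNumber instead of by line position. It assumes each file has already been turned into a dictionary mapping BillingNumber to a tuple of the compared columns (0, 1, 2 and 4). The function name diff_rows is hypothetical, and the "new" rows below are made up purely for illustration.

```python
def diff_rows(old, new):
    """Classify keys as added, deleted, or modified between two dicts.

    diff_rows is a hypothetical helper; both arguments map a key
    (BillingNumber) to a tuple of the values being compared.
    """
    # Keys present only in the new file.
    added = {key: new[key] for key in new.keys() - old.keys()}
    # Keys present only in the old file.
    deleted = {key: old[key] for key in old.keys() - new.keys()}
    # Keys in both files whose compared values differ.
    modified = {
        key: (old[key], new[key])
        for key in old.keys() & new.keys()
        if old[key] != new[key]
    }
    return added, deleted, modified


# Old rows follow the sample data above; new rows are invented examples.
old_rows = {
    "2": ("2", "CHARLIE RYAN", "Yes", "Yes"),
    "3": ("3", "INSURANCE BILLS", "", ""),
    "4": ("4", "AAA", "", ""),
}
new_rows = {
    "2": ("2", "CHARLIE RYAN", "No", "Yes"),  # IsActive changed
    "4": ("4", "AAA", "", ""),                # unchanged
    "5": ("5", "EXAMPLE CO", "", ""),         # new key
}

added, deleted, modified = diff_rows(old_rows, new_rows)
print("added:", sorted(added))
print("deleted:", sorted(deleted))
print("modified:", sorted(modified))
```

Because the lookup is by key, an extra or missing row should only affect that one BillingNumber rather than everything after it. I'm not sure whether this is the right approach, which is why I'm asking.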