The standard library has a tool for each of these tasks.
To iterate over all CSV files in a directory, use the glob
module:
    import glob
    for csvfilename in glob.glob(r"C:\mydirectory\*.csv"):
        pass  # do something with csvfilename
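If the directory is not hard-coded, the pattern can be built from a variable. The following is a minimal sketch; the helper name `list_csv_files` is made up for illustration, and it assumes you only want files directly inside the directory (no subfolders).

```python
import glob
import os


def list_csv_files(directory):
    """Return the paths of all .csv files directly inside directory, sorted."""
    # glob.escape keeps characters such as "[" in the directory name
    # from being interpreted as wildcard syntax.
    pattern = os.path.join(glob.escape(directory), "*.csv")
    # glob.glob returns matches in arbitrary order, so sort for stable output.
    return sorted(glob.glob(pattern))
```

Sorting is optional, but it makes the order of rows in the combined file reproducible between runs.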
To parse a CSV file, use the csv
module:
    import csv
    # Python 3: open in text mode with newline="" (Python 2 used "rb" here)
    with open(csvfilename, newline="") as csvfile:
        reader = csv.reader(csvfile, delimiter=",")
        for row in reader:
            pass  # row is a list of all the entries in the current row
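To see what the reader yields without touching the disk, you can feed it an in-memory file. This is only an illustration with made-up data; it shows that each row comes back as a list of strings and that a quoted field containing a comma stays in one piece.

```python
import csv
import io

# Made-up sample data; the second field of the last line contains a comma.
sample = io.StringIO('id,name\n1,"Doe, Jane"\n')

rows = list(csv.reader(sample, delimiter=","))
print(rows)  # [['id', 'name'], ['1', 'Doe, Jane']]
```

This is the main reason to prefer the csv module over splitting lines on "," by hand.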
To parse dates and calculate the difference between them, use the datetime
module:
    from datetime import datetime
    startdate = datetime.strptime("1999-10-20", "%Y-%m-%d")
    enddate = datetime.strptime("2003-02-28", "%Y-%m-%d")
    delta = enddate - startdate  # a timedelta; delta.days is the number of whole days
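Note that the subtraction gives a `timedelta` object, not a plain number. A short sketch with the same two dates shows the difference between converting it to a string and reading its `days` attribute; which one you want in the output column is up to you.

```python
from datetime import datetime

startdate = datetime.strptime("1999-10-20", "%Y-%m-%d")
enddate = datetime.strptime("2003-02-28", "%Y-%m-%d")
delta = enddate - startdate

as_text = str(delta)  # "1227 days, 0:00:00"
as_days = delta.days  # 1227 (an int)
```

If the column should hold just the number, write `str(delta.days)` instead of `str(delta)`.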
To add a value to the beginning of a row:
    row[0:0] = [str(delta)]
To append the filename to the end of a row:
    row.append(csvfilename)
And to write a row to a new CSV file:
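    # Python 3: text mode with newline="" (Python 2 used "wb" here).
    # Write to a separate output file so the input file is not overwritten.
    with open("combined_files_csv", "w", newline="") as outfile:
        writer = csv.writer(outfile, delimiter=",")
        writer.writerow(row)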
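The three row operations above can be tried together on an in-memory buffer. This is a sketch with made-up values (the row contents and the file name `my,file.csv` are hypothetical); the comma in the file name is there on purpose to show that the writer quotes it for you.

```python
import csv
import io

row = ["a", "b"]               # made-up row
row[0:0] = ["1227"]            # prepend a value
row.append("my,file.csv")      # append a (hypothetical) file name

buffer = io.StringIO(newline="")
csv.writer(buffer, delimiter=",").writerow(row)
output = buffer.getvalue()     # '1227,a,b,"my,file.csv"\r\n'
```

The trailing `\r\n` is the writer's default line terminator, which is why the csv documentation asks for `newline=""` when opening real files.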
Putting it all together, you get:
    import glob
    import csv
    from datetime import datetime
    # Python 3: text mode with newline="" (Python 2 used "wb" / "rb" here)
    with open("combined_files_csv", "w", newline="") as outfile:
        writer = csv.writer(outfile, delimiter=",")
        for csvfilename in glob.glob(r"C:\mydirectory\*.csv"):
            with open(csvfilename, newline="") as infile:
                reader = csv.reader(infile, delimiter=",")
                for row in reader:
                    startdate = datetime.strptime(row[3], "%Y-%m-%d")
                    enddate = datetime.strptime(row[2], "%Y-%m-%d")
                    delta = enddate - startdate  # a timedelta; use delta.days for a plain number
                    row[0:0] = [str(delta)]
                    row.append(csvfilename)
                    writer.writerow(row)
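If you want something reusable, the same steps can be wrapped in a function. The version below is a rough sketch, not the only way to do it. The function name `combine_csv_files` and its parameters are made up for illustration, and it makes three assumptions that go beyond the script above: it writes the difference as a whole number of days, it silently skips rows whose date columns cannot be parsed (for example a header line), and it never reads the output file even if it sits in the input directory.

```python
import csv
import glob
import os
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"


def combine_csv_files(input_dir, output_path, start_col=3, end_col=2):
    """Merge every CSV in input_dir into output_path; return rows written."""
    written = 0
    pattern = os.path.join(glob.escape(input_dir), "*.csv")
    with open(output_path, "w", newline="") as outfile:
        writer = csv.writer(outfile, delimiter=",")
        for csvfilename in sorted(glob.glob(pattern)):
            # Never read the file we are currently writing to.
            if os.path.abspath(csvfilename) == os.path.abspath(output_path):
                continue
            with open(csvfilename, newline="") as infile:
                for row in csv.reader(infile, delimiter=","):
                    try:
                        startdate = datetime.strptime(row[start_col], DATE_FORMAT)
                        enddate = datetime.strptime(row[end_col], DATE_FORMAT)
                    except (ValueError, IndexError):
                        # Assumption: headers and malformed rows are skipped.
                        continue
                    row.insert(0, str((enddate - startdate).days))
                    row.append(csvfilename)
                    writer.writerow(row)
                    written += 1
    return written
```

You would call it as `combine_csv_files(r"C:\mydirectory", "combined_files_csv")`. If skipping bad rows is not what you want, remove the `try`/`except` and let the error surface.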