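I am reading data from multiple CSV files. The example below uses only one file. The CSV contains two columns of data, "EFT" and "CTR". After reading the data, I try to compute an integrated value from the CTR values and store the results in a new column called ITCER. With roughly 10,000 data points, the whole process takes about 15 seconds. Ideally I will be analyzing 25,000 data points from 8 different files, so it really does take a while. Does anyone have suggestions for changing the code that computes ITCER (the integration part) so that it runs faster?
Code: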
import pandas as pd

#### User input: run numbers, fed into the program as part of the file name
files = [20135501]

#### Process data path
shorts = ['Python/CSV/Run Files/' + str(i) + '.csv' for i in files]

#### Read the process data
ferms = [pd.read_csv(s) for s in shorts]
# Fill gaps in every column by interpolation
csvDFs = [ferm.apply(pd.Series.interpolate) for ferm in ferms]

for i in range(len(files)):
    # Calculations: running integral of CTR, row by row (this is the slow part)
    b = []  # reset per file so values from earlier files do not carry over
    for j in range(len(ferms[i])):
        if j == 0:
            # .iloc replaces the removed .irow accessor
            c = ferms[i].iloc[j]['CTR [mM/h]']
        if j > 0:
            g = ferms[i].iloc[j]['CTR [mM/h]']
            h = ferms[i].iloc[j-1]['CTR [mM/h]']
            c = c + (g+h)/120000
        b.append(c)
    # pd.Series replaces DataFrame.append, which was removed in pandas 2.0
    itcer = pd.Series(b, index=csvDFs[i].index)
    eft = csvDFs[i]['EFT (h)']
    # Insert calculations
    csvDFs[i]['EFT'] = eft
    csvDFs[i]['ITCER'] = itcer
    # Keep specific columns and then save the file in the output folder
    csvDFs[i] = csvDFs[i][['EFT','ITCER']]
    csvDFs[i].to_csv('Python/CSV/Test/' + str(files[i]) + '.csv', index=False)
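
One possible direction, offered as a sketch rather than a verified fix: the inner loop looks like a running sum of `(current + previous) / 120000`, seeded with the first CTR value, so it could probably be expressed with `shift` and `cumsum` instead of per-row lookups. The helper name `cumulative_itcer` and the `scale` parameter below are made up for illustration, and the sketch assumes the CTR column is non-empty and has no missing values.

```python
import pandas as pd


def cumulative_itcer(ctr: pd.Series, scale: float = 120000) -> pd.Series:
    """Hypothetical vectorized stand-in for the row-by-row ITCER loop.

    Assumes `ctr` is non-empty and contains no NaN values.
    """
    # (g + h) for every row: current value plus the previous row's value
    pair_sums = ctr + ctr.shift(1)
    increments = pair_sums / scale
    # Row 0 has no previous row, so it contributes no increment
    increments.iloc[0] = 0.0
    # The loop seeds the running total with the first CTR value, not zero
    return ctr.iloc[0] + increments.cumsum()


# Small made-up example; the column name mirrors the one used in the question
demo = pd.DataFrame({'CTR [mM/h]': [10.0, 12.0, 11.0, 13.0]})
demo['ITCER'] = cumulative_itcer(demo['CTR [mM/h]'])
print(demo)
```

If this matches what the loop is meant to do, the whole `for j in ...` block could be replaced by a single call on `ferms[i]['CTR [mM/h]']`. The results may differ from the loop in the last few floating-point digits because the additions happen in a different order.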