You're close. This is all you have to do:
import sys

with open(sys.argv[1]) as ifile, open(sys.argv[2], mode='w') as ofile:
    for row in ifile:
        # ...
        # Write the row only when some condition is met (you will have to replace this with your own).
        # E.g.: the number of non-empty entries in the row is greater than 5:
        if len([term for term in row.split('#') if term.strip() != '']) > 5:
            ofile.write(row)
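Update:
To answer the OP's question about splitting lines:
In Python you split a line by supplying a delimiter. Since this is a CSV file, you split on ','. Example:
If this is a line (a string):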
0,1,2,3,4,5
and you apply:
line.split(',')
you get a list:
['0', '1', '2', '3', '4', '5']
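One thing to watch for: rows read from a file end with a newline, and the sample data in this question ends each row with a trailing comma, so a bare split leaves a stray last term. A minimal sketch of what split returns for such a row, and how filtering out blank terms cleans it up (the row is the sample line from the question):

```python
# A sample row as it would be read from the file (note the trailing comma and newline).
row = "A,1,12884902522,B,B,4900,AAIR,0.1046,28800,390,B,AARCA,\n"

raw_terms = row.split(',')
# The last element is the leftover newline after the trailing comma.
print(raw_terms[-1] == '\n')   # True

# Drop terms that are empty once surrounding whitespace is stripped.
terms = [term for term in raw_terms if term.strip() != '']
print(len(terms))   # 12
print(terms[6])     # AAIR
```

Note that this filter also drops empty fields in the middle of a row, which would shift the later indices. That is harmless for the sample row, which has none.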
Update 2:
import sys

if __name__ == '__main__':
    ticker = sys.argv[3]
    # argv[4] is a string; you have to convert it to an int, then to a bool
    allTypes = bool(int(sys.argv[4]))
    with open(sys.argv[1]) as ifile, open(sys.argv[2], mode='w') as ofile:
        all_timestamps = []  # this is an empty list
        n_rows = 0
        for row in ifile:
            # This splits the line into its constituent terms, as described earlier.
            # SAMPLE LINE:
            # A,1,12884902522,B,B,4900,AAIR,0.1046,28800,390,B,AARCA,
            # After applying this bit of code, the line should be split into this:
            # ['A', '1', '12884902522', 'B', 'B', '4900', 'AAIR', '0.1046', '28800', '390', 'B', 'AARCA']
            # NOW you can make comparisons against those terms. :)
            terms = [term for term in row.split(',') if term.strip() != '']
            current_timestamp = int(terms[2])
            # Compare the current timestamp against the previous one,
            # starting from row 2 (index 1):
            if n_rows > 0:
                # A negative index counts from the end, so -1 means the value at the last index,
                # that is, the previous timestamp. Now perform the comparison and do something
                # if that criterion is met:
                if current_timestamp - all_timestamps[-1] >= 0:
                    pass  # the pass keyword means do nothing; replace it with whatever code you want
            # increment n_rows every time:
            n_rows += 1
            # always append the current timestamp to all the timestamps
            all_timestamps.append(current_timestamp)
            if terms[6] == ticker:
                # add something to make sure chronological order hasn't been broken
                if allTypes:
                    ofile.write(row)
                # I don't know if this was a bad indent or not, but you should know
                # where this goes
                elif terms[0] == "A" or terms[0] == "M" or terms[0] == "D":
                    print(row)
                    ofile.write(row)
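My original guess was correct. You weren't splitting the row into its CSV components, so your comparisons against the row didn't give the right results, and that's why you got no output. This should work now (with slight modifications to suit your goals). :)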
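If you want to try the logic without touching real files, here is a rough sketch of the same filtering as a generator that can be fed in-memory data. The function name filter_rows, its parameters, and the choice to raise an error when a timestamp goes backwards are my own assumptions, not something from your code. The column positions (2 for the timestamp, 6 for the ticker) and the message types A, M and D are taken from the sample line and the code above.

```python
import io

def filter_rows(lines, ticker, all_types, ticker_col=6, time_col=2):
    """Yield the rows for `ticker` (hypothetical helper, for illustration only).

    Raises ValueError if a timestamp is lower than the one before it;
    that reaction to broken chronological order is an assumption.
    """
    previous = None
    for row in lines:
        terms = [t.strip() for t in row.split(',') if t.strip() != '']
        if len(terms) <= max(ticker_col, time_col):
            continue  # skip blank or short rows
        current = int(terms[time_col])
        if previous is not None and current < previous:
            raise ValueError("timestamps out of order: %d after %d" % (current, previous))
        previous = current
        # Keep every message type, or only A/M/D when all_types is False.
        if terms[ticker_col] == ticker and (all_types or terms[0] in ("A", "M", "D")):
            yield row

# Made-up rows in the same layout as the sample line.
sample = io.StringIO(
    "A,1,100,B,B,4900,AAIR,0.1046,28800,390,B,AARCA,\n"
    "T,2,105,B,B,4900,AAIR,0.1046,28800,390,B,AARCA,\n"
    "M,3,110,B,B,4900,ZZZ,0.1046,28800,390,B,AARCA,\n"
)
kept = list(filter_rows(sample, "AAIR", all_types=False))
print(len(kept))  # 1
```

To use it with your script, you could pass ifile as lines and call ofile.write(row) for each row it yields.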