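I'm trying to parse a 2-column (color, number_of_occurrences) .tsv file with a header row into a dictionary. I want to skip the header row in the most generic way possible (my assumption is to do this by requiring the second column to be an int). Below is the best I've come up with, but it seems like there must be something better: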
filelist = []
color_dict = {}
with open('file1.tsv') as F:
    filelist = [line.strip('\n').split('\t') for line in F]
for item in filelist:
    try:  # attempt to add the value to an existing dictionary entry
        x = color_dict[item[0]]
        x += int(item[1])
        color_dict[item[0]] = x
    except:  # color not observed yet (KeyError) or non-convertible string (ValueError): create a new entry
        try:
            color_dict[item[0]] = int(item[1])
        except ValueError:  # item[1] can't be converted to int
            pass
It seems like there should be a better way to handle the nested try/except blocks.
Requested file excerpt:
color Observed
green 15
gold 20
green 35
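
For comparison, here is a minimal sketch of one direction that avoids the nested try/except. It assumes the file really is tab-separated and that any row whose second column is not an integer (such as the header) can be silently skipped. The names count_colors and sample are hypothetical, introduced only for illustration. A defaultdict(int) starts missing colors at 0, so only the int conversion needs guarding:

```python
import csv
import io
from collections import defaultdict


def count_colors(lines):
    """Sum the second column per color, skipping rows whose count is not an int."""
    totals = defaultdict(int)  # missing colors start at 0, so no KeyError handling
    for row in csv.reader(lines, delimiter='\t'):
        try:
            color, count = row[0], int(row[1])
        except (IndexError, ValueError):
            # Header row, blank/short line, or non-integer count: skip it
            continue
        totals[color] += count
    return dict(totals)


# Stand-in for the real file, using the excerpt above with tab separators (assumed)
sample = io.StringIO("color\tObserved\ngreen\t15\ngold\t20\ngreen\t35\n")
print(count_colors(sample))  # {'green': 50, 'gold': 20}
```

With the real file, the same function could be called as count_colors(open('file1.tsv', newline='')) inside a with block. Catching only IndexError and ValueError, rather than using a bare except, keeps unrelated errors visible.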