python - Splitting tab-separated lines into a dictionary in Python

Question

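I am trying to write a Python program that builds a dictionary from a file containing a list of words, each with a score and a standard deviation. My program looks like this: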

theFile = open('word-happiness.csv' , 'r')

theFile.close()

def make_happiness_table(filename):
   '''make_happiness_table: string -> dict
      creates a dictionary of happiness scores from the given file'''
   with open(filename) as f:
      d = dict( line.split('    ')  for line in f)
   return d

make_happiness_table("word-happiness.csv")

table = make_happiness_table("word-happiness.csv")
(score, stddev) = table['hunger']
print("the score for 'hunger' is %f" % score)

My .csv file has the format

word{TAB}score{TAB}standard_deviation

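I am trying to build the dictionary this way. How can I create the dictionary so that I can look up a word such as 'hunger' in the function's result and get its score and standard deviation?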

score 1 · Accepted Answer

def make_happiness_table(filename):
   with open(filename) as f:
      d = dict()
      for line in f:
         word,score,std = line.split() #splits on any consecutive runs of whitespace
         d[word]=score,std # May want to make floats:  `d[word] = float(score),float(std)`
   return d

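Note that if your word can contain a tab character, but you are guaranteed to have exactly 3 fields (word, score, std), you can split the string from the right (str.rsplit) and split only twice, which leaves 3 fields with the word intact. For example: word,score,std = line.rsplit(None,2).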
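As a minimal sketch of the right-hand split described above: the helper name split_record and the sample row are made up for illustration and are not part of the original answer.

```python
def split_record(line):
    """Split one line into (word, score, std), keeping whitespace inside the word."""
    # rsplit(None, 2) splits on runs of whitespace, starting from the right,
    # at most twice, so everything left of the last two fields stays in `word`.
    word, score, std = line.rsplit(None, 2)
    return word, float(score), float(std)

# Hypothetical row whose word itself contains a tab
record = split_record("ice\tcream\t7.5\t1.2\n")
print(record)
```

Because the split starts from the right, only the last two fields need to be free of whitespace.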

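As mentioned in the comments above, you can also use the csv module to read these kinds of files. csv is really great if your fields can be "quoted". For example: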

"this is field 0" "this is field 1" "this is field 2"

If you don't have that situation, then I find that str.split works just fine.
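To illustrate the quoted-field case, here is a small sketch using csv.reader with a tab delimiter. The sample rows are invented, and io.StringIO stands in for an opened file so the snippet is self-contained.

```python
import csv
import io

# Made-up sample: the first word is quoted and contains a tab
sample = io.StringIO('"ice\tcream"\t7.5\t1.2\nhunger\t3.0\t1.5\n')

# The reader strips the quotes and keeps the embedded tab inside the field
rows = list(csv.reader(sample, delimiter="\t"))
for row in rows:
    print(row)
```

A plain line.split('\t') would break the first row into four pieces, while the csv reader treats the quoted text as a single field.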

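Also, unrelated, but your code calls make_happiness_table twice (the first time you don't assign the return value to anything). The first call is useless (all it does is read the file and build a dictionary that you can never use). Finally, opening and closing theFile at the start of the script is also wasted work, since you don't do anything with the file there.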

score 1 · Accepted Answer

If you are sure your words contain no whitespace, you can split each line like this:

word, score, stddev = line.split()

However, if a word can contain spaces, split on the tab character \t instead, e.g.

word, score, stddev = line.split('\t')
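The difference between the two calls can be seen on a single made-up row whose word contains a space (the sample data is an assumption, not taken from the asker's file):

```python
line = "ice cream\t7.5\t1.2\n"  # hypothetical row; the word contains a space

# split() with no argument breaks on every whitespace run, so the word is cut in two
fields_any_ws = line.split()

# Splitting on '\t' keeps the space inside the word; strip the newline first
fields_tab = line.rstrip("\n").split("\t")

word, score, stddev = fields_tab
print(fields_any_ws)
print(fields_tab)
```

Unpacking fields_any_ws into three names would raise ValueError here, since it holds four items.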

But for the fully general case, where the word itself may contain tabs, use the csv module

import csv

with open(filename, newline='') as f:  # csv.reader needs an open file, not a filename
    reader = csv.reader(f, dialect='excel-tab')
    for word, score, stddev in reader:
        ...

Then you can build a dictionary mapping each word to its score and stddev, e.g.

word_dict[word] = (score, stddev)
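Putting the pieces together, one possible version of the whole function might look like the sketch below. It assumes every row has exactly three tab-separated fields and no header row, it converts the numbers to float, and it takes any iterable of lines (an opened file works too) rather than a filename. The sample data is invented and stands in for word-happiness.csv.

```python
import csv
import io

def make_happiness_table(lines):
    """Build {word: (score, stddev)} from tab-separated lines."""
    table = {}
    for word, score, stddev in csv.reader(lines, dialect="excel-tab"):
        # Store numbers as floats so that '%f' formatting works later
        table[word] = (float(score), float(stddev))
    return table

# Made-up rows standing in for the real file
sample = io.StringIO("hunger\t3.04\t1.45\nlaughter\t8.50\t0.93\n")
table = make_happiness_table(sample)

score, stddev = table["hunger"]
print("the score for 'hunger' is %f" % score)
```

With a real file, the call would be wrapped in with open("word-happiness.csv", newline="") as f: and f passed in place of sample.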
