I have StockTwits data and I am trying to convert the text files by tokenizing them. Each line of the input is JSON, which I read with json.loads(line).
import json
with open(self.trainables_path + file_to_write, "w") as fp:
    for doc in enumerate(files):
        with open(self.tweets_path + doc[1]) as f:
            for line in f:
                entry = json.loads(line)
                user = list(entry.keys())[0]
                tweet = entry[user]
                fp.write(user + "_" + str(tweet['id']) + " ")
                for token in tweet['tokens']:
                    if tweet['tokens'].index(token) != len(tweet['tokens']) - 1:
                        fp.write(token + " ")
                    else:
                        fp.write(token)
                fp.write("\n")
This is the error:
Traceback (most recent call last):
  File "H:/dissertation/rec-sys-master/embeddings/parser.py", line 201, in <module>
    pd.convert_to_trainable()
  File "H:/dissertation/rec-sys-master/embeddings/parser.py", line 169, in convert_to_trainable
    fp.write(user + "_" + str(tweet['id']) + " ")
TypeError: string indices must be integers
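The data needs to be written to the output file.
Edit: the other functions in the code were not being executed, which is why the data was not in the expected format!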
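
For anyone hitting the same TypeError: it is raised when tweet is a plain string rather than a dict, so tweet['id'] indexes a str with a str. Below is a minimal sketch of a guard that checks the shape of each entry before writing. The helper name format_entry and the two sample lines are hypothetical, made up for illustration; the real StockTwits records may look different.

```python
import json


def format_entry(line):
    """Return 'user_id tok1 tok2 ...' for one JSON line, or None if malformed.

    Hypothetical helper: assumes each line is {user: {"id": ..., "tokens": [...]}}.
    """
    entry = json.loads(line)
    if not isinstance(entry, dict) or not entry:
        return None
    user = next(iter(entry))  # first (and presumably only) key
    tweet = entry[user]
    # If the earlier preprocessing step did not run, the value may still be
    # a raw string, which is what triggers "string indices must be integers".
    if not isinstance(tweet, dict) or "id" not in tweet or "tokens" not in tweet:
        return None
    # str.join avoids the index() lookup used to detect the last token.
    return f"{user}_{tweet['id']} " + " ".join(tweet["tokens"])


# Made-up sample lines, not real data.
good = '{"alice": {"id": 42, "tokens": ["buy", "$AAPL"]}}'
bad = '{"alice": "buy $AAPL"}'

print(format_entry(good))
print(format_entry(bad))
```

Returning None for malformed lines lets the caller skip or log them instead of crashing partway through the output file. Joining the tokens in one call also sidesteps a subtle issue in the loop above, where index(token) finds the first occurrence, so a token that repeats the final token is written without a trailing space.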