I have a Pandas DataFrame like this:
import os
import pandas as pd
sentences = ['First sentence. Second sentence', 'Third sentence. Fourth sentence']
df = pd.DataFrame(sentences, columns=['text_column'])
text_column
'First sentence. Second sentence'
'Third sentence. Fourth sentence'
Next, I write it to a jsonl (JSON Lines format) file like this (for doccano):
df.to_json(os.path.join(path, 'test.jsonl'), orient='records', lines=True, force_ascii=False)
The jsonl output for df:
{'text': 'First sentence. Second sentence'},
{'text': 'Third sentence. Fourth sentence'}
I want to add a line break between the sentences in each string. I tried something like this:
{'text': 'First sentence.' + "\\n \\n " + 'Second sentence'},
{'text': 'Third sentence.' + "\\n \\n " + 'Fourth sentence'}
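But it doesn't work. Maybe I can format it with Pandas instead. The goal is to put each sentence in the string on its own line, because at the moment the whole {'text': 'First sentence. Second sentence'} is displayed together on one page in Doccano.
Expected output: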
First sentence.
Second sentence.
and:
Third sentence.
Fourth sentence.
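
For reference, here is a minimal sketch of one way this might be done in Pandas before exporting. It rests on a few assumptions that are not confirmed above: the column is named `text` (to match the key shown in the jsonl output), sentences are always separated by a period followed by a single space, and a real newline character is what the string should contain. Whether Doccano then renders that newline as a visible line break is not verified here.

```python
import json

import pandas as pd

sentences = ['First sentence. Second sentence', 'Third sentence. Fourth sentence']
# Assumption: the column is called 'text' so the JSON key matches the output shown above.
df = pd.DataFrame(sentences, columns=['text'])

# Replace the ". " separator with a period plus a real newline character.
# "\n" (single backslash) is a newline; "\\n" would be a literal backslash followed by "n".
df['text'] = df['text'].str.replace('. ', '.\n', regex=False)

# With no path argument, to_json returns the JSON Lines content as a string.
jsonl = df.to_json(orient='records', lines=True, force_ascii=False)

# Parse each line back to check that the newline survives the round trip.
records = [json.loads(line) for line in jsonl.splitlines() if line]
print(records[0]['text'])
```

In the serialized text the newline appears as the JSON escape `\n`, and any JSON parser turns it back into a real line break when the file is read. Passing a file path such as `os.path.join(path, 'test.jsonl')` as the first argument to `to_json` would write the same content to disk.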