python - Pandas/jsonl: How to add a newline inside a string that contains multiple sentences?

Question

I have a Pandas DataFrame like this:

import os
import pandas as pd

sentences = ['First sentence. Second sentence', 'Third sentence. Fourth sentence']
df = pd.DataFrame(sentences, columns=['text_column'])

text_column
'First sentence. Second sentence'
'Third sentence. Fourth sentence'

Next I write it to a jsonl (JSON Lines format) file like this (for doccano):

df.to_json(os.path.join(path, 'test.jsonl'), orient='records', lines=True, force_ascii=False)

jsonl output of df:

{'text': 'First sentence. Second sentence'},
{'text': 'Third sentence. Fourth sentence'}

I want to add a newline between the sentences in each string. I tried something like this:

{'text': 'First sentence.' + "\\n \\n " +  'Second sentence'},
{'text': 'Third sentence.' + "\\n \\n " +  'Fourth sentence'}

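But it doesn't work. Maybe I can format it with Pandas. The goal is to put each sentence of the string on its own line, because at the moment {'text': 'First sentence. Second sentence'} is displayed as-is on one page in Doccano.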

Expected output:

First sentence.
Second sentence.

and:

Third sentence.
Fourth sentence.

score 0 · Accepted Answer

Try this:

sen = [{'text': 'First sentence. Second sentence'}, {'text': 'Third sentence. Fourth sentence'}]

new_sen = []
for s in sen:
    dct = {}  # create once per record so every key is kept
    for k, v in s.items():
        parts = v.split('.')  # split at the period (the period itself is dropped)
        dct[k] = parts[0] + "\n \n" + parts[1]
    new_sen.append(dct)

print(new_sen)

Output:

[{'text': 'First sentence\n \n Second sentence'}, {'text': 'Third sentence\n \n Fourth sentence'}]

To get the expected output, try this:

print(new_sen[0]['text'])
# First sentence
# (a line containing only a space)
#  Second sentence
print(new_sen[1]['text'])
# Third sentence
# (a line containing only a space)
#  Fourth sentence
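
Since the question asks whether Pandas can do the formatting, here is a minimal sketch of one way it might be done directly on the DataFrame column. It is not part of the original answer. It assumes that sentences are separated by a period followed by whitespace, and it keeps the period. The names jsonl_text and records are illustrative only. Note that "\n" (one backslash) is a real newline, while "\\n" as tried in the question is a literal backslash followed by the letter n.

```python
import json

import pandas as pd

sentences = ['First sentence. Second sentence', 'Third sentence. Fourth sentence']
df = pd.DataFrame(sentences, columns=['text_column'])

# Replace the whitespace after each period with a real newline character.
# Assumption: a period followed by whitespace always marks a sentence boundary.
df['text_column'] = df['text_column'].str.replace(r'\.\s+', '.\n', regex=True)

# With no path given, to_json returns the JSON Lines text as a string.
jsonl_text = df.to_json(orient='records', lines=True, force_ascii=False)

# Each record still occupies one physical line; the newline inside the
# string is stored as the JSON escape sequence \n and restored on parsing.
records = [json.loads(line) for line in jsonl_text.splitlines() if line]
print(records[0]['text_column'])
```

Passing a file path to to_json, as in the question, should write the same text to disk. How the newline is then rendered depends on the tool that reads the file.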
