python-2.7 - Why is ^M added to the last line of the file when writing an output file in Python 2.7?

Question

This is my file, a.tsv:

ENST00000330436 chr4    -       96099729        96125021
ENST00000332884 chr4    -       96518062        96549623
ENST00000651514 chr5    -       145620969       145647819
ENST00000550308 chr17   +       32532671        32551233
ENST00000371270 chr4    -       96294895        96343068^M

I used the following in Python 2.7

with open("a.tsv", 'wb') as f_output:
    tsv_output = csv.writer(f_output, delimiter='\n')
    tsv_output.writerow(output_unique)

to generate the a.tsv file above. I see '^M' at the end of the last line of the file. Where should I change the code above to remove it?

output_unique = [string1, string2, string3]  ---> this is a Python list

score 0 · Accepted Answer

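Your TSV file uses DOS line endings (CRLF), while awk expects POSIX line endings (LF). The problem is that you are writing multiple lines with a single call to writerow; as far as your Python code is concerned, it produced a single row that uses \n to separate its fields. That single row ends with \r\n (the csv module's default row terminator), but awk sees a TSV file whose last line has a last field ending in \r.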
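The row terminator behavior described here can be checked with a minimal sketch. It assumes an in-memory buffer instead of a real file and a tab delimiter instead of the question's '\n'; the helper name write_row is hypothetical and exists only for illustration.

```python
import csv
import io
import sys

# Python 2's csv module writes byte strings; Python 3's writes text.
Buffer = io.BytesIO if sys.version_info[0] == 2 else io.StringIO

row = ['ENST00000371270', 'chr4', '-', '96294895', '96343068']

def write_row(row, **writer_kwargs):
    """Write one row to an in-memory buffer and return the raw output."""
    buf = Buffer()
    csv.writer(buf, delimiter='\t', **writer_kwargs).writerow(row)
    return buf.getvalue()

default_output = write_row(row)                  # default row terminator
lf_output = write_row(row, lineterminator='\n')  # explicit LF terminator

print(repr(default_output))
print(repr(lf_output))
```

The first repr ends in '\r\n' and the second in a bare '\n', so the trailing carriage return comes from the writer's row terminator, not from the data.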

The Python code should look something like this:

output_unique = [
    ['ENST00000330436', 'chr4', '-', '96099729', '96125021'],
    ['ENST00000332884', 'chr4', '-', '96518062', '96549623'],
    ['ENST00000651514', 'chr5', '-', '145620969', '145647819'],
    ['ENST00000371270', 'chr4', '-', '96294895', '96343068'],
]

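import csv

with open("a.tsv", 'w') as f_output:
    # lineterminator='\n' overrides csv.writer's default '\r\n' row ending
    tsv_output = csv.writer(f_output, delimiter='\t', lineterminator='\n')
    tsv_output.writerows(output_unique)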
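To confirm whether a written file still contains carriage returns (the ^M that editors display), the file can be read back in binary mode and the CR bytes counted. This is a rough sketch: count_carriage_returns is a hypothetical helper, and the temporary file with a single CRLF-terminated line stands in for a.tsv.

```python
import os
import tempfile

def count_carriage_returns(path):
    """Return how many CR bytes the file at `path` contains."""
    with open(path, 'rb') as f:
        return f.read().count(b'\r')

# Self-contained demo: a temporary file holding one CRLF-terminated line.
fd, path = tempfile.mkstemp(suffix='.tsv')
try:
    with os.fdopen(fd, 'wb') as f:
        f.write(b"ENST00000371270\tchr4\t-\t96294895\t96343068\r\n")
    cr_count = count_carriage_returns(path)
finally:
    os.remove(path)

print(cr_count)
```

A count of zero for the real output file would indicate that every line ends in a bare LF.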
