
I have the following data:

>P1;gi|467971|gb|AA3.1|

-MASLAALLPLLALLVLCRLDPAQA
QAEPGAGG-LQELALQ---KRGIVE
QCCTSICSLYQLEN---
*
>P1;gi|307072|gb|AAA59179.1|

-MALWMRLLPLLALLALWGPDPAAA
FPK-TR-EAPGAGS-LEGSLQ--KRE
QCCTSICSLYQLENYCN
*
>P1;gi|387059|gb|AAA31.1|

-MALVLALLALWNTNQAFVS-RHLC
FYIPK-DRREG-LQLQ---KRGIVD
QCCTGTCTRHQLQS---
*

In Python, how can I convert this into data that looks like the following?

-MASLAALLPLLALLVLCRLDPAQAQAEPGAGG-LQELALQ---KRGIVEQCCTSICSLYQLEN---,-MALWMRLLPLLALLALWGPDPAAAFPK-TR-EAPGAGS-LEGSLQ--KREQCCTSICSLYQLENYCN,-MALVLALLALWNTNQAFVS-RHLCFYIPK-DRREG-LQLQ---KRGIDQCCTGTCTRHQLQS---


2 Answers


Assuming the data is available in file1.txt, you can use this code:

# Read all lines; the context manager closes the file automatically.
with open(r'C:\Users\kvivek\Desktop\file1.txt', 'r') as file_handle:
    fileContent = file_handle.readlines()

output = ''
for line in fileContent:
    # Skip the header line that starts each record.
    if ">P1;gi" in line:
        continue
    # Drop surrounding whitespace and append the sequence fragment.
    output += line.strip()

# Replace every * with a comma, then strip the trailing comma.
finalOutput = output.replace("*", ",").rstrip(',')
print(finalOutput)
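
If you would rather keep each sequence as a separate item before joining, the same idea can be sketched as a small parser that treats `*` as the end of a record. The function name `sequences_from_lines` and the inline `sample` are illustrative assumptions, not part of the code above; the sketch also assumes every header line starts with `>`.

```python
def sequences_from_lines(lines):
    """Return one concatenated sequence string per record."""
    sequences = []
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and header lines
        if line.endswith("*"):
            parts.append(line[:-1])  # drop the record terminator
            sequences.append("".join(parts))
            parts = []
        else:
            parts.append(line)
    if parts:  # tolerate a final record with no trailing "*"
        sequences.append("".join(parts))
    return sequences


# First two records from the question, used here as sample input.
sample = """>P1;gi|467971|gb|AA3.1|

-MASLAALLPLLALLVLCRLDPAQA
QAEPGAGG-LQELALQ---KRGIVE
QCCTSICSLYQLEN---
*
>P1;gi|307072|gb|AAA59179.1|

-MALWMRLLPLLALLALWGPDPAAA
FPK-TR-EAPGAGS-LEGSLQ--KRE
QCCTSICSLYQLENYCN
*
"""

result = ",".join(sequences_from_lines(sample.splitlines()))
print(result)
```

Because the records stay in a list until the final `join`, there is no trailing comma to strip, and the list can be reused if the sequences are needed individually.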
answered 2013-02-07T17:45:31.877

Where data is your string:

>>> lines = data.replace('*', ',').splitlines()
>>> ''.join(line for line in lines if line and not line.startswith('>')).rstrip(',')

'-MASLAALLPLLALLVLCRLDPAQAQAEPGAGG-LQELALQ---KRGIVEQCCTSICSLYQLEN---,-MALWMRLLPLLALLALWGPDPAAAFPK-TR-EAPGAGS-LEGSLQ--KREQCCTSICSLYQLENYCN,-MALVLALLALWNTNQAFVS-RHLCFYIPK-DRREG-LQLQ---KRGIDQCCTGTCTRHQLQS---'
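
For a version that runs outside the interactive prompt, the two lines above can be wrapped in a function. This is only a sketch: the name `flatten_records` and the shortened `data` string are made up for illustration, and the function assumes no line carries leading or trailing spaces.

```python
def flatten_records(data):
    """Join sequence lines, separating records with commas."""
    # Turn each record terminator into a comma, then split into lines.
    lines = data.replace("*", ",").splitlines()
    # Keep non-empty lines that are not headers, and drop the final comma.
    kept = (line for line in lines if line and not line.startswith(">"))
    return "".join(kept).rstrip(",")


# Shortened, hypothetical input in the same layout as the question's data.
data = ">P1;a\n\n-MAS\nQAE\n*\n>P1;b\n\n-MAL\nFPK\n*\n"
print(flatten_records(data))  # -MASQAE,-MALFPK
```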

answered 2013-02-07T17:28:55.330