
I have a FASTA file whose description lines contain three defined elements.

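The first element, referred to as dato[0], is the one the condition is tested against, and the third element, dato[2], is the one I want to sum. The FASTA description lines look like this: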

PIN4 HOIAQKS02C4SWQ 1761
PIN1 HOIAQKS02D3JZ3 572

I want to sum the dato[2] values separately: one total for the lines where dato[0] == PIN1 and another for the lines where dato[0] == PIN4.

I am using the following code:

from Bio import SeqIO

secuencias=SeqIO.parse("/Users/imac/Desktop/Pruebas_UniFrac/otu1_alpin1+4.fasta", "fasta")

PIN_records=list(SeqIO.parse("/Users/imac/Desktop/Pruebas_UniFrac/otu1_alpin1+4.fasta", "fasta")

archivo1=open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_fin.fasta", "w")
archivo2=open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_seqsotus.fasta", "w")
archivo3=open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_sumas.fasta", "w")

x = 0
y = x+1
for linea in secuencias:
    dato = linea.description.split(" ")
    seqs = str(linea.seq)

    if dato[0] != "PIN1":
        if dato[0] != "PIN4":
            if dato[0] == "consensus":
               archivo1.write("hacia arriba OTU" + str(y) + "\n" + "x" + "\n" + "x" + "\n")
               archivo2.write(">" + "OTU" + str(y) + "\n" + seqs + "\n")
               archivo3.write("fin del OTU" + "\n")
               y = y+1
        else:
         archivo1.write(str(dato[0]) + "," + str(dato[2]) + "\n")
         #num = int(dato[2])
         #archivo3.write("PIN4=" + str(sum(dato[2])) + "\n")
         #archivo3.write("PIN4=%d\n" % sum(dato[2]))
         archivo3.write("PIN4={}\n".format(sum(dato[2])))
    else:
     archivo1.write(str(dato[0]) + "," + str(dato[2]) + "\n")
     #num = int(dato[2])
     #archivo3.write("PIN1=" + str(sum(dato[2])) + "\n")
     #archivo3.write("PIN1=%d\n" % sum(dato[2]))
     archivo3.write("PIN1={}\n".format(sum(dato[2])))

archivo1.close()
archivo2.close()
archivo3.close()

When I run it, I get the following error message:

TypeError: unsupported operand type(s) for +: 'int' and 'str'

How can I do this?

After following the suggestions in the comments, I changed my code, but I still can't get it to work and I don't know how to fix it.

Thank you very much.

With this code, I get the following error:

File "./lectura_msaout_pruebaalpin1+4_final.py", line 16
    archivo1=open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_fin.fasta", "w")
           ^
SyntaxError: invalid syntax 

2 Answers


Your code has two main issues.

  1. You're trying to call sum() on string data.
  2. You're trying to format a numeric value as a string.

Fixing summation

You want to sum an iterable of numeric values, as summing is undefined for string values. You can convert string values to an integer by calling int() on each value (use the map() function to do this).

Example:

>>> sum(["1", "2", "3"])
TypeError: unsupported operand type(s) for +: 'int' and 'str'
>>> sum([1, 2, 3])
6
>>> list(map(int, ["1", "2", "3"]))
[1, 2, 3]
>>> sum(map(int, ["1", "2", "3"]))
6

Application to your code

Do you really want to sum the single digits of dato[2]? It'd look like this:

>>> dato = ['PIN4', 'HOIAQKS02C4SWQ', '1761']
>>> sum(map(int, dato[2]))  # 1 + 7 + 6 + 1
15
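
If the goal is instead to add up the third field across all records that share a label, a minimal sketch could look like the following. It works on plain description strings so that it runs without Biopython; the function name `sum_by_label` is hypothetical, and the third sample line simply repeats the first header for illustration.

```python
from collections import defaultdict


def sum_by_label(descriptions, labels=("PIN1", "PIN4")):
    """Sum the third space-separated field of each description, per label."""
    totals = defaultdict(int)
    for description in descriptions:
        fields = description.split()
        # Skip lines that have fewer than three fields or an unrelated label.
        if len(fields) >= 3 and fields[0] in labels:
            totals[fields[0]] += int(fields[2])  # convert before adding
    return dict(totals)


# Sample description lines (the last one is made up for illustration).
descriptions = [
    "PIN4 HOIAQKS02C4SWQ 1761",
    "PIN1 HOIAQKS02D3JZ3 572",
    "PIN4 HOIAQKS02C4SWQ 1761",
]
totals = sum_by_label(descriptions)
print(totals)
```

With Biopython, the same function could presumably be fed `record.description` for each record returned by `SeqIO.parse`.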

Fixing the string formatting

You can't append an integer to a string (see Python String and Integer concatenation).

The solution is to either convert the integer to a string before concatenating, or to format the integer within a string. In your case, the solutions look like this:

  1. Convert to string:

    archivo3.write("PIN1=" + str(dato_2_sum) + "\n")
    
  2. Use string formatting:

    archivo3.write("PIN1=%d\n" % dato_2_sum)
    
  3. Use new-style formatting:

    archivo3.write("PIN1={}\n".format(dato_2_sum))
    
Answered 2013-07-02T16:49:41.320

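In the end, I solved my problem by creating counters outside the "for" loop and accumulating the totals by hand instead of calling sum(), converting between str and int where needed. My "almost finished" complete code is as follows: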

#!/usr/bin/python

from Bio import SeqIO

sequences = SeqIO.parse("/Users/imac/Desktop/Pruebas_UniFrac/otu1_alpin1+4.fasta", "fasta")

file1 = open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_fin.fasta", "w")
file2 = open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_seqsotus.fasta", "w")
file3 = open("/Users/imac/Desktop/Pruebas_UniFrac/pruebaalpin1+4_sumas.fasta", "w")

# Running totals for the current OTU; reset after each consensus record.
numTotalPin1 = 0
numTotalPin4 = 0

# OTU counter.
y = 1

for line in sequences:
    data = line.description.split(" ")
    seqs = str(line.seq)

    if data[0] == "PIN1":
        file1.write(data[0] + "," + data[2] + "\n")
        numTotalPin1 = numTotalPin1 + int(data[2])
    elif data[0] == "PIN4":
        file1.write(data[0] + "," + data[2] + "\n")
        numTotalPin4 = numTotalPin4 + int(data[2])
    elif data[0] == "consensus":
        # A consensus record closes the current OTU: write its totals and reset.
        file1.write("upstream OTU" + str(y) + "\n" + "x" + "\n" + "x" + "\n")
        file2.write(">" + "OTU" + str(y) + "\n" + seqs + "\n")
        file3.write("OTU" + str(y) + "\n")
        file3.write("PIN1=" + str(numTotalPin1) + "\n")
        file3.write("PIN4=" + str(numTotalPin4) + "\n")
        file3.write("end of OTU" + str(y) + "\n")
        y = y + 1
        numTotalPin1 = 0
        numTotalPin4 = 0

file1.close()
file2.close()
file3.close()
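
The per-OTU accumulation can also be sketched without Biopython or file output, which makes the reset logic easier to check in isolation. This is only an illustration: the function name `totals_per_otu` and the sample descriptions are hypothetical, and it assumes that a consensus record's description starts with the word "consensus".

```python
def totals_per_otu(descriptions):
    """Return one dict of PIN1/PIN4 totals per OTU, closed by each consensus line."""
    results = []
    pin1 = pin4 = 0
    for description in descriptions:
        data = description.split(" ")
        if data[0] == "PIN1":
            pin1 += int(data[2])
        elif data[0] == "PIN4":
            pin4 += int(data[2])
        elif data[0] == "consensus":
            # The consensus line ends the current OTU: store totals and reset.
            results.append({"OTU": len(results) + 1, "PIN1": pin1, "PIN4": pin4})
            pin1 = pin4 = 0
    return results


# Hypothetical description lines; "consensus" marks the end of each OTU.
sample = [
    "PIN4 HOIAQKS02C4SWQ 1761",
    "PIN1 HOIAQKS02D3JZ3 572",
    "consensus",
    "PIN1 HOIAQKS02D3JZ3 572",
    "consensus",
]
otu_totals = totals_per_otu(sample)
for entry in otu_totals:
    print(entry)
```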

I hope someone finds this code helpful. Thank you for your help.

Answered 2013-07-09T10:30:23.493