Based on the Biopython help page here, I can slice out the first or last 10 columns of an alignment, and I can even piece together a sub-alignment using

align[:, :10] + align[:, -10:]

where align is an MSA object generated using

from Bio import AlignIO
align = AlignIO.read("Clustalw/opuntia.aln", "clustal")

But is it possible to extract columns based on a list of positions? For example, if I have the following list:

a=[12, 52, 68,45]

Is there a way to extract just these columns from the alignment align?

The R package bio3d can filter an alignment by taking a list of positions as input (filtered_align = align[, a]), but it would be great if I could do this from Python.

Thank you

1 Answer

According to the Biopython docs, you can get column x with

align[:, x]

So the following should do the job for you:

from Bio import AlignIO

align = AlignIO.read("Clustalw/opuntia.aln", "clustal")
indices = [12, 52, 68, 45]  # 0-based column positions

# align[:, i] returns column i as a string, one character per sequence
columns_as_strings = [align[:, column] for column in indices]
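
The code above collects each column as a plain string. If you want a sub-alignment object instead, closer to what bio3d's align[, a] gives you, one option is to take single-column slices and join them with +, the same way the question combines align[:, :10] + align[:, -10:]. The following is only a minimal sketch: select_columns is a hypothetical helper name, not part of Biopython, and it assumes the positions are 0-based (R indices are 1-based, so subtract 1 from positions taken from bio3d) and non-negative.

```python
from functools import reduce
from operator import add


def select_columns(alignment, positions):
    """Return a sub-alignment made of the given columns, in the given order.

    Hypothetical helper, not a Biopython function. `alignment` is expected
    to behave like Bio.Align.MultipleSeqAlignment: align[:, i:j] returns a
    sub-alignment, and two alignments with the same rows can be joined with +.
    """
    if not positions:
        raise ValueError("positions must contain at least one column index")
    if any(i < 0 for i in positions):
        # A slice such as -1:0 would be empty, so negative indices are rejected.
        raise ValueError("positions must be non-negative, 0-based indices")

    # A one-column slice (i:i+1) keeps the alignment type,
    # whereas align[:, i] would return a plain string.
    single_columns = [alignment[:, i:i + 1] for i in positions]

    # Concatenate the one-column alignments left to right.
    return reduce(add, single_columns)
```

With the alignment from the question, select_columns(align, [12, 52, 68, 45]) should give an alignment with four columns, in that order.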
Answered 2014-05-19T19:36:46.007