I have a BAM file that looks like this:
samtools view pingpon.forward.bam | head
K00311:84:HYCNTBBXX:1:1123:2909:4215 0 LQNS02000001.1:55-552 214 28M * 0 0 TCTAGTTCAACTGTAAATCATCCTGCCC AAFFFJJJJJJJJJJJJJJJJJJJJJJJ AS:i:-6 XS:i:-6 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:9T18 YT:Z:UU
K00311:84:HYCNTBBXX:1:1123:2909:4215 0 LQNS02000001.1:55-552 214 28M * 0 0 TCTAGTTCAACTGTAAATCATCCTGCCC AAFFFJJJJJJJJJJJJJJJJJJJJJJJ AS:i:-6 XS:i:-6 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:9T18 YT:Z:UU
K00311:84:HYCNTBBXX:1:1123:2909:4215 0 LQNS02000001.1:55-552 214 28M * 0 0 TCTAGTTCAACTGTAAATCATCCTGCCC AAFFFJJJJJJJJJJJJJJJJJJJJJJJ AS:i:-6 XS:i:-6 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:9T18 YT:Z:UU
K00311:84:HYCNTBBXX:1:1123:2909:4215 0 LQNS02000001.1:55-552 214 28M * 0 0 TCTAGTTCAACTGTAAATCATCCTGCCC AAFFFJJJJJJJJJJJJJJJJJJJJJJJ AS:i:-6 XS:i:-6 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:9T18 YT:Z:UU
K00311:84:HYCNTBBXX:1:1123:2909:4215 0 LQNS02000001.1:55-552 214 28M * 0 0 TCTAGTTCAACTGTAAATCATCCTGCCC AAFFFJJJJJJJJJJJJJJJJJJJJJJJ AS:i:-6 XS:i:-6 XN:i:0 XM:i:1 XO:i:0 XG:i:0 NM:i:1 MD:Z:9T18 YT:Z:UU
I also have another file containing the IDs I am interested in, like this:
K00311:84:HYCNTBBXX:1:2223:15798:5692
K00311:84:HYCNTBBXX:2:2211:11414:30696
K00311:84:HYCNTBBXX:2:2223:28879:41581
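Ideally, I would like to extract from the BAM file the lines whose first field (the read name) is one of the IDs in the ID file. I am currently using the code below, which I wrote, but it does not work. Any help would be greatly appreciated! Thanks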
import pysam
import re
forward = pysam.AlignmentFile('pingpon.forward.bam', "rb")
reverse = pysam.AlignmentFile('pingpon.reverse.bam', "rb")
ids = open("IDs_results_bed_reverse.txt", "w")
for line in reverse:
    if re.match("(.*)(I|i)ds(.*)", line):
        print(line)
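For reference, three things in the snippet above are likely to cause trouble: opening the ID file with "w" truncates it instead of reading it, re.match expects a string but receives a pysam read object, and the pattern looks for the literal text "ids" rather than the IDs stored in the file. A minimal sketch of one possible approach follows, assuming the ID file holds one read name per line and that an exact match on the read name is what is wanted. The function names (load_ids, filter_reads, extract) and the output file name are hypothetical, chosen only for illustration.

```python
def load_ids(path):
    """Read one read ID per line into a set, skipping blank lines."""
    with open(path) as handle:  # read mode; "w" would empty the file
        return {line.strip() for line in handle if line.strip()}


def filter_reads(reads, wanted):
    """Yield the reads whose query_name is in the wanted set."""
    for read in reads:
        if read.query_name in wanted:
            yield read


def extract(bam_path, ids_path, out_path):
    """Copy reads named in ids_path from bam_path into a new BAM at out_path."""
    import pysam  # imported here so the helpers above work without pysam

    wanted = load_ids(ids_path)
    with pysam.AlignmentFile(bam_path, "rb") as bam, \
         pysam.AlignmentFile(out_path, "wb", template=bam) as out:
        for read in filter_reads(bam, wanted):
            out.write(read)


# Hypothetical usage; the output name is an assumption:
# extract("pingpon.reverse.bam", "IDs_results_bed_reverse.txt", "reverse.selected.bam")
```

Looking names up in a set keeps each check cheap, so the BAM file is read only once regardless of how many IDs are listed.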