I have a file (usearch.txt) with entries that look like this:

0 AM158981
0 AM158980
0 AM158982
etc.

I want to replace the accession numbers in this file (AM158981, etc.) with the corresponding bacteria names, which are in a second file (acs.txt):

AM158981 Brucella pinnipedialis Brucellaceae
AM158980 Brucella suis Brucellaceae
AM158982 Brucella ceti Brucellaceae
etc.

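My plan is to build a dictionary from the second file (accession number as key, name as value), then open the first file, use the dictionary to replace the accession numbers, and save the result to a new file (done.txt):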

#! /usr/bin/env python
import re
# Creates a dictionary for accession numbers

fname = r"acs.txt"

namer = {}
for line in open(fname):
        acs, name = line.split(" ",1)
        namer[acs] = str(name)

infilename = "usearch.txt"
outfilename = "done.txt"

regex = re.compile(r'\d+\s(\w+)')

with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        x = regex.sub(r'\1', namer(name), line)

        outfile.write(x) 

When I run this script, I get this error:

Traceback (most recent call last):
  File "nameit.py", line 21, in <module>
  x = regex.sub(r'\1', namer(name), line)
  TypeError: 'dict' object is not callable

Ideally, my "done.txt" file would look like this:

0 Brucella pinnipedialis Brucellaceae
0 Brucella suis Brucellaceae
0 Brucella ceti Brucellaceae

1 Answer

You are trying to call namer as if it were a function:

x = regex.sub(r'\1', namer(name), line)

Replace the parentheses with square brackets to look up the element with the key name:

x = regex.sub(r'\1', namer[name], line)
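
As a quick illustration of the difference, here is a minimal sketch using a one-entry dictionary built from the sample data in the question. The try/except and the error_message variable are only there for the demonstration and are not part of the original script:

```python
namer = {"AM158981": "Brucella pinnipedialis Brucellaceae"}
name = "AM158981"

# Calling the dict like a function raises the TypeError from the question.
try:
    namer(name)
except TypeError as exc:
    error_message = str(exc)

# Subscripting with square brackets looks the key up instead.
value = namer[name]

print(error_message)
print(value)
```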

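Note that you also need to read the name from each line, otherwise you would use the same key over and over. Also, the sub() method of a compiled pattern takes the arguments (repl, string), so the call above, which passes line as a third argument, would raise a different TypeError. Since the ID is already split out of the line, a plain str.replace is enough:

with open(infilename, 'r') as infile, open(outfilename, 'w') as outfile:
    for line in infile:
        # Need to get the ID for the bacteria in question. If we don't, everything
        # will end up with the same name in our output file.
        _, name = line.split(" ", 1)

        # Strip the newline character
        name = name.strip()

        # Swap the ID for its name. The values in namer still carry the
        # trailing newline from acs.txt, so strip that as well.
        x = line.replace(name, namer[name].strip())
        outfile.write(x)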
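
Putting the pieces together, the whole task can be sketched without a regex at all. This is only one possible approach, not the code from the question: the helper names load_names and replace_accessions are made up for this example, and it assumes that malformed lines and unknown accession numbers should pass through unchanged. The sample lines are the ones from the question.

```python
def load_names(lines):
    """Map each accession number to the name that follows it."""
    namer = {}
    for line in lines:
        parts = line.strip().split(" ", 1)
        if len(parts) == 2:
            namer[parts[0]] = parts[1]
    return namer


def replace_accessions(lines, namer):
    """Yield each line with its accession number swapped for the name."""
    for line in lines:
        parts = line.strip().split(" ", 1)
        if len(parts) != 2:
            # Not in "<number> <accession>" form: pass it through (assumption).
            yield line.rstrip("\n") + "\n"
            continue
        number, acs = parts
        # Unknown accession numbers are left as they are (assumption).
        yield "{} {}\n".format(number, namer.get(acs, acs))


acs_lines = [
    "AM158981 Brucella pinnipedialis Brucellaceae\n",
    "AM158980 Brucella suis Brucellaceae\n",
    "AM158982 Brucella ceti Brucellaceae\n",
]
usearch_lines = ["0 AM158981\n", "0 AM158980\n", "0 AM158982\n"]

namer = load_names(acs_lines)
result = list(replace_accessions(usearch_lines, namer))
print("".join(result), end="")
```

Both helpers accept any iterable of lines, so open file objects for acs.txt and usearch.txt can be passed in place of the sample lists, and the yielded lines can be written to done.txt with outfile.writelines().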
answered 2013-05-30T17:59:35.713