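Hi, I'm trying to search a file for a specific list of words. If one of the words is found, I want to add a new line below it containing the phrase /colour = 1 (I don't want to delete the original word I'm searching for).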

An extract of the file for context and format:
LOCUS       contig_2_pilon_pilon 5558986 bp    DNA     linear   BCT 16-JUN-2020
DEFINITION  Escherichia coli O157:H7 strain (270078)
ACCESSION   
VERSION
KEYWORDS    .
SOURCE      Escherichia coli 270078
  ORGANISM  Escherichia coli 270078
            Bacteria; Proteobacteria; gamma subdivision; Enterobacteriaceae;
            Escherichia.
COMMENT     Annotated using prokka 1.14.6 from
            https://github.com/tseemann/prokka.
FEATURES             Location/Qualifiers
     source          1..5558986
                     /organism="Escherichia coli 270078"
                     /mol_type="genomic DNA"
                     /strain="strain"
                     /db_xref="taxon:562"
     CDS             61523..61744
                     /gene="pspD"
                     /locus_tag="JCCJNNLA_00057"
                     /inference="ab initio prediction:Prodigal:002006"
                     /inference="similar to AA sequence:RefSeq:EG10779-MONOMER"
                     /codon_start=1
                     /transl_table=11
                     /product="peripheral inner membrane heat-shock protein"
                     /translation="MNTRWQQAGQKVKPGFKLAGKLVLLTALRYGPAGVAGWAIKSVA
                     RRPLKMLLAVALEPLLSRAANKLAQRYKR"

This is one of the lists of words I'm searching for throughout the file:

regulation_list=["anti-repressor","anti-termination","antirepressor","antitermination","antiterminator","anti-terminator","cold-shock","cold shock","heat-shock","heat shock","regulation","regulator","regulatory","helicase","antibiotic resistance","repressor","zinc","sensor","dipeptidase","deacetylase","5-dehydrogenase","glucosamine kinase","glucosamine-kinase","dna-binding","dna binding","methylase","sulfurtransferase","acetyltransferase","control","ATP-binding","ATP binding","Cro","Ren protein","CII","inhibitor","activator","derepression","protein Sxy","sensing","sensor","Tir chaperone","Tir-cytoskeleton","Tir cytoskeleton","Tir protein","EspD"]

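As you can see, the extract contains one of the phrases I'm looking for ("heat-shock", in the /product line), and I'd like to add a new line below it with the phrase /colour = 1.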

Any help would be great!

1 Answer

# Create simple input file for testing:
cat > foo.txt <<EOF
foo
foo anti-termination
bar anti-repressor anti-termination
baz
EOF

python -c '
import re

# Using a shortened version of your list:
regulation_list=["anti-repressor", "anti-termination", "etc"]

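# For speed and simplicity, compile the regular expression once, then reuse it later.
# re.escape makes each term match literally:
regulation_re = re.compile("|".join(re.escape(term) for term in regulation_list))

with open("foo.txt", "r") as in_file:
    for line in in_file:
        # Remove only the trailing newline so leading indentation is preserved:
        line = line.rstrip("\n")
        print(line)
        if regulation_re.search(line):
            print("/colour = 1")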
' > bar.txt

cat bar.txt

Prints:

foo
foo anti-termination
/colour = 1
bar anti-repressor anti-termination
/colour = 1
baz

You may want to add an extra newline and extra spaces to the string so that /colour = 1 is aligned (your question is not clear on this), like so:

print("\n                     /colour = 1")
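
For a file laid out like the extract in the question, the same idea can be wrapped in a small function. This is only a sketch under a few assumptions: the helper name `add_colour` and the constant `QUALIFIER_INDENT` are made up for illustration, qualifier lines are assumed to be indented by 21 spaces as in the extract, and no blank line is inserted before the new qualifier. The sample lines are taken from the question.

```python
import re

# Shortened version of the list from the question.
regulation_list = ["anti-repressor", "anti-termination", "heat-shock", "heat shock"]

# Escape each term so it is matched literally, then compile the pattern once.
regulation_re = re.compile("|".join(re.escape(term) for term in regulation_list))

# Assumption: qualifier lines are indented by 21 spaces, as in the extract.
QUALIFIER_INDENT = " " * 21


def add_colour(lines, pattern=regulation_re, indent=QUALIFIER_INDENT):
    """Return the lines with an indented '/colour = 1' line after each match."""
    out = []
    for line in lines:
        # Drop only the trailing newline so the leading spaces survive.
        line = line.rstrip("\n")
        out.append(line)
        if pattern.search(line):
            out.append(indent + "/colour = 1")
    return out


# Three qualifier lines copied from the extract in the question.
sample = [
    '                     /codon_start=1\n',
    '                     /transl_table=11\n',
    '                     /product="peripheral inner membrane heat-shock protein"\n',
]
print("\n".join(add_colour(sample)))
```

To process a whole file, pass the open file object in place of `sample` and write the returned lines to a new file. Note that the match is case-sensitive and is made line by line, so a term that is split across a wrapped qualifier value would not be found.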
Answered 2020-08-21T15:59:57.057