python - How to match a file against a template in Python

Question

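I want to match many files against a few common templates and extract the differences. I would like advice on the best way to do this. For example:

Template A: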

<1000 text lines that have to match>
a=?
b=2
c=3
d=?
e=5
f=6
<more text>

Template B:

<1000 different text lines that have to match>
h=20
i=21
j=?
<more text>
k=22
l=?
m=24
<more text>

If I pass in file C:

<1000 text lines that match A>
a=500
b=2
c=3
d=600
e=5
f=6
<more text>

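I want a simple way to say that this matches template A and to extract "a=500" and "d=600".

I could match these with a regular expression, but the files are fairly large, and building that regex would be painful.

I have also tried difflib, but parsing the opcodes and extracting the differences does not seem optimal.

Does anyone have a better suggestion?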

score 3 · Accepted Answer

You may need to adjust this a little to handle the additional text, since I don't know the exact format, but it shouldn't be too hard.

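with open('templ.txt') as templ, open('in.txt') as f:
    # Keys whose value is a '?' placeholder in the template
    items = [i.strip().split('=')[0] for i in templ if '=?' in i]
    # key -> value for every 'key=value' line of the input file
    d = dict(i.strip().split('=', 1) for i in f if '=' in i)
    print([(i, d[i]) for i in items if i in d])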

Output:

[('a', '500'), ('d', '600')]  # With template A
[]                            # With template B

Or, if the template and the input file are aligned line for line:

from itertools import compress
with open('templ.txt') as templ, open('in.txt') as f:
    # Keep only the input lines whose template counterpart contains '=?'
    print(list(map(str.strip, compress(f, ('=?' in x for x in templ)))))

Output:

['a=500', 'd=600']
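
Neither snippet above checks that the non-placeholder lines actually match the template, which the question asks for. A minimal sketch of one way to add that check, assuming the template and the input are aligned line for line and that a template line ending in '=?' accepts any value for that key. The function name extract_placeholders and the sample lists are made up for illustration:

```python
def extract_placeholders(template_lines, input_lines):
    """Return the filled-in 'key=value' lines, or None if the input does not match.

    Assumes both sequences are aligned line for line and that a template
    line ending in '=?' accepts any value for that key.
    """
    template_lines = [line.strip() for line in template_lines]
    input_lines = [line.strip() for line in input_lines]
    if len(template_lines) != len(input_lines):
        return None
    found = []
    for t, line in zip(template_lines, input_lines):
        if t.endswith('=?'):
            prefix = t[:-1]  # drop the '?', keep 'key='
            if not line.startswith(prefix):
                return None
            found.append(line)
        elif t != line:
            # A fixed line differs, so this is not the right template
            return None
    return found


# Hypothetical, shortened stand-ins for template A and file C
template_a = ["header", "a=?", "b=2", "c=3", "d=?", "e=5", "f=6", "footer"]
file_c = ["header", "a=500", "b=2", "c=3", "d=600", "e=5", "f=6", "footer"]
print(extract_placeholders(template_a, file_c))  # ['a=500', 'd=600']
```

Since open file objects iterate over lines, they can be passed in directly in place of the lists. Trying each template in turn and keeping the first result that is not None identifies which template a file matches.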

score 0 · Accepted Answer

Ignoring performance:

Load everything into dicts, so that you have e.g. A = {'a': '?', 'b': 2, ...}, B = {'h': 20, 'i': 21, ...}, C = {'a': 500, 'b': 2, ...}
If A.keys() == C.keys(), you know that C matches A.
Then simply diff the two dicts.

Refine as needed.
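
The dict approach might be sketched roughly as follows. The helper names parse_pairs and diff_against_template are hypothetical, values are kept as strings rather than converted to integers, and any line without an '=' is ignored, so the free-text lines are not compared here. Checking the fixed values in addition to the keys goes slightly beyond the steps described above:

```python
def parse_pairs(lines):
    """Collect 'key=value' lines into a dict; lines without '=' are ignored."""
    pairs = {}
    for line in lines:
        key, sep, value = line.strip().partition('=')
        if sep:
            pairs[key] = value
    return pairs


def diff_against_template(template, candidate):
    """Return the placeholder values if candidate matches template, else None."""
    if template.keys() != candidate.keys():
        return None
    extracted = {}
    for key, expected in template.items():
        if expected == '?':
            extracted[key] = candidate[key]
        elif candidate[key] != expected:
            # Same keys, but a fixed value differs
            return None
    return extracted


A = parse_pairs(["a=?", "b=2", "c=3", "d=?", "e=5", "f=6"])
B = parse_pairs(["h=20", "i=21", "j=?", "k=22", "l=?", "m=24"])
C = parse_pairs(["a=500", "b=2", "c=3", "d=600", "e=5", "f=6"])

print(diff_against_template(A, C))  # {'a': '500', 'd': '600'}
print(diff_against_template(B, C))  # None
```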
