You can create a subclass of str that hashes and compares as if it contained only its ID:
import re


class IdString(str):
    """A string that hashes and compares on its id.

    >>> hash(IdString('XXX ID A XXX')) == hash('A')
    True
    >>> hash(IdString('XXX ID abc XXX')) == hash('abc')
    True
    >>> IdString('XXX ID A XXX') == IdString('YYY ID A YYY')
    True
    >>> IdString('XXX ID A XXX') == IdString('XXX ID B XXX')
    False
    """

    def __new__(cls, *args):
        self = super(IdString, cls).__new__(cls, *args)
        # The ID is the word that follows the literal token "ID".
        m = re.search(r'\bID (\w+)', self)
        if m is None:
            raise ValueError('no ID found in %r' % str(self))
        self.id = m.group(1)
        return self

    def __hash__(self):
        # Must agree with __eq__: equal IDs give equal hashes.
        return hash(self.id)

    def __eq__(self, other):
        if not isinstance(other, IdString):
            return NotImplemented
        return self.id == other.id

    def __ne__(self, other):
        if not isinstance(other, IdString):
            return NotImplemented
        return self.id != other.id
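To see what this buys you, here is a minimal sketch of how such objects behave in hash-based containers, which matters because difflib requires its sequence elements to be hashable. The class is a condensed copy repeated only so the snippet runs on its own, and the names old, new, same, unique and shown are made up for illustration.

```python
import re


class IdString(str):
    """Condensed copy of the class above, repeated so this snippet runs alone."""

    def __new__(cls, *args):
        self = super().__new__(cls, *args)
        self.id = re.search(r'\bID (\w+)', self).group(1)
        return self

    def __hash__(self):
        return hash(self.id)

    def __eq__(self, other):
        return self.id == other.id

    def __ne__(self, other):
        return self.id != other.id


old = IdString('T0 ID A\n')
new = IdString('T5 ID A\n')

# Same ID, so the two lines compare equal and collapse in a set.
same = old == new
unique = {old, new}

# The full text is still there: concatenation uses the underlying str data,
# which is why the diff output can still show the complete original line.
shown = '-' + old
```

The two lines differ in their text, yet count as one element, while the original text remains available for display.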
Then you can turn your ordinary strings into IdString objects and pass them to difflib, like this:
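from difflib import unified_diff

text1 = '''T0 ID A
T1 ID B
T2 ID C
T4 ID D
'''
text2 = '''T5 ID A
T6 ID E
T7 ID F
T8 ID D
'''

# difflib indexes its inputs, so build lists rather than passing map() results
# (on Python 3, map() returns an iterator that cannot be indexed).
lines1 = [IdString(line) for line in text1.splitlines(True)]
lines2 = [IdString(line) for line in text2.splitlines(True)]
print(''.join(unified_diff(lines1, lines2, n=0)))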
This produces almost the output you wanted:
---
+++
@@ -2,2 +2,2 @@
-T1 ID B
-T2 ID C
+T6 ID E
+T7 ID F
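(The example in your question says @-1,2 +1,2, but I can't reproduce that exactly because I don't know which diff style that is, and line numbers start at 1 in diff output.)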
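For reference, a unified-diff hunk header has the form @@ -start,count +start,count @@, with 1-based start lines, and a count of 1 may be omitted. A small sketch for reading such a header follows; parse_hunk_header and HUNK_RE are hypothetical names introduced here, not part of difflib.

```python
import re

# Hypothetical helper pattern: matches "@@ -start[,count] +start[,count] @@".
HUNK_RE = re.compile(r'^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@')


def parse_hunk_header(line):
    """Return (old_start, old_count, new_start, new_count) for a hunk header."""
    m = HUNK_RE.match(line)
    if m is None:
        raise ValueError('not a unified-diff hunk header: %r' % line)
    old_start, old_count, new_start, new_count = m.groups()
    # An omitted count means the range covers exactly one line.
    return (
        int(old_start),
        int(old_count) if old_count is not None else 1,
        int(new_start),
        int(new_count) if new_count is not None else 1,
    )


header = parse_hunk_header('@@ -2,2 +2,2 @@')
```

For the output above, this reads as: two lines starting at line 2 of the first text are replaced by two lines starting at line 2 of the second.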