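I have two text files. Their content is the same, but the formatting differs. In one of the files there are extra spaces between words or letters, and the line breaks are different as well. For example:
File 1: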
The annotation framework we presented is
embedded in the Knowledge Management and
Acquisition Platform Semantic Turkey (Pazienza, et
al., 2012), and comes out-the-box with a few
annotation families which differ in the underlying
annotation model and, notably, in the tasks they
support. The default handlers take into consideration
the annotation of atomic ontological resources, and
complex activities that are provided as macros, e.g.
the creation of new instances, the definition of new
subclasses in OWL, or of narrower concepts in
SKOS.
File 2:
Theannotationframework we presented is
embedded in th e K n o w l e d ge Management and
Acquisition Platform Semantic Turkey (Pazienza, et
al., 2012), and comes out-the-
box with a few
annotation families which differ in the underlying
annotation model and, notably, in the tasks they
support. The default handlers take into consideration
the a n n o t a t i o n o f a t o m i c ontological resources, and
complex activities that are provided as macros, e.g.
the creation of new instances, the definition of new
subclasses in OWL, or of narrower concepts in
SKOS.
Suppose I select the string "the Knowledge Management" from File 1, and I want to match it against the string "th e K n o w l e d ge Management" in File 2.
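How can I achieve this? There is no fixed pattern to the malformations in the second file. The only thing that is certain is that the characters appear in the same order in both files; they may be separated by extra spaces, or the spaces between them may be missing.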
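I was thinking of applying Sellers' algorithm or the Viterbi algorithm, but I'm not sure. Approximate string matching could also be expensive.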
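If the only differences really are whitespace (spaces and line breaks inserted or dropped), one cheaper possibility is to avoid approximate matching altogether: strip all whitespace from both strings, do an exact search, and map the hit back to positions in the original text. The following is a minimal sketch under that assumption; the function names `strip_ws_with_map` and `find_ignoring_whitespace` are hypothetical, and the match is case-sensitive.

```python
def strip_ws_with_map(text):
    """Return text with all whitespace removed, plus a list giving,
    for each kept character, its index in the original text."""
    chars, index_map = [], []
    for i, ch in enumerate(text):
        if not ch.isspace():
            chars.append(ch)
            index_map.append(i)
    return "".join(chars), index_map


def find_ignoring_whitespace(needle, haystack):
    """Find the first occurrence of needle in haystack, ignoring all
    whitespace in both. Returns (start, end) slice bounds into the
    original haystack, or None if there is no match."""
    compact_needle = "".join(needle.split())
    if not compact_needle:
        return None
    compact_haystack, index_map = strip_ws_with_map(haystack)
    pos = compact_haystack.find(compact_needle)
    if pos == -1:
        return None
    start = index_map[pos]
    end = index_map[pos + len(compact_needle) - 1] + 1
    return start, end


# First two lines of File 2 from the example above.
file2 = (
    "Theannotationframework we presented is\n"
    "embedded in th e K n o w l e d ge Management and\n"
)

span = find_ignoring_whitespace("the Knowledge Management", file2)
if span is not None:
    print(repr(file2[span[0]:span[1]]))
    # prints 'th e K n o w l e d ge Management'
```

Because line breaks count as whitespace here, a selection that spans a differently wrapped line (such as "out-the-" / "box") is handled the same way. If the files can also differ in actual characters, this exact search is no longer enough and an approximate method would be needed.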
Any pointers would help. Many thanks!