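I have a text file containing only 35 strings, and I want to find the string in it that is most relevant to a given input. How can I implement BM25F, VSM, or POS to find it?

For example: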

Panoramio Bahawalpur
... - Bahawalpur - Picture of Bahawalpur, Punjab Province - TripAdvisor
... Minister Syed Yousaf Raza Gillani’s short visit to 
Bahawalpur
Bahawalpur Station Pictures - Pakistan in Photos
Noor Mahal Station , Bahawalpur Railway Station | Noor Mahal the italian style palac ...
Bahawalpur Railway Pakistan
Nur Mehal, Bahawalpur  

The given input is: Bahawalpur Railway Station

How do I find the most suitable/relevant string?

1 Answer

This is a very simple task that you can do with difflib:

from difflib import SequenceMatcher

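It returns a similarity ratio between 0 and 1, which you can multiply by 100 to get the percentage by which your strings match: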

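def similar(a, b):
    # ratio() returns a float in [0, 1]; 1.0 means the strings are identical.
    return SequenceMatcher(None, a, b).ratio()

# Avoid naming the variable `str`, which would shadow the built-in type.
text = "This is hello-hi image"

# The comparison is case-sensitive, so "Hello" and "hello" differ in one character.
print("The relevance score is:", similar("Hello", text) * 100)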
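To apply this to the question, one option is to score every candidate string against the input and sort by score. This is only a minimal sketch: it assumes the strings are already loaded into a list, and the `rank` helper and the lowercasing step are illustrative choices of mine, not part of difflib.

```python
from difflib import SequenceMatcher

def similar(a, b):
    # Compare case-insensitively; SequenceMatcher itself is case-sensitive.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def rank(query, candidates):
    # Hypothetical helper: return (score, candidate) pairs, best match first.
    return sorted(((similar(query, c), c) for c in candidates), reverse=True)

candidates = [
    "Panoramio Bahawalpur",
    "Bahawalpur Station Pictures - Pakistan in Photos",
    "Bahawalpur Railway Pakistan",
    "Nur Mehal, Bahawalpur",
]

for score, candidate in rank("Bahawalpur Railway Station", candidates):
    print(f"{score:.2f}  {candidate}")
```

Keep in mind that SequenceMatcher compares character sequences, so long strings with a lot of extra text may score lower than short ones even when they contain all the query words.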

You can adapt the result to your own requirements. Thanks.
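If word overlap matters more than character overlap, a very small vector space model (VSM) may fit better. The sketch below is an assumption-laden illustration, not a full implementation: it uses raw term counts rather than TF-IDF weights, and the `tokenize` and `cosine` names are my own. BM25F would additionally need per-field weights and length normalization, which are not shown here.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase the text and keep runs of letters/digits as terms.
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(query, document):
    # Cosine similarity between raw term-count vectors of the two strings.
    q, d = Counter(tokenize(query)), Counter(tokenize(document))
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

candidates = [
    "Panoramio Bahawalpur",
    "Bahawalpur Station Pictures - Pakistan in Photos",
    "Bahawalpur Railway Pakistan",
    "Nur Mehal, Bahawalpur",
]

# Pick the candidate whose term vector is closest to the query's.
best = max(candidates, key=lambda c: cosine("Bahawalpur Railway Station", c))
print(best)
```

With only 35 strings, scoring every candidate like this is cheap, so no index is needed.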

Answered 2017-06-17T09:06:27.177