from difflib import get_close_matches

mystring = "walk walked walking talk talking talks talked fly flying"
list_of_words = ["walk", "talk", "fly"]
sp = mystring.split()
for x in list_of_words:
    # keep only the close matches that actually contain the word
    li = [y for y in get_close_matches(x, sp, cutoff=0.5) if x in y]
    print('%-7s %d in %-10s' % (x, len(li), li))
Output:
walk    2 in ['walk', 'walked']
talk    3 in ['talk', 'talks', 'talked']
fly     2 in ['fly', 'flying']
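One caveat: get_close_matches returns at most n matches, and n defaults to 3, so the counts above are capped (that is why 'walking' and 'talking' are missing). A minimal sketch, assuming you want every candidate above the cutoff to be considered, is to pass n=len(sp). The helper name count_matches is only illustrative, not part of difflib:

```python
from difflib import get_close_matches

mystring = "walk walked walking talk talking talks talked fly flying"
list_of_words = ["walk", "talk", "fly"]
sp = mystring.split()

def count_matches(word, candidates, cutoff=0.5):
    # Hypothetical helper: n=len(candidates) lifts the default cap of 3 results
    close = get_close_matches(word, candidates, n=len(candidates), cutoff=cutoff)
    # keep only the candidates that actually contain the word
    return [c for c in close if word in c]

counts = {w: count_matches(w, sp) for w in list_of_words}
for w, found in counts.items():
    print('%-7s %d in %s' % (w, len(found), found))
```

With the cap lifted, 'walking' and 'talking' are picked up as well, so the counts for walk and talk each go up by one.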
The cutoff refers to the same ratio that SequenceMatcher computes:
from difflib import SequenceMatcher

sq = SequenceMatcher(None)
for x in list_of_words:
    for w in sp:
        sq.set_seqs(x, w)  # compare x against each word in the string
        print('%-7s %-10s %f' % (x, w, sq.ratio()))
Output:
walk    walk       1.000000
walk    walked     0.800000
walk    walking    0.727273
walk    talk       0.750000
walk    talking    0.545455
walk    talks      0.666667
walk    talked     0.600000
walk    fly        0.285714
walk    flying     0.200000
talk    walk       0.750000
talk    walked     0.600000
talk    walking    0.545455
talk    talk       1.000000
talk    talking    0.727273
talk    talks      0.888889
talk    talked     0.800000
talk    fly        0.285714
talk    flying     0.200000
fly     walk       0.285714
fly     walked     0.222222
fly     walking    0.200000
fly     talk       0.285714
fly     talking    0.200000
fly     talks      0.250000
fly     talked     0.222222
fly     fly        1.000000
fly     flying     0.666667