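I want to use NLTK and WordNet to understand the semantic relation between two words. For example, if I input "employee" and "waiter", it should return something indicating that employee is more general than waiter. Or, for "employee" and "worker", it should return equal. Does anyone know how to do this?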
1 Answer
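First, you have to solve the problem of mapping a word to its lemma and then to a synset, i.e. how do you identify the synset for a given word?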
word => lemma => lemma.pos.sense => synset
Waiters => waiter => 'waiter.n.01' => wn.synset('waiter.n.01')
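So, assuming you have already dealt with the problem above and arrived at the right-most representation of waiter, you can go on to compare synsets. Note that a word can have many synsets.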
from nltk.corpus import wordnet as wn

waiter = wn.synset('waiter.n.01')
employee = wn.synset('employee.n.01')

def hyponym_lemma_names(synset):
    # Lemma names of every transitive hyponym of the synset, with underscores as spaces.
    # In NLTK 3, lemma_names() is a method (it was an attribute in NLTK 2).
    return {w.replace("_", " ") for s in synset.closure(lambda s: s.hyponyms()) for w in s.lemma_names()}

all_hyponyms_of_waiter = hyponym_lemma_names(waiter)
all_hyponyms_of_employee = hyponym_lemma_names(employee)

if 'waiter' in all_hyponyms_of_employee:
    print('employee more general than waiter')
elif 'employee' in all_hyponyms_of_waiter:
    print('waiter more general than employee')
else:
    print("WordNet does not place employee and waiter on the same hypernym path")
Answered 2013-03-27T04:08:20.490