The goal is to extract the subtree (phrase) from a sentence whenever an 'nsubj' dependency is present in it.
This is the code I am using:
import spacy

# 'en' is a spaCy v2 shortcut; spaCy v3+ needs a full model name such as 'en_core_web_sm'
nlp = spacy.load('en')
piano_doc = nlp('The alarm clock is, to many high school students, a wailing monstrosity whose purpose is to torture all who are sleep-deprived')
for token in piano_doc:
    if token.dep_ == 'nsubj':
        print(token.text, token.tag_, token.head.text, token.dep_)
        # token.subtree yields the token itself plus all of its syntactic descendants
        subtree = token.subtree
        print([t.text for t in subtree])
        print('*' * 50)
The output we get is:
clock NN is nsubj
['The', 'alarm', 'clock']
purpose NN is nsubj
['whose', 'purpose']
who WP are nsubj
['who']
However, for each nsubj I expect the output to be the whole subtree, i.e.
purpose NN is nsubj
['whose', 'purpose', 'is', 'to', 'torture']
who WP are nsubj
['who', 'are', 'sleep-deprived']
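
To make the gap between the actual and expected output concrete, here is a minimal pure-Python sketch of how a subtree is collected. The `TOKENS` table is a hand-annotated toy parse of the relative clause (an assumption, not real spaCy output), and `subtree` and `words` are hypothetical helpers that only mimic the idea behind spaCy's `Token.subtree`: a token plus all of its descendants, in sentence order. It suggests that the expected phrases come from the subtree of the subject's head (`token.head.subtree` in spaCy) rather than from the subject's own subtree.

```python
# Hand-annotated toy parse (an assumption, not real spaCy output).
# Each entry: (text, index of head, dependency label).
# The root of this clause points to itself.
TOKENS = [
    ("whose", 1, "poss"),
    ("purpose", 2, "nsubj"),
    ("is", 2, "ROOT"),
    ("to", 4, "aux"),
    ("torture", 2, "xcomp"),
    ("all", 4, "dobj"),
    ("who", 7, "nsubj"),
    ("are", 5, "relcl"),
    ("sleep-deprived", 7, "acomp"),
]


def subtree(index):
    """Return indices of a token and all its descendants, in sentence order."""
    result = [index]
    for child, (_, head, _) in enumerate(TOKENS):
        if head == index and child != index:
            result.extend(subtree(child))
    return sorted(result)


def words(indices):
    """Map token indices back to their text."""
    return [TOKENS[i][0] for i in indices]


for i, (text, head, dep) in enumerate(TOKENS):
    if dep == "nsubj":
        # The subject's own subtree only covers the subject phrase.
        print(text, "own subtree: ", words(subtree(i)))
        # The head's subtree covers the whole clause the subject belongs to.
        print(text, "head subtree:", words(subtree(head)))
```

Under this toy parse, the head subtree of 'who' is exactly the expected ['who', 'are', 'sleep-deprived']. The head subtree of 'purpose' is the full clause, including the nested 'all who are sleep-deprived', so reproducing the shorter expected list would need an extra trimming rule on top of this.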