python - How do I use this information in Python? I don't know how to work with this data type

Question

Following the NLTK book, I first define a grammar and then parse the sentence with it.

import nltk

# `sentence` is a list of (word, POS tag) tuples, i.e. already tagged
grammar = r"""
            NP: {<DT|PP\$>?<JJ>*<NN>}
                {<NNP>+}
                """
cp = nltk.RegexpParser(grammar)
chunked_sent = cp.parse(sentence)

When I print chunked_sent, I get this:

(S
  i/PRP
  use/VBP
  to/TO
  work/VB
  with/IN
  you/PRP
  at/IN
  (NP match/NN)
  ./.)

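I don't want to just look at it. I want to actually pull out the "NP" noun phrases.

How can I print out "match", which is the noun phrase? I want to get every "NP" from chunked_sent.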

for k in chunked_sent:
    print(k)

(u'i', 'PRP')
(u'use', 'VBP')
(u'to', 'TO')
(u'work', 'VB')
(u'with', 'IN')
(u'you', 'PRP')
(u'at', 'IN')
(NP match/NN)
(u'.', '.')


for k in chunked_sent:
    print(k[0])

i
use
to
work
with
you
at
(u'match', 'NN')

See, for some reason I lose the "NP" label.
Also, how can I tell whether k[0] is a string or a tuple (as in the example above)?

score 0 · Accepted Answer

Well, you have probably found the answer by now. I'm posting it for anyone who runs into this in the future.

for subtree in chunked_sent.subtrees():
    # NLTK 3 exposes the chunk label via label(); NLTK 2 used the .node attribute
    if subtree.label() == 'NP':
        print(subtree)
