python - Chunking with nltk

Question

How can I get all possible chunks from a sentence for a given pattern? Example:

NP:{<NN><NN>}

Tagged sentence:

[("money", "NN"), ("market", "NN"), ("fund", "NN")]

If I parse it, I get

(S (NP money/NN market/NN) fund/NN)

I would also like to get the other alternative:

(S money/NN (NP market/NN fund/NN))

score 6 · Accepted Answer

@matchkarov is right about the nbest_parse documentation. For a code example, see:

import nltk
# Define the CFG grammar (nltk.parse_cfg is the NLTK 2.0 API).
grammar = nltk.parse_cfg("""
S -> NP
S -> NN NP
S -> NP NN
NP -> NN NN
NN -> 'market'
NN -> 'money'
NN -> 'fund'
""")

# Make your string into a list of tokens.
sentence = "money market fund".split(" ")

# Load the grammar into the ChartParser.
cp = nltk.ChartParser(grammar)

# Print every parse the grammar licenses for the sentence tokens
# (nbest_parse is the NLTK 2.0 API).
for tree in cp.nbest_parse(sentence):
    print(tree)
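
On NLTK 3 and later, nltk.parse_cfg and nbest_parse are no longer available; the same grammar can be loaded with nltk.CFG.fromstring and the parses iterated with ChartParser.parse. A minimal sketch under that assumption (the helper name all_parses is made up for illustration and is not part of NLTK):

```python
import nltk

# Same grammar as in the answer, loaded through the NLTK 3 entry point.
grammar = nltk.CFG.fromstring("""
S -> NP
S -> NN NP
S -> NP NN
NP -> NN NN
NN -> 'market'
NN -> 'money'
NN -> 'fund'
""")


def all_parses(tokens):
    """Return every parse tree the grammar licenses for the tokens.

    all_parses is a hypothetical helper, not an NLTK function.
    """
    parser = nltk.ChartParser(grammar)
    # parse() returns an iterator over nltk.Tree objects.
    return list(parser.parse(tokens))


for tree in all_parses("money market fund".split()):
    print(tree)
```

For this sentence the grammar admits two trees, one grouping "money market" and one grouping "market fund". The order in which they come back should not be relied on.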

score 1 · Accepted Answer

I think your question is about getting the n most likely parses of a sentence. Am I right? If so, see the nbest_parse(sent, n=None) function in the 2.0 documentation.
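
If the goal is only to list every bracketing of an already tagged sentence, as in the question, a parser may not be needed at all. The sketch below does not use NLTK; it is a plain recursive enumeration swapped in for illustration, and the names chunkings and render are hypothetical. It assumes the pattern is a fixed sequence of tags such as NN NN, not a full chunk regular expression.

```python
def chunkings(tagged, pattern, label="NP"):
    """Yield every way to bracket non-overlapping runs whose tags equal pattern.

    tagged is a list of (word, tag) pairs; a chunk is yielded as (label, [pairs]).
    The result also includes the bracketing with no chunk at all.
    """
    if not tagged:
        yield []
        return
    head, rest = tagged[0], tagged[1:]
    # Option 1: leave the first token outside any chunk.
    for tail in chunkings(rest, pattern, label):
        yield [head] + tail
    # Option 2: open a chunk here if the next tags match the pattern.
    n = len(pattern)
    if [tag for _, tag in tagged[:n]] == list(pattern):
        chunk = (label, tagged[:n])
        for tail in chunkings(tagged[n:], pattern, label):
            yield [chunk] + tail


def render(chunking):
    """Format one bracketing in the (S ... (NP ...)) style used in the question."""
    parts = []
    for item in chunking:
        if isinstance(item[1], list):  # a (label, tokens) chunk
            chunk_label, tokens = item
            inner = " ".join("%s/%s" % token for token in tokens)
            parts.append("(%s %s)" % (chunk_label, inner))
        else:  # a plain (word, tag) token
            parts.append("%s/%s" % item)
    return "(S %s)" % " ".join(parts)


sentence = [("money", "NN"), ("market", "NN"), ("fund", "NN")]
for alternative in chunkings(sentence, ("NN", "NN")):
    print(render(alternative))
# (S money/NN market/NN fund/NN)
# (S money/NN (NP market/NN fund/NN))
# (S (NP money/NN market/NN) fund/NN)
```

The first line is the unchunked sentence; the other two are the alternatives asked about in the question.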
