java - How to print part of a dependency graph

Question

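I want to print a subtree of a dependency graph. Specifically, for the sentence "I turn the red meat" and the starting word meat-NN, the output should be "the red meat".

Right now I am doing it like this: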

protected String printSubGraph(IndexedWord startingWord, SemanticGraph graph) {
    Iterable<SemanticGraphEdge> outiter = graph.outgoingEdgeIterable(startingWord);

    // set the default bounds to the startingWord 
    int start = startingWord.beginPosition();
    int end = startingWord.endPosition();

    // search the next level for larger bounds
    // assume that everything in between the bounds belongs to the sub-graph of the startingWord
    for (SemanticGraphEdge edge : outiter) {
        start = Math.min(start, edge.getGovernor().beginPosition());
        start = Math.min(start, edge.getDependent().beginPosition());
        end = Math.max(end, edge.getGovernor().endPosition());
        end = Math.max(end, edge.getDependent().endPosition());
    }

    return graph.toRecoveredSentenceString().substring(start, end);
}

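This is bad for three reasons:

I assume that everything between the bounds belongs to the subtree of the starting word.
I do not search the whole subtree for larger bounds, only the next level.
I assume that the graph covers the entire text and that the bounds are valid for the recovered sentence string. (This is not true if the original text contains more than one sentence.)

Is there a way to get this subtree (and only this subtree) from a SemanticGraph or CoreMap without implementing a DFS myself? I know how to go the other way around, but I don't know of any way to find an IndexedWord in the tree.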

score 1 · Accepted Answer

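Perhaps what you are looking for is not a dependency parse but a phrase structure parse.

Your sentence is:

I turn the red meat.

Its phrase structure parse is: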

(ROOT (S (NP (PRP I)) (VP (VBP turn) (NP (DT the) (JJ red) (NN meat))) (. .)))

You can write a TregexPattern of the form:

NP < (NN < meat)

to get the desired subtree, or simply

NP

to get all noun phrases.
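
If you would rather stay with the dependency graph, the subtree is the starting word plus all of its transitive dependents, printed in token order. That avoids character offsets entirely, so it sidesteps the three problems listed in the question. The following is only a minimal sketch in plain Java: the class SubtreeSketch, the method subtreeText, and the map-based graph are hypothetical stand-ins for illustration, not CoreNLP API. SemanticGraph may already provide this traversal (a descendants(IndexedWord) method, with IndexedWord.index() for ordering), but treat that as an assumption to verify against the CoreNLP version you use.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeSet;

public class SubtreeSketch {

    // Returns the words of the subtree rooted at `start`, in sentence order.
    // `dependents` maps a governor's token index to the indices of its direct
    // dependents (a hypothetical stand-in for the edges of a SemanticGraph).
    static String subtreeText(int start, Map<Integer, List<Integer>> dependents, String[] tokens) {
        TreeSet<Integer> seen = new TreeSet<>();   // sorted, so iteration follows token order
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            int node = stack.pop();
            if (!seen.add(node)) {
                continue;                          // already visited; guards against cycles
            }
            for (int dependent : dependents.getOrDefault(node, List.of())) {
                stack.push(dependent);
            }
        }
        StringJoiner joined = new StringJoiner(" ");
        for (int index : seen) {
            joined.add(tokens[index]);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        String[] tokens = {"I", "turn", "the", "red", "meat"};
        // Edges: turn -> I, turn -> meat, meat -> the, meat -> red
        Map<Integer, List<Integer>> dependents = Map.of(
                1, List.of(0, 4),
                4, List.of(2, 3));
        System.out.println(subtreeText(4, dependents, tokens)); // prints "the red meat"
    }
}
```

Because the words are collected by token index rather than by character position, the result does not depend on the recovered sentence string, and tokens that merely sit between the bounds are left out unless they really are dependents of the starting word.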
