当 GLR 解析器以两种或多种方式将某些文本减少到相同的非终结符时,它会合并解析子树。Rekers 为此使用“符号节点”。
我这不是每个非终端都可能导致合并。提前知道哪些非终结符永远不会合并将大大简化解析树的构建。
例如,在 Elkhound 技术报告中,作者为 GLR 解析器实现了 C++ 语法。他是这样描述的:
该语法目前有 37 个移位/归约冲突、47 个归约/归约冲突和8 个模棱两可的非终结符。
如何区分给定 CFG 的模棱两可和明确的非终结符?我在哪里可以读到这个?
当 GLR 解析器以两种或多种方式将某些文本减少到相同的非终结符时,它会合并解析子树。Rekers 为此使用“符号节点”。
我这不是每个非终端都可能导致合并。提前知道哪些非终结符永远不会合并将大大简化解析树的构建。
例如,在 Elkhound 技术报告中,作者为 GLR 解析器实现了 C++ 语法。他是这样描述的:
该语法目前有 37 个移位/归约冲突、47 个归约/归约冲突和8 个模棱两可的非终结符。
如何区分给定 CFG 的模棱两可和明确的非终结符?我在哪里可以读到这个?
How does knowing which non-terminals can produce an ambiguity simplify parse tree construction in a practical way?
If you have NO such nonterminals in a grammar, then you can leave out the symbol node construction and subtree sharing machinery, true. It isn't that much code, so this win is pretty minor.
But if any nonterminal can be ambiguous, then you need the machinery. Sharing that machinery across all ambiguities is pretty easy (I've build GLR parsers). So what exactly do you gain?
Lastly, you can't in general (theorem) determine if a grammar is ambiguous. Since a nonterminal represents a sub-grammar, you can't in general determine if any specific nonterminal is ambiguous. You can do this for lots of special cases. (My GLR parser generator in fact will find some ambiguities automatically basically by enumerating possible expansions of nonterminals).