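This feels like my 1000th question today :) I'm close to finishing my grammar, but I run into a problem when a prefix operator and an infix operator share the same symbol. I'm parsing a markup language called MathML...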
grammar MathMLOperators;
options
{
output = AST;
backtrack = true;
memoize = true;
}
tokens
{
DOCUMENT; // The root of the parsed document.
GROUP;
OP; // any operator
PREFIX_OP; // a prefix operator.
INFIX_OP; // an infix operator.
POSTFIX_OP; // a postfix operator.
NON_INFIX_OP; // a non-infix operator
}
// Start rule.
public document : math+ -> ^(DOCUMENT math+);
inFixTag : TAG_START_OPEN MO TAG_CLOSE ('-' | '+' | '=') TAG_END_OPEN MO TAG_CLOSE -> ^(INFIX_OP);
preFixTag : TAG_START_OPEN MO TAG_CLOSE ('+' | '-') TAG_END_OPEN MO TAG_CLOSE -> ^(PREFIX_OP);
// Use semantic predicate to only allow postfix expressions when at the end of an mrow.
postFixTag : TAG_START_OPEN MO TAG_CLOSE ('!' | '^') TAG_END_OPEN MO {input.LT(1).getType() == TAG_CLOSE && input.LT(2).getType() == TAG_END_OPEN && input.LT(3).getType() == MROW && input.LT(4).getType() == TAG_CLOSE}? TAG_CLOSE -> ^(POSTFIX_OP);
nonInfixTag : TAG_START_OPEN MO TAG_CLOSE ('!' | '^') TAG_END_OPEN MO TAG_CLOSE {$expressionList::count++;} -> ^(OP);
opTag: TAG_START_OPEN MO TAG_CLOSE ('-' | '+' | '^' |'=') TAG_END_OPEN MO TAG_CLOSE -> ^(NON_INFIX_OP);
//Expressions
infixExpression: grouping (inFixTag^ grouping)*;
grouping : nestedExpression+ -> ^(GROUP nestedExpression+);
prefixExpression : /* check that it's the first in the mrow*/ {$expressionList::count == 0}? (preFixTag^ (primaryExpression | nonInfixTag)) {$expressionList::count++;};
postfixExpression : (primaryExpression | prefixExpression| nonInfixTag) (postFixTag^)? ;
expressionList scope {int count} @init{$expressionList::count = 0;} : (infixExpression | opTag)+;
nestedExpression : postfixExpression;
primaryExpression : mrow | mn;
math : TAG_START_OPEN root=MATH TAG_CLOSE expressionList TAG_END_OPEN MATH TAG_CLOSE -> ^($root expressionList);
mrow : TAG_START_OPEN root=MROW TAG_CLOSE expressionList? TAG_END_OPEN MROW TAG_CLOSE -> ^($root expressionList?);
mn: TAG_START_OPEN root=MN TAG_CLOSE INT TAG_END_OPEN MN TAG_CLOSE -> ^($root INT);
MATH : 'math'; // root tag
MROW : 'mrow'; // row
MO : 'mo'; // operator
MN : 'mn'; // number
TAG_START_OPEN : '<';
TAG_END_OPEN : '</' ;
TAG_CLOSE : '>';
TAG_EMPTY_CLOSE : '/>';
INT : '0'..'9'+;
WS : (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;};
This input parses fine...
<math>
<mrow>
<mo>-</mo>
<mn>7</mn>
<mo>=</mo>
<mn>8</mn>
</mrow>
</math>
But this one fails...
<math>
<mrow>
<mo>-</mo>
<mn>7</mn>
<mo>-</mo>
<mn>8</mn>
</mrow>
</math>
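The first "-" should be a prefix operator and the second should be an infix operator. In the debugger, it looks like the rule grouping keeps looping and never returns to its parent rule infixExpression, even when it can no longer match.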
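To make the result I'm after concrete, here is a minimal Python sketch (not ANTLR) of how I'd expect each operator in a flattened mrow to be classified. The function name classify_operators, the (kind, text) pair representation, and the "OPERAND" label are made up for illustration. The symbol sets are copied from the tag rules in the grammar above. As a simplification, it treats a postfix symbol as postfix whenever it follows an operand, rather than only at the end of an mrow as the predicate in postFixTag does.

```python
# Symbol sets taken from preFixTag, inFixTag and postFixTag in the grammar.
PREFIX_SYMBOLS = {"+", "-"}
INFIX_SYMBOLS = {"-", "+", "="}
POSTFIX_SYMBOLS = {"!", "^"}


def classify_operators(row):
    """Label each child of one mrow.

    row is a list of (kind, text) pairs, where kind is "mo" for an
    operator tag and anything else (e.g. "mn") for an operand.
    """
    labelled = []
    prev_is_operand = False  # True right after an operand or a postfix operator
    for kind, text in row:
        if kind != "mo":
            labelled.append(("OPERAND", text))
            prev_is_operand = True
        elif prev_is_operand and text in POSTFIX_SYMBOLS:
            # Simplification: postfix whenever it directly follows an operand.
            labelled.append(("POSTFIX_OP", text))
        elif not prev_is_operand and text in PREFIX_SYMBOLS:
            # Nothing to its left yet, so a shared symbol must be a prefix.
            labelled.append(("PREFIX_OP", text))
        elif prev_is_operand and text in INFIX_SYMBOLS:
            # An operand sits to its left, so the same symbol is an infix.
            labelled.append(("INFIX_OP", text))
            prev_is_operand = False
        else:
            labelled.append(("OP", text))
            prev_is_operand = False
    return labelled


if __name__ == "__main__":
    failing_row = [("mo", "-"), ("mn", "7"), ("mo", "-"), ("mn", "8")]
    for label, text in classify_operators(failing_row):
        print(label, text)
```

For the failing input, that gives PREFIX_OP for the first "-" and INFIX_OP for the second, which is the tree shape I want the grammar to produce.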
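I'm sure I have a wrong EBNF operator somewhere, but I can't tell which one. I tried to follow the standard expression-nesting pattern used for languages like C, but this is an unusual language to parse.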