scala - ANTLR left recursion

Question

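I'm trying to convert the postfix, infix and prefix rules of Scala, given in EBNF form, to ANTLR, but I'm seeing an error related to left recursion on the infix expression rule.

The rules in question are: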

public symbolOrID
:   ID
|   Symbol
;

public postfixExpression
:   infixExpression symbolOrID? -> ^(R__PostfixExpression infixExpression symbolOrID?)
;

public infixExpression
:   prefixExpression
|   infixExpression (symbolOrID infixExpression)? -> ^(R__InfixExpression infixExpression symbolOrID? infixExpression?)
;

public prefixExpression
:   prefixCharacter? simpleExpression -> ^(R__PrefixExpression prefixCharacter? simpleExpression)
;

public prefixCharacter
:   '-' | '+' | '~' | '!' | '#'
;

public simpleExpression
:   constant
;

If I change the infixExpression rule to:

public infixExpression
:   prefixExpression (symbolOrID infixExpression)? -> ^(R__InfixExpression prefixExpression symbolOrID? infixExpression?)
;

then it complains instead:

warning(200): Hydra.g3:108:26: Decision can match input such as "{ID, Symbol} {'!'..'#', '+', '-', '~'} String" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): Hydra.g3:108:26: Decision can match input such as "{ID, Symbol} {'!'..'#', '+', '-', '~'} Number" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): Hydra.g3:108:26: Decision can match input such as "{ID, Symbol} {'!'..'#', '+', '-', '~'} Boolean" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): Hydra.g3:108:26: Decision can match input such as "{ID, Symbol} {'!'..'#', '+', '-', '~'} Regex" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): Hydra.g3:108:26: Decision can match input such as "{ID, Symbol} {'!'..'#', '+', '-', '~'} Null" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input

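Finally, is there a way to create nodes in the AST conditionally, so that if only the left-hand part of a rule matches, that level is not added? For example: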

conditional_or_expression:
    conditional_and_expression  ('||' conditional_or_expression)?
;

Say I create a grammar that follows a hierarchy like this:

conditional_and_expression
  conditional_or_expression
    null_coalescing_expression

If the parsed expression is a || b, the AST currently created for that expression is

conditional_and_expression
  conditional_or_expression

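How can I make it produce only the conditional_or_expression part?

In JavaCC you can simply set the node arity, for example: #ConditionalOrExpression(>1)

EDIT: It was a bit late last night; I have now corrected the infix expression!

FINAL EDIT: The way I eventually got it working is with the following rules: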
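To make the intent of the JavaCC `(>1)` condition concrete, here is a minimal Python sketch of the same idea: a node is created only when it would have more than one child, otherwise the single child is passed up unchanged. The helper `make_node`, the node names and the flat token list are all hypothetical, chosen only for illustration; they are not part of JavaCC or ANTLR.

```python
def make_node(name, children):
    """Wrap children in a named node only when there is more than one child."""
    if len(children) > 1:
        return (name, *children)
    return children[0]  # a single child is passed up without an extra level


def parse_and(tokens, pos):
    # and-level: operand ('&&' operand)*, operands are plain tokens here
    operands = [tokens[pos]]
    pos += 1
    while pos < len(tokens) and tokens[pos] == "&&":
        operands.append(tokens[pos + 1])
        pos += 2
    return make_node("AndExpression", operands), pos


def parse_or(tokens, pos=0):
    # or-level: and-level ('||' and-level)*
    first, pos = parse_and(tokens, pos)
    operands = [first]
    while pos < len(tokens) and tokens[pos] == "||":
        operand, pos = parse_and(tokens, pos + 1)
        operands.append(operand)
    return make_node("OrExpression", operands), pos
```

With this sketch, a lone operand such as `["a"]` produces no wrapper node at all, while `["a", "||", "b"]` produces a single `OrExpression` node with no `AndExpression` level beneath it.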

public symbolOrID
:   ID
|   Symbol
;

public postfixExpression
:   infixExpression (symbolOrID^)?
;

public infixExpression
:   (prefixExpression symbolOrID)=> prefixExpression symbolOrID^ infixExpression
|   prefixExpression
;

public prefixExpression
:   prefixCharacter^ simpleExpression
|   simpleExpression
;

public prefixCharacter
:   '-' | '+' | '~' | '!' | '#'
;

public simpleExpression
:   constant
;

score 1 · Accepted Answer

黑暗泽拉斯 wrote:

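I'm trying to convert the postfix, infix and prefix rules of Scala, given in EBNF form, to ANTLR, but I'm seeing an error related to left recursion

As I said in the comments: there is no left recursion in the rules you posted.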

黑暗泽拉斯 wrote:

How can I make it produce only the conditional_or_expression part?

I assume you are using ANTLRWorks' interpreter or debugger, in which case the tree:

conditional_and_expression
            \
  conditional_or_expression

is merely displayed that way (what is shown is the parse tree, not the AST). If you convert your orExpression to an AST properly, the expression a || b becomes:

  ||
 /  \
a    b

(i.e. || as the root node, with a and b as its children)

For example, take the following grammar:

grammar T;

options {
  output=AST;
}

parse
  :  expr EOF -> expr
  ;

expr
  :  or_expr
  ;

or_expr
  :  and_expr ('||'^ and_expr)*
  ;

and_expr
  :  add_expr ('&&'^ add_expr)*
  ;

add_expr
  :  atom (('+' | '-')^ atom)*
  ;

atom
  :  NUMBER
  |  '(' expr ')' -> expr
  ;

NUMBER : '0'..'9'+;

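If you now parse 12+34 with a parser generated from the grammar above, ANTLRWorks (or the Eclipse ANTLR IDE) will display the following parse tree:

[image: parse tree for 12+34]

But that is not the AST the parser creates. The AST actually looks like this:

[image: AST for 12+34]

(i.e. the or_expr and and_expr "layers" are not there)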
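To see why those layers vanish, the grammar T above can be mimicked with a small hand-written parser. The Python sketch below is not the code ANTLR generates; it is a minimal approximation in which each `(op^ operand)*` loop makes the operator the root of a tuple. The names `tokenize` and `binary_level` are made up for this illustration, and characters the tokenizer does not recognise are silently skipped.

```python
import re

TOKEN = re.compile(r"\d+|\|\||&&|[+\-()]")


def tokenize(text):
    # NUMBER, '||', '&&', '+', '-', '(' and ')'
    return TOKEN.findall(text)


def binary_level(operand, operators):
    """Build a rule of the shape: operand (op^ operand)*"""
    def rule(tokens, pos):
        tree, pos = operand(tokens, pos)
        while pos < len(tokens) and tokens[pos] in operators:
            op = tokens[pos]
            right, pos = operand(tokens, pos + 1)
            tree = (op, tree, right)  # the operator becomes the new root
        return tree, pos
    return rule


def atom(tokens, pos):
    # atom: NUMBER | '(' expr ')' -> expr
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == "(":
        tree, pos = or_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return tree, pos + 1  # the parentheses are dropped from the tree
    if token.isdigit():
        return int(token), pos + 1
    raise SyntaxError(f"unexpected token {token!r}")


add_expr = binary_level(atom, {"+", "-"})
and_expr = binary_level(add_expr, {"&&"})
or_expr = binary_level(and_expr, {"||"})


def parse(text):
    tokens = tokenize(text)
    tree, pos = or_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

A rule whose loop never runs simply returns its operand's tree, so parsing `12+34` passes through `or_expr` and `and_expr` without either of them adding a node.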

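黑暗泽拉斯 wrote:

Unfortunately, this is a rather critical but early stage for the language, so I have to keep the full details of the grammar secret.

No problem, but you must realise that people cannot answer your question properly if you withhold important information. You don't need to post the entire grammar, but if you want help with left recursion, you must post the (part of the) grammar that actually causes the error you mention. If I can't reproduce it, it doesn't exist! :)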

score 0 · Accepted Answer

This production:

infixExpr ::= PrefixExpr
            | InfixExpr id [nl] InfixExpr

can be rewritten as

infixExpr ::= PrefixExpr
            | PrefixExpr id [nl] InfixExpr

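In fact, I bet it is simply a mistake in the grammar. Let's show with an example that the rewrite is fine: we (partially) reduce something with the first grammar, then try the same with the second.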

InfixExpr id [nl] InfixExpr                      
// Apply the second reduction to the first InfixExpr
InfixExpr id [nl] InfixExpr id [nl] InfixExpr
// Apply the first reduction to the (new) first InfixExpr
PrefixExpr id [nl] InfixExpr id [nl] InfixExpr
// Apply the first reduction to the new first InfixExpr
PrefixExpr id [nl] PrefixExpr id [nl] InfixExpr
// Apply the first reduction to the new first InfixExpr
PrefixExpr id [nl] PrefixExpr id [nl] PrefixExpr

Now let's reduce it with the second grammar:

PrefixExpr id [nl] InfixExpr                      
// Apply the second reduction to the first InfixExpr
PrefixExpr id [nl] PrefixExpr id [nl] InfixExpr
// Apply the first reduction to the new first InfixExpr
PrefixExpr id [nl] PrefixExpr id [nl] PrefixExpr

As you can see, you end up with an equivalent AST in both cases.
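The hand derivation above can also be checked mechanically for small inputs. The Python sketch below enumerates every symbol sequence each production can derive with a bounded number of uses of the recursive alternative, then compares the two sets. It ignores the optional [nl], and the function names are invented for this illustration. Note that it compares only the derived sequences, not the shapes of the trees.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def left_recursive(max_ops):
    """InfixExpr ::= PrefixExpr | InfixExpr id InfixExpr, using the second
    alternative at most max_ops times."""
    sentences = {("PrefixExpr",)}
    for k in range(max_ops):
        for left in left_recursive(k):
            for right in left_recursive(max_ops - 1 - k):
                sentences.add(left + ("id",) + right)
    return frozenset(sentences)


@lru_cache(maxsize=None)
def right_recursive(max_ops):
    """InfixExpr ::= PrefixExpr | PrefixExpr id InfixExpr, using the second
    alternative at most max_ops times."""
    sentences = {("PrefixExpr",)}
    if max_ops > 0:
        for rest in right_recursive(max_ops - 1):
            sentences.add(("PrefixExpr", "id") + rest)
    return frozenset(sentences)


def same_language(limit):
    # True if both productions derive the same sequences for every bound below limit
    return all(left_recursive(n) == right_recursive(n) for n in range(limit))
```

For the small bounds tried here, `same_language` reports that both productions derive exactly the sequences of the form PrefixExpr (id PrefixExpr)*, which is consistent with the rewrite.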
