A stripped-down version of the grammar with the conflict:

body: variable_list function_list;
variable_list:
  variable_list variable | /* empty */
;
variable:
  TYPE identifiers ';'
;
identifiers:
  identifiers ',' IDENTIFIER | IDENTIFIER
;
function_list:
  function_list function | /* empty */
;
function:
  TYPE IDENTIFIER '(' argument_list ')' function_body
;

The problem is that variables and functions both start with TYPE and IDENTIFIER, e.g.

int some_var;
int foo() { return 0; }

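In this language, variables are always declared before functions, but when I try to parse this, it always gives

Parse error: syntax error, unexpected '(', expecting ',' or ';' [after foo]

How can I make variable_list less greedy, or have the parser realize that if the next token is a '(' instead of a ';' or ',', it is obviously a function and not a variable declaration?

The bison debug output for the conflict is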

state 17

3 body: variable_list . function_list
27 variable_list: variable_list . variable

T_INT    shift, and go to state 27
T_BOOL   shift, and go to state 28
T_STR    shift, and go to state 29
T_VOID   shift, and go to state 30
T_TUPLE  shift, and go to state 31

T_INT     [reduce using rule 39 (function_list)]
T_BOOL    [reduce using rule 39 (function_list)]
T_STR     [reduce using rule 39 (function_list)]
T_VOID    [reduce using rule 39 (function_list)]
T_TUPLE   [reduce using rule 39 (function_list)]
$default  reduce using rule 39 (function_list)

variable       go to state 32
simpletype     go to state 33
type           go to state 34
function_list  go to state 35

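I have tried all sorts of %prec statements to make it prefer reduce (although I'm not sure what difference that would make in this case), with no success at getting bison to use reduce to resolve this. I have also tried shuffling the rules around, making new rules like non_empty_var_list and splitting body into function_list | non_empty_var_list function_list, and none of these attempts fixed the issue. I'm new to this and I've run out of ideas for how to solve it, so I'm completely baffled.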

1 Answer

The problem is that variables and functions both start with TYPE and IDENTIFIER

Not quite. The problem is that function_list is left-recursive and possibly empty.

When you reach the semicolon terminating a variable with TYPE in the lookahead, the parser can reduce the variable into a variable_list, as per the first variable_list production. Now the next thing might be a function_list, and function_list is allowed to be empty. So the parser could do an empty reduction to a function_list, which is what would be necessary to start parsing a function. It can't know not to do that until it sees the '(', which is the third token ahead. That is too far away to help: the parser bison generates by default makes each decision using only a single lookahead token.

Here's an easy solution:

function_list: function function_list
             | /* EMPTY */
             ;
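
For anyone who wants to check this fix with bison directly, here is a minimal, grammar-only sketch. The token declarations and the placeholder function syntax (an empty argument list and an empty '{' '}' body) are assumptions made for illustration; the asker's real grammar has several type tokens and a real argument_list and function_body.

```yacc
/* Sketch only: the %token line and the placeholder function
   syntax are illustrative assumptions, not the asker's grammar. */
%token TYPE IDENTIFIER

%%

body
  : variable_list function_list
  ;

variable_list
  : variable_list variable
  | /* empty */
  ;

variable
  : TYPE identifiers ';'
  ;

identifiers
  : identifiers ',' IDENTIFIER
  | IDENTIFIER
  ;

/* Right-recursive: the empty alternative can only be reduced
   once no further function follows. */
function_list
  : function function_list
  | /* empty */
  ;

/* Placeholder: empty argument list and empty body. */
function
  : TYPE IDENTIFIER '(' ')' '{' '}'
  ;
```

Running bison -v on a file like this should report no conflicts, because in this sketch the empty function_list is only reduced at end of input, so on TYPE the parser simply shifts. One tradeoff: a right-recursive list keeps every function on the parser stack until the list ends, so stack use grows with the number of functions.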

Another solution is to make function_list non-optional:

body: variable_list function_list
    | variable_list
    ;

function_list: function_list function
             | function
             ;

If you do that, bison can shift the TYPE token without having to decide whether it's the start of a variable or function definition.
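
A minimal, grammar-only sketch of this non-optional variant, suitable for checking with bison -v. The token declarations and the placeholder function syntax (an empty argument list and an empty '{' '}' body) are assumptions made for illustration, not the asker's full grammar.

```yacc
/* Sketch only: the %token line and the placeholder function
   syntax are illustrative assumptions, not the asker's grammar. */
%token TYPE IDENTIFIER

%%

/* The "no functions" case is handled here rather than by an
   empty function_list. */
body
  : variable_list function_list
  | variable_list
  ;

variable_list
  : variable_list variable
  | /* empty */
  ;

variable
  : TYPE identifiers ';'
  ;

identifiers
  : identifiers ',' IDENTIFIER
  | IDENTIFIER
  ;

/* Left-recursive, but never empty. */
function_list
  : function_list function
  | function
  ;

/* Placeholder: empty argument list and empty body. */
function
  : TYPE IDENTIFIER '(' ')' '{' '}'
  ;
```

Running bison -v on a file like this should report no conflicts. After variable_list the parser shifts TYPE, then IDENTIFIER, and only then does the next token ('(' versus ',' or ';') decide between a function and a variable. This version keeps left recursion, so the parser stack stays bounded however many functions follow.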

answered 2012-10-29T03:51:45.537