antlr - ANTLR grammar token problem (ANTLRWorks)

Question

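I'm an ANTLR hobbyist writing an interpreter for a simple processor, and I've hit a small problem: a VALUE token is throwing an error. I'm a student, so I'm not asking you to do my homework for me... I've already finished it (including all the class files for the interpreter), but this problem is beating me, even though it's probably simple and staring me in the face.

ANTLRWorks keeps giving me this console error message:

"error(208): newExpr.g:193:1: The following token definitions can never be matched because prior tokens match the same input: VALUE"

Clearly there is something wrong with the regular expression for VALUE, but I can't see what it is, either there or anywhere else in the grammar. If you could point out what I'm missing, it would be much appreciated... Googling hasn't really helped me find the error in my own grammar.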

grammar newExpr;

options 
{
    language=Java;
}

@header 
{
    import java.util.*;
}

@members 
{
    ArrayList myInitialise = new ArrayList();
    ArrayList InstructionList = new ArrayList();
}

/*--------------------------------------------------------------------------------------------------------------------------------*
 * PARSER RULES                                                                                                                   *
 *--------------------------------------------------------------------------------------------------------------------------------*/

/*
* prog is where the interpretation begins and consists of one or more (+) 'stat' rules
*/
prog        :       stat+;

/*
* stat rules are the general parse rules of entire operations on the processor.
* They consist of smaller data operations rules (dataop) or memory operations (memop).
*/                
stat        :       BASIC r1=REG c1=COMMA r2=REG c2=COMMA dataop NEWLINE
            {
                int reg1 = Integer.parseInt($r1.text.substring(1));  // strip the leading 'R' and convert the register number to an integer
                int reg2 = Integer.parseInt($r2.text.substring(1)); 
                int IMDT = $dataop.value;    // take the immediate integer

                // LOAD operation
                if($BASIC.text.equals("LD"))
                InstructionList.add(new ld(reg1, reg2, IMDT));

                // STORE operation  
                else if($BASIC.text.equals("ST"))
                InstructionList.add(new st(reg1, reg2, IMDT));

                // SUBTRACTION operation    
                else if($BASIC.text.equals("SUB"))
                InstructionList.add(new sub(reg1, reg2, IMDT));

                // ADDITION operation   
                else if($BASIC.text.equals("ADD"))
                InstructionList.add(new add(reg1, reg2, IMDT));

                // MULTIPLICATION operation 
                else if($BASIC.text.equals("MUL"))
                InstructionList.add(new mul(reg1, reg2, IMDT));

                // DIVISION operation   
                else if($BASIC.text.equals("DIV"))
                InstructionList.add(new div(reg1, reg2, IMDT));
            }

            | 

            i1 = INDEX '=' memop NEWLINE
            {
                myInitialise.add(new memInit(Integer.parseInt($i1.text), $memop.value));
            }

            |

            BRANCH REG COMMA dataop NEWLINE
            {
                int R = Integer.parseInt($REG.text.substring(1));
                int val = $dataop.value;

                // BRANCH EQUAL operation
                if($BRANCH.text.equals("BEZ"))
                InstructionList.add(new branchEqualZero(R,val));

                // BRANCH NOT EQUAL operation
                else if($BRANCH.text.equals("BNEZ"))
                InstructionList.add(new branchNotEqualZero(R,val));
            }

            | 

            JUMP REG NEWLINE
            {
                int R = Integer.parseInt($REG.text.substring(1));
                InstructionList.add(new jump(R));
            }

            | 

            HALT 
            {
                InstructionList.add(new halt());
            }
            ;


dataop returns [int value] 

        :   INDEX
            {
                $value = Integer.parseInt($INDEX.text);
            }

            |   

            VALUE
            {
                $value = Integer.parseInt($VALUE.text.substring(1))*-1;
            };


memop returns [int value]

        :   INDEX
            {
                $value = Integer.parseInt($INDEX.text);
            }

            |

            VALUE
            {
                $value = Integer.parseInt($VALUE.text.substring(1))*-1;
            }

            |

            MEMVAL
            {
                if($MEMVAL.text.startsWith("-"))
                {
                    $value = Integer.parseInt($MEMVAL.text.substring(1))*-1;
                }
                else
                    $value = Integer.parseInt($MEMVAL.text);
            };


/*--------------------------------------------------------------------------------------------------------------------------------*
 * LEXER RULES                                                                                                                    *
 *--------------------------------------------------------------------------------------------------------------------------------*/

/*
* RegExps for BASIC instructions (load, store, add, subtract, multiply, divide)
*/
BASIC       :   ('L' 'D') | ('S' 'T') | ('A' 'D' 'D') | ('S' 'U' 'B') | ('M' 'U' 'L') | ('D' 'I' 'V');

/*
* The comma is simply for syntactic purposes, to separate data and register references
*/
COMMA       :   ',';

/*
* Regular Expressions for the processor registers R0-R31
*/
REG         :   ('R') (('0'..'9') | ('0'..'2') ('0'..'9') | ('3') ('0'..'1') );

/*
* 'Index' is the set of regular expressions matching memory locations
*/
INDEX       :       ('0'..'9')                    
            |
            ('0'..'9') ('0'..'9')
            |
            ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('0'..'9') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('0'..'5') ('0'..'9') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('6') ('0'..'4') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('6') ('5') ('0'..'4') ('0'..'9') ('0'..'9')
            |
            ('6') ('5') ('5') ('0'..'2') ('0'..'9')
            |
            ('6') ('5') ('5') ('3') ('0'..'5');

/*
* Reg Exps for memory initialisation instructions
*/
MEMVAL      :   ('0'..'9')+ | '-' ('0'..'9')+;            

/*
* Simple integers for data values
*/          
VALUE       :   '-' (('0'..'9')         // PROBLEM IS HERE
            |
            ('0'..'9') ('0'..'9')
            |
            ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('0'..'9') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('0'..'5') ('0'..'9') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('6') ('0'..'4') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('6') ('5') ('0'..'4') ('0'..'9') ('0'..'9')
            |
            ('6') ('5') ('5') ('0'..'2') ('0'..'9')
            |
            ('6') ('5') ('5') ('3') ('0'..'6'));

/*
* Regular Expressions for return/newline characters
*/ 
NEWLINE     :   '\r'? '\n' ;


/*
* This simply makes the interpreter tolerant to whitespace
*/
WHITESPACE      :   (' ' | '\t' | '\u000C')+ {skip();};

/*
* RegExp for Branch on Equal to Zero/Branch on Not Equal to Zero instructions
*/
BRANCH      :   ('B' 'E' 'Z') | ('B' 'N' 'E' 'Z');

/*
* RegExp for jump instruction
*/
JUMP        :   ('J' 'R');

/*
* The HALT instruction ends the program and executes all instructions
* in the Instruction List on the data/values that have been entered
*/
HALT        :   ('H' 'A' 'L' 'T');

score 2 · Accepted Answer

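An ANTLR-generated lexer works like this: it tries to match as much input as possible, and when two (or more) rules match the same number of characters, the rule defined first "wins". Your VALUE rule can therefore never win over the MEMVAL rule, because everything VALUE matches is also matched by MEMVAL's second alternative: '-' ('0'..'9')+.

That is why you see the error message.

It makes no difference whether one of your parser rules needs a VALUE token at some point: the lexer simply produces tokens according to the rule I described. The lexer does not take any information from the parser into account.

Simply remove the VALUE rule and use MEMVAL in its place (perhaps renaming MEMVAL to INT). Then match MEMVAL (or INT) in your parser rules and check that the value lies within the required numeric range.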
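To make the tie-break concrete, here is a small Java sketch that models the selection rule described above (longest match first, earliest-defined rule on a tie). It is only an illustration, not ANTLR's actual implementation: the class name `LexerTieBreakDemo`, the method `winningRule`, and the regular expressions standing in for MEMVAL and VALUE are all assumptions, and the VALUE pattern ignores the 65536 upper limit for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of "longest match wins, first rule wins ties".
class LexerTieBreakDemo {
    // Insertion order mirrors the grammar: MEMVAL is defined before VALUE.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("MEMVAL", Pattern.compile("-?[0-9]+"));
        // Simplified stand-in for VALUE: '-' followed by 1 to 5 digits.
        RULES.put("VALUE", Pattern.compile("-[0-9]{1,5}"));
    }

    // Returns the name of the rule that would produce the next token,
    // or null if no rule matches at the start of the input.
    static String winningRule(String input) {
        String best = null;
        int bestLength = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // Strict '>' means a later rule must match MORE characters to win,
            // so on a tie the earlier rule keeps the token.
            if (m.lookingAt() && m.end() > bestLength) {
                best = rule.getKey();
                bestLength = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(winningRule("-42"));
        System.out.println(winningRule("-65536"));
    }
}
```

Because every string the VALUE pattern accepts is accepted by the MEMVAL pattern with the same length, VALUE can never be the winner in this model, which is the situation error 208 reports.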
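The range check could live in a small Java helper that the parser actions call. The sketch below is one possible shape, under assumptions: the class `OperandRange` and method `parseInRange` are hypothetical names, and the bounds used in `main` (0 to 65535 for memory indexes, -65536 to -1 for negative values) are read off the INDEX and VALUE rules in the question.

```java
// Hypothetical helper for validating the text of a single integer token.
class OperandRange {
    // Parses the token text (an optional '-' followed by digits) and checks
    // that the result lies in [min, max]. Text too large for an int makes
    // Integer.parseInt throw NumberFormatException, which is itself an
    // IllegalArgumentException.
    static int parseInRange(String text, int min, int max) {
        int value = Integer.parseInt(text);
        if (value < min || value > max) {
            throw new IllegalArgumentException(
                text + " is outside the range " + min + ".." + max);
        }
        return value;
    }

    public static void main(String[] args) {
        // Assumed bounds, taken from the INDEX and VALUE rules in the question.
        System.out.println(parseInRange("1024", 0, 65535));
        System.out.println(parseInRange("-42", -65536, -1));
    }
}
```

Inside a parser rule action this might be used as `$value = OperandRange.parseInRange($INT.text, 0, 65535);`, assuming the merged token is named INT. Keeping the limits in the parser rather than the lexer also means an out-of-range number can be reported with a clear message rather than being split into unexpected tokens.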
