antlr4 - Check the previous/left token in the lexer

Question

How can I find the previous (left) token from within the lexer? For example:

lexer grammar TLexer;

ID     : [a-zA-Z] [a-zA-Z0-9]*;
CARET  : '^';
RTN    : {someCond1}? CARET ID; // the CARET should not be part of this token
GLB    : {someCond2}? CARET ID; // the CARET should not be part of this token

etc.

score 5 · Accepted Answer

Thanks, this is how I did it:

lexer grammar TLexer;

@lexer::members {
    // Type of the most recently emitted token (0 = none emitted yet).
    int lastTokenType = 0;

    @Override
    public void emit(Token token) {
        super.emit(token);
        lastTokenType = token.getType();
    }
}

CARET  : '^';
RTN    : {someCond1&&(lastTokenType==CARET)}? ID;
GLB    : {someCond2&&(lastTokenType==CARET)}? ID;
ID     : [a-zA-Z] [a-zA-Z0-9]*;
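
For anyone who wants to see the idea without generating a lexer, here is a minimal plain-Java sketch of the same technique: remember the type of the last emitted token and use it to classify the next identifier. It does not use the ANTLR runtime. The class `PrevTokenSketch`, its `Type` enum, and `tokenize` are hypothetical names made up for illustration, and the `someCond1`/`someCond2` conditions from the grammar are left out, so only the RTN case is shown.

```java
import java.util.ArrayList;
import java.util.List;

public class PrevTokenSketch {
    // Hypothetical token types, loosely mirroring the grammar above.
    enum Type { ID, CARET, RTN }

    // Plays the role of lastTokenType in the @lexer::members block.
    private Type lastTokenType = null;
    private final List<String> tokens = new ArrayList<>();

    // Plays the role of the overridden emit(Token): record the token,
    // then remember its type for the next decision.
    private void emit(Type type, String text) {
        tokens.add(type + ":" + text);
        lastTokenType = type;
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static List<String> tokenize(String input) {
        PrevTokenSketch lexer = new PrevTokenSketch();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '^') {
                lexer.emit(Type.CARET, "^");
                i++;
            } else if (isAsciiLetter(c)) {
                int start = i;
                while (i < input.length()
                        && (isAsciiLetter(input.charAt(i)) || isAsciiDigit(input.charAt(i)))) {
                    i++;
                }
                // An identifier directly after a CARET token becomes RTN,
                // otherwise it stays a plain ID.
                Type type = (lexer.lastTokenType == Type.CARET) ? Type.RTN : Type.ID;
                lexer.emit(type, input.substring(start, i));
            } else {
                // Any other character is skipped without being emitted,
                // so it does not change lastTokenType in this sketch.
                i++;
            }
        }
        return lexer.tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("xyz ^abc")); // prints [ID:xyz, CARET:^, RTN:abc]
    }
}
```

Note that in this sketch skipped characters never reach `emit`, so `^ abc` would also yield RTN for `abc`. If that matters for your language, the skipped case would need to reset the remembered type.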

score 0 · Accepted Answer

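I looked at the source code for Lexer. The Lexer responds to nextToken() calls (from the parser). I did not find anything in it that keeps track of previous tokens, and the CARET token cannot be accessed directly. Given this input: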

xyz ^abc

and this grammar:

lexer grammar TLexer;

ID     : [a-zA-Z] [a-zA-Z0-9]* {System.out.println("ID ");} ;
CARET  : '^'                   {System.out.println("CARET ");} ;
WS     : [ \r\n] ;
RTN    : CARET ID {System.out.println("RTN " + _tokenStartCharIndex);} ;

the output is:

$ antlr4 TLexer.g4 
$ javac TLexer.java 
$ grun TLexer tokens -tokens -diagnostics -trace input.txt 
ID 
RTN 4
[@0,0:2='xyz',<1>,1:0]
[@1,3:3=' ',<3>,1:3]
[@2,4:7='^abc',<4>,1:4]
[@3,8:8='\n',<3>,1:8]
[@4,9:8='<EOF>',<-1>,2:9]

The lexer gives you a single token of type <4> (RTN) for the input ^abc.
