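I'm trying to write a parser in JavaCC for a language that is ambiguous at the token level. In this particular case, the language uses "/" as the division operator, and it also supports regular expression literals, which are delimited by "/" as well.
Consider the following JavaCC grammar: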
TOKEN :
{
...
< VAR : "var" > |
< DIV : "/" > |
< EQUALS : "=" > |
< SEMICOLON : ";" > |
...
}
TOKEN :
{
< IDENTIFIER : <IDENTIFIER_START> (<IDENTIFIER_START> | <IDENTIFIER_CHAR>)* > |
< #IDENTIFIER_START : ( [ "$","_","A"-"Z","a"-"z" ] )> |
< #IDENTIFIER_CHAR : ( [ "$","_","A"-"Z","a"-"z","0"-"9" ] ) > |
< REGEX_LITERAL : ("/" <REGEX_BODY> "/" ( <REGEX_FLAGS> )? ) > |
< #REGEX_BODY : ( <REGEX_FIRST_CHAR> <REGEX_CHARS> ) > |
< #REGEX_CHARS : ( <REGEX_CHAR> )* > |
< #REGEX_FIRST_CHAR : ( ~["\r", "\n", "*", "/", "\\"] | <BACKSLASH_SEQUENCE> ) > |
< #REGEX_CHAR : ( ~[ "\r", "\n", "/", "\\" ] | <BACKSLASH_SEQUENCE> ) > |
< #BACKSLASH_SEQUENCE : ("\\" ~[ "\r", "\n"] ) > |
< #REGEX_FLAGS : ( <IDENTIFIER_CHAR> )* >
}
Given the following code:
var y = a/b/c;
two different token streams can be produced. The token stream could be either:
<VAR> <IDENTIFIER> <EQUALS> <IDENTIFIER> <DIV> <IDENTIFIER> <DIV> <IDENTIFIER> <SEMICOLON>
or
<VAR> <IDENTIFIER> <EQUALS> <IDENTIFIER> <REGEX_LITERAL> <SEMICOLON>
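To make the ambiguity concrete, here is a minimal Java sketch that imitates the token definitions above with java.util.regex rather than with JavaCC itself. It assumes that whitespace is skipped (the SKIP section is elided in the grammar above) and that the lexer takes the longest match at each position, with ties going to the token declared first, which is my understanding of how the generated TokenManager behaves. The class name SlashAmbiguityDemo, the tokenize method, and the regexEnabled flag are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SlashAmbiguityDemo {
    // Token names in the same order as in the grammar; order breaks ties.
    static final String[] NAMES = {
        "VAR", "DIV", "EQUALS", "SEMICOLON", "IDENTIFIER", "REGEX_LITERAL"
    };

    // Approximations of the JavaCC token definitions (assumption: not generated by JavaCC).
    static final Pattern[] PATTERNS = {
        Pattern.compile("var"),
        Pattern.compile("/"),
        Pattern.compile("="),
        Pattern.compile(";"),
        Pattern.compile("[$_A-Za-z][$_A-Za-z0-9]*"),
        // "/" REGEX_FIRST_CHAR (REGEX_CHAR)* "/" optional flags
        Pattern.compile("/(?:[^\\r\\n*/\\\\]|\\\\[^\\r\\n])"
                + "(?:[^\\r\\n/\\\\]|\\\\[^\\r\\n])*/[$_A-Za-z0-9]*")
    };

    // Returns the token kinds for the input, choosing the longest match at each position.
    // When regexEnabled is false, REGEX_LITERAL is left out of the candidates.
    static List<String> tokenize(String input, boolean regexEnabled) {
        List<String> kinds = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            // Assumption: whitespace is skipped, as a SKIP section would do.
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++;
                continue;
            }
            int bestLen = 0;
            String bestKind = null;
            for (int i = 0; i < PATTERNS.length; i++) {
                if (!regexEnabled && NAMES[i].equals("REGEX_LITERAL")) {
                    continue;
                }
                Matcher m = PATTERNS[i].matcher(input);
                m.region(pos, input.length());
                // Strictly longer wins, so the earlier declaration wins a tie.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestKind = NAMES[i];
                }
            }
            if (bestKind == null) {
                throw new IllegalArgumentException("No token matches at offset " + pos);
            }
            kinds.add(bestKind);
            pos += bestLen;
        }
        return kinds;
    }

    public static void main(String[] args) {
        String source = "var y = a/b/c;";
        System.out.println(tokenize(source, true));
        System.out.println(tokenize(source, false));
    }
}
```

With regexEnabled set to true, the sketch picks REGEX_LITERAL for "/b/c" because that match is longer than the single "/" of DIV; with it set to false, the same input comes out as a chain of divisions. Which of the two is right depends on the syntactic context, which the token definitions alone cannot see.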
How can I make sure the TokenManager produces the token stream I expect in this case?