java - Why does an empty space appear in a found token (ANTLR)?

Question

I'm having trouble with some ANTLR lexing. I've added a token with the following rule:

NUMBER   : [0-9]+.?[0-9]*;
WHITESPACE : [ \t\r\n]+ -> skip ;

I test my program using JUnit with the following code:

@Test
public void testWhiteSpaces() {
    verifyLexer("   \n7 \t", new String[] {"7"});
}

public void verifyLexer(String input, String[] expectedTokens) {
    CharStream stream = new ANTLRInputStream(input);
    ExpressionLexer lexer = new ExpressionLexer(stream);
    lexer.reportErrorsAsExceptions();
    List<? extends Token> actualTokens = lexer.getAllTokens();

    assertEquals(expectedTokens.length, actualTokens.size());

    for(int i = 0; i < actualTokens.size(); i++) {
         String actualToken = actualTokens.get(i).getText();
         String expectedToken = expectedTokens[i];
         System.out.println(actualToken + "?");
         assertEquals(expectedToken, actualToken);
    }
}

The JUnit test fails and says that the token it found was "7 " instead of the "7" I was aiming for. How come? There are no spaces involved in my regular expression for the NUMBER token...

score 2 · Accepted Answer

I think you forgot to escape the dot in your regular expression:

[0-9]+\.?[0-9]*

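The dot is a special character that matches any character, which in your case is the whitespace after the 7. Note that in an ANTLR 4 grammar a literal dot is normally written as a quoted literal rather than with a backslash, so the rule would likely need to be NUMBER : [0-9]+ '.'? [0-9]* ;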
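
The same effect can be reproduced outside ANTLR. The following is only a sketch using java.util.regex as a stand-in for the lexer, not ANTLR itself; the class name DotDemo and the helper firstMatch are made up for illustration. One difference to keep in mind: the Java regex dot does not match line terminators by default, while the ANTLR wildcard matches any single character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotDemo {
    // Hypothetical helper: returns the first substring of input matched by regex, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "   \n7 \t";

        // Unescaped dot: ".?" greedily consumes the space that follows the digit.
        System.out.println("[" + firstMatch("[0-9]+.?[0-9]*", input) + "]");   // prints [7 ]

        // Escaped dot: only a literal '.' may follow the digits, so the space is left alone.
        System.out.println("[" + firstMatch("[0-9]+\\.?[0-9]*", input) + "]"); // prints [7]
    }
}
```

The brackets in the output make the trailing space visible, which is the same thing the "?" suffix in the question's println was doing.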
