
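I am trying to adapt the Rubberduck grammar listed here so that I can parse with it from Python. To do that, I changed the semantic predicates in the grammar: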

  • on lines 297 and 298, from {_input.Lt(1).Text.Equals("A")}? to {self.input.LT(1).getText().__eq__("A")}?

  • on line 571, from {_input.La(-1) == NEWLINE}? to {self.input.LA(-1) == NEWLINE}?
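For reference, the two substitutions listed above can be applied mechanically rather than by hand. The following is only a minimal sketch: the name `port_predicates` and the replacement table are illustrative, not part of ANTLR or Rubberduck, and the script simply reproduces the edits as described, without claiming that `self.input` is the right attribute for the Python target.

```python
# Sketch: apply the predicate substitutions described above to grammar text.
# `port_predicates` and `CSHARP_TO_PYTHON` are hypothetical names for illustration.

# (C# fragment, Python fragment) pairs, copied from the edits listed above.
CSHARP_TO_PYTHON = [
    ("_input.Lt(1).Text.Equals(", "self.input.LT(1).getText().__eq__("),
    ("_input.La(-1)", "self.input.LA(-1)"),
]


def port_predicates(grammar_text, replacements=CSHARP_TO_PYTHON):
    """Return grammar_text with every listed C# fragment replaced."""
    for old, new in replacements:
        grammar_text = grammar_text.replace(old, new)
    return grammar_text


if __name__ == "__main__":
    sample = '{_input.Lt(1).Text.Equals("A")}? {_input.La(-1) == NEWLINE}?'
    print(port_predicates(sample))
```

Because the first pattern stops at the opening parenthesis, it covers both predicates on lines 297 and 298 regardless of the string literal they compare against.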

I then compiled the grammar with the standard commands given in the official documentation: antlr4 -Dlanguage=Python2 VBALexer.g4 && antlr4 -Dlanguage=Python2 VBAParser.g4

Processing the grammar produces the following output:

warning(180): VBALexer.g4:300:15: chars '<INVALID>'..'/' used multiple times in set [[\]()\r\n\t.,'"|!@#$%^&*\-+:=; 0-9-/\\]
warning(180): VBALexer.g4:302:18: chars '\uFFFD' used multiple times in set [a-zA-Z_������]
warning(180): VBALexer.g4:302:18: chars '\uFFFD' used multiple times in set [a-zA-Z_������]
warning(180): VBALexer.g4:302:18: chars '\uFFFD' used multiple times in set [a-zA-Z_������]
warning(180): VBALexer.g4:302:18: chars '\uFFFD' used multiple times in set [a-zA-Z_������]
warning(180): VBALexer.g4:302:18: chars '\uFFFD' used multiple times in set [a-zA-Z_������]
warning(180): VBALexer.g4:304:25: chars '\uFFFD' used multiple times in set [a-zA-Z0-9_������]
warning(180): VBALexer.g4:304:25: chars '\uFFFD' used multiple times in set [a-zA-Z0-9_������]
warning(180): VBALexer.g4:304:25: chars '\uFFFD' used multiple times in set [a-zA-Z0-9_������]
warning(180): VBALexer.g4:304:25: chars '\uFFFD' used multiple times in set [a-zA-Z0-9_������]
warning(180): VBALexer.g4:304:25: chars '\uFFFD' used multiple times in set [a-zA-Z0-9_������]

warning(154): VBAParser.g4:626:0: rule lExpression contains an optional block with at least one alternative that can match an empty string
warning(154): VBAParser.g4:626:0: rule lExpression contains an optional block with at least one alternative that can match an empty string
warning(154): VBAParser.g4:646:0: rule argumentList contains an optional block with at least one alternative that can match an empty string
warning(154): VBAParser.g4:646:0: rule argumentList contains an optional block with at least one alternative that can match an empty string

I then tried to parse the following simple VBA code:

Public Sub Module()
    X = 0
    Y = 1
    Z = 2
    Test1 = X + Y * Z
    Test2 = X - Y * Z
    Test3 = Z + X * Y - Y * Z
    Test4 = Z + X * Y Mod Z * 2 + 5 * Z
    Test5 = Z + X ^ 3 * Y Mod Z * 2 + 5 * Z ^ X
    Print #1, Test2 ; " is not a Boolean value" 
End Sub

using the standard Python code:

import antlr4
from VBALexer import VBALexer
from VBAParser import VBAParser

# `data` is a string holding the VBA source shown above.
input = antlr4.InputStream(data)
lexer = VBALexer(input)
tokens = antlr4.CommonTokenStream(lexer)
parser = VBAParser(tokens)
tree = parser.startRule()

This gives me the following messages:

line 1:0 extraneous input 'Public' expecting {<EOF>, ':', <INVALID>, <INVALID>, <INVALID>, <INVALID>, ''', WS, LINE_CONTINUATION}
line 1:6 mismatched input ' ' expecting {<INVALID>, <INVALID>, <INVALID>}

If I print the tree, I get the following:

(startRule (module Public moduleAttributes   Sub   Module ( ) \n         X   =   0 \n         Y   =   1 \n         Z   =   2 \n         Test1   =   X   +   Y   *   Z \n         Test2   =   X   -   Y   *   Z \n         Test3   =   Z   +   X   *   Y   -   Y   *   Z \n         Test4   =   Z   +   X   *   Y   Mod   Z   *   2   +   5   *   Z \n         Test5   =   Z   +   X   ^   3   *   Y   Mod   Z   *   2   +   5   *   Z   ^   X \n         Print   # 1 ,   Test2   ;   " is not a Boolean value"   \n End Sub \n) <EOF>)
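One way to narrow this down is to look at what the lexer emits before the parser runs, since the error messages above list several `<INVALID>` token names. The helper below is a minimal sketch of such a token dump. It is duck-typed, so it only assumes each token exposes `type`, `text`, `line` and `column`; `format_tokens`, `FakeToken` and the sample name table are hypothetical and exist only so the sketch runs without the generated lexer.

```python
from collections import namedtuple

# Stand-in for a lexer token, so the sketch runs without the generated code.
FakeToken = namedtuple("FakeToken", "type text line column")


def format_tokens(tokens, symbolic_names):
    """Return one line per token: position, symbolic type name, and text."""
    lines = []
    for tok in tokens:
        # Types outside the name table (for example EOF, which is -1)
        # fall back to the raw number.
        if 0 <= tok.type < len(symbolic_names):
            name = symbolic_names[tok.type]
        else:
            name = str(tok.type)
        lines.append("%d:%d %s %r" % (tok.line, tok.column, name, tok.text))
    return lines


if __name__ == "__main__":
    names = ["<INVALID>", "PUBLIC", "WS"]  # hypothetical name table
    sample = [FakeToken(1, "Public", 1, 0), FakeToken(2, " ", 1, 6)]
    for line in format_tokens(sample, names):
        print(line)
```

With the generated code, the call would presumably look like `tokens.fill()` followed by `format_tokens(tokens.tokens, VBALexer.symbolicNames)`. This assumes that `CommonTokenStream` exposes `fill()` and a `tokens` list and that the generated lexer carries a `symbolicNames` table, which should be checked against the installed runtime version.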

However, if I instead adapt the grammar to Java by changing:

  • on lines 297 and 298, from {_input.Lt(1).Text.Equals("A")}? to {_input.LT(1).getText().equals("A")}?

  • on line 571, from {_input.La(-1) == NEWLINE}? to {_input.LA(-1) == NEWLINE}?

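and then run ANTLR on the grammar (getting exactly the same warnings) and compile the generated code with javac, I can run grun correctly on the example: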

grun VBA startRule -tree
Public Sub Module()
    X = 0
    Y = 1
    Z = 2
    Test1 = X + Y * Z
    Test2 = X - Y * Z
    Test3 = Z + X * Y - Y * Z
    Test4 = Z + X * Y Mod Z * 2 + 5 * Z
    Test5 = Z + X ^ 3 * Y Mod Z * 2 + 5 * Z ^ X
    Print #1, Test2 ; " is not a Boolean value" 
End Sub
(startRule (module moduleAttributes moduleAttributes moduleAttributes moduleDeclarations moduleAttributes (moduleBody (moduleBodyElement (subStmt (visibility Public) (whiteSpace  ) Sub (whiteSpace  ) (subroutineName (identifier (untypedIdentifier (identifierValue Module)))) (argList ( )) (endOfStatement (endOfLine \n (whiteSpace        ))) (block (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue X)))) (whiteSpace  ) = (whiteSpace  ) (expression (literalExpression (numberLiteral 0)))))) (endOfStatement (endOfLine \n (whiteSpace        ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Y)))) (whiteSpace  ) = (whiteSpace  ) (expression (literalExpression (numberLiteral 1)))))) (endOfStatement (endOfLine \n (whiteSpace        ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Z)))) (whiteSpace  ) = (whiteSpace  ) (expression (literalExpression (numberLiteral 2)))))) (endOfStatement (endOfLine \n (whiteSpace        ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Test1)))) (whiteSpace  ) = (whiteSpace  ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))) (whiteSpace  ) + (whiteSpace  ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Y))))) (whiteSpace  ) * (whiteSpace  ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Z)))))))))) (endOfStatement (endOfLine \n (whiteSpace        ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Test2)))) (whiteSpace  ) = (whiteSpace  ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))) (whiteSpace  ) - (whiteSpace  ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Y))))) (whiteSpace  ) * (whiteSpace  ) (expression 
(lExpression (identifier (untypedIdentifier (identifierValue Z)))))))))) (endOfStatement (endOfLine \n (whiteSpace        ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Test3)))) (whiteSpace  ) = (whiteSpace  ) (expression (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace  ) + (whiteSpace  ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))) (whiteSpace  ) * (whiteSpace  ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Y))))))) (whiteSpace  ) - (whiteSpace  ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Y))))) (whiteSpace  ) * (whiteSpace  ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Z)))))))))) (endOfStatement (endOfLine \n (whiteSpace        ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Test4)))) (whiteSpace  ) = (whiteSpace  ) (expression (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace  ) + (whiteSpace  ) (expression (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))) (whiteSpace  ) * (whiteSpace  ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Y)))))) (whiteSpace  ) Mod (whiteSpace  ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace  ) * (whiteSpace  ) (expression (literalExpression (numberLiteral 2)))))) (whiteSpace  ) + (whiteSpace  ) (expression (expression (literalExpression (numberLiteral 5))) (whiteSpace  ) * (whiteSpace  ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Z)))))))))) (endOfStatement (endOfLine \n (whiteSpace        ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Test5)))) (whiteSpace  ) = 
(whiteSpace  ) (expression (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace  ) + (whiteSpace  ) (expression (expression (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))) (whiteSpace  ) ^ (whiteSpace  ) (expression (literalExpression (numberLiteral 3)))) (whiteSpace  ) * (whiteSpace  ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Y)))))) (whiteSpace  ) Mod (whiteSpace  ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace  ) * (whiteSpace  ) (expression (literalExpression (numberLiteral 2)))))) (whiteSpace  ) + (whiteSpace  ) (expression (expression (literalExpression (numberLiteral 5))) (whiteSpace  ) * (whiteSpace  ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace  ) ^ (whiteSpace  ) (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))))))))) (endOfStatement (endOfLine \n (whiteSpace        ))) (blockStmt (mainBlockStmt (fileStmt (printStmt Print (whiteSpace  ) (markedFileNumber # (expression (literalExpression (numberLiteral 1)))) , (whiteSpace  ) (outputList (outputItem (outputClause (outputExpression (expression (lExpression (identifier (untypedIdentifier (identifierValue Test2)))))))) (whiteSpace  ) (outputItem (charPosition ;)) (whiteSpace  ) (outputItem (outputClause (outputExpression (expression (literalExpression " is not a Boolean value")))))))))) (endOfStatement (endOfLine (whiteSpace  ) \n))) End Sub)) (endOfStatement (endOfLine \n))) moduleAttributes) <EOF>)

Is this just a simple typo in my Python semantic predicates? Or could it be a bug in the ANTLR Python runtime? Thanks in advance for your help!
