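I am trying to adapt the Rubberduck grammar listed here for parsing with Python. To do so, I changed the semantic predicates in the grammar:
On lines 297 and 298, from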
{_input.Lt(1).Text.Equals("A")}?
to {self.input.LT(1).getText().__eq__("A")}?
On line 571, from
{_input.La(-1) == NEWLINE}?
to {self.input.LA(-1) == NEWLINE}?
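One thing I am unsure about in the first predicate is the direct call to `__eq__`. In Python, `__eq__` can return `NotImplemented` rather than a boolean when the operand types differ, whereas `==` always yields a boolean. A minimal sketch of the difference, using a hypothetical stand-in class (`FakeToken` is an assumption for illustration, not the ANTLR runtime's token class); I have not established that this is related to the errors I describe in this question:

```python
class FakeToken(object):
    """Hypothetical stand-in for a token; NOT the ANTLR runtime class."""

    def __init__(self, text):
        self._text = text

    def getText(self):
        return self._text


def predicate_dunder(token, expected):
    # Mirrors the form used in my adapted grammar: a direct __eq__ call.
    return token.getText().__eq__(expected)


def predicate_operator(token, expected):
    # The more idiomatic form: the == operator.
    return token.getText() == expected


tok = FakeToken("A")

# Same operand types: both forms return a real boolean.
same_type_dunder = predicate_dunder(tok, "A")
same_type_operator = predicate_operator(tok, "A")

# Mismatched operand types: __eq__ gives up and returns NotImplemented,
# while == falls back to identity comparison and returns False.
mixed_dunder = predicate_dunder(tok, 1)
mixed_operator = predicate_operator(tok, 1)

print(same_type_dunder, same_type_operator)
print(mixed_dunder, mixed_operator)
```

Since `NotImplemented` is not a boolean, writing the predicate with `==` seems like the safer form in Python.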
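Then I compiled the grammar with the standard command given in the official documentation: antlr4 -Dlanguage=Python2 VBALexer.g4 && antlr4 -Dlanguage=Python2 VBAParser.g4
Generating the lexer and parser from this grammar produces the following output: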
warning(180): VBALexer.g4:300:15: chars '<INVALID>'..'/' used multiple times in set [[\]()\r\n\t.,'"|!@#$%^&*\-+:=; 0-9-/\\]
warning(180): VBALexer.g4:302:18: chars '\uFFFD' used multiple times in set [a-zA-Z_������]
warning(180): VBALexer.g4:302:18: chars '\uFFFD' used multiple times in set [a-zA-Z_������]
warning(180): VBALexer.g4:302:18: chars '\uFFFD' used multiple times in set [a-zA-Z_������]
warning(180): VBALexer.g4:302:18: chars '\uFFFD' used multiple times in set [a-zA-Z_������]
warning(180): VBALexer.g4:302:18: chars '\uFFFD' used multiple times in set [a-zA-Z_������]
warning(180): VBALexer.g4:304:25: chars '\uFFFD' used multiple times in set [a-zA-Z0-9_������]
warning(180): VBALexer.g4:304:25: chars '\uFFFD' used multiple times in set [a-zA-Z0-9_������]
warning(180): VBALexer.g4:304:25: chars '\uFFFD' used multiple times in set [a-zA-Z0-9_������]
warning(180): VBALexer.g4:304:25: chars '\uFFFD' used multiple times in set [a-zA-Z0-9_������]
warning(180): VBALexer.g4:304:25: chars '\uFFFD' used multiple times in set [a-zA-Z0-9_������]
warning(154): VBAParser.g4:626:0: rule lExpression contains an optional block with at least one alternative that can match an empty string
warning(154): VBAParser.g4:626:0: rule lExpression contains an optional block with at least one alternative that can match an empty string
warning(154): VBAParser.g4:646:0: rule argumentList contains an optional block with at least one alternative that can match an empty string
warning(154): VBAParser.g4:646:0: rule argumentList contains an optional block with at least one alternative that can match an empty string
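The '\uFFFD' in the warnings is the Unicode replacement character, which usually shows up when bytes are decoded with an encoding that does not match the file. A minimal sketch of that effect, assuming (purely for illustration) that the character set contains Latin-1 letters such as ä, ö, ü; the actual characters in VBALexer.g4 are not visible in the output above, so the set below is hypothetical:

```python
# Hypothetical stand-in for a character set in the lexer grammar:
# [a-zA-Z_] followed by the letters a-umlaut, o-umlaut, u-umlaut
# in lower and upper case (written as escapes to keep this file ASCII).
charset = "[a-zA-Z_\u00e4\u00f6\u00fc\u00c4\u00d6\u00dc]"

# Store it as Latin-1: every accented letter becomes one byte >= 0x80.
raw_bytes = charset.encode("latin-1")

# Read the same bytes back as UTF-8. None of the accented bytes forms a
# valid UTF-8 sequence here, so each one is replaced by U+FFFD.
decoded = raw_bytes.decode("utf-8", errors="replace")

replacement_count = decoded.count("\ufffd")
print(repr(decoded))
print(replacement_count)
```

If something like this is happening, the six distinct letters collapse into the same character, which would explain the "used multiple times in set" wording. The ANTLR tool accepts an `-encoding` option that might be worth trying, although the Java build described further down prints the same warnings and still parses the sample, so this may be unrelated to the Python problem.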
Then I tried to parse the following simple VBA code:
Public Sub Module()
X = 0
Y = 1
Z = 2
Test1 = X + Y * Z
Test2 = X - Y * Z
Test3 = Z + X * Y - Y * Z
Test4 = Z + X * Y Mod Z * 2 + 5 * Z
Test5 = Z + X ^ 3 * Y Mod Z * 2 + 5 * Z ^ X
Print #1, Test2 ; " is not a Boolean value"
End Sub
using the standard Python code (where data holds the VBA source above as a string):
import antlr4
from VBALexer import VBALexer
from VBAParser import VBAParser
input = antlr4.InputStream(data)
lexer = VBALexer(input)
tokens = antlr4.CommonTokenStream(lexer)
parser = VBAParser(tokens)
tree = parser.startRule()
This gives me the following messages:
line 1:0 extraneous input 'Public' expecting {<EOF>, ':', <INVALID>, <INVALID>, <INVALID>, <INVALID>, ''', WS, LINE_CONTINUATION}
line 1:6 mismatched input ' ' expecting {<INVALID>, <INVALID>, <INVALID>}
If I print the tree, I get the following:
(startRule (module Public moduleAttributes Sub Module ( ) \n X = 0 \n Y = 1 \n Z = 2 \n Test1 = X + Y * Z \n Test2 = X - Y * Z \n Test3 = Z + X * Y - Y * Z \n Test4 = Z + X * Y Mod Z * 2 + 5 * Z \n Test5 = Z + X ^ 3 * Y Mod Z * 2 + 5 * Z ^ X \n Print # 1 , Test2 ; " is not a Boolean value" \n End Sub \n) <EOF>)
However, if I adapt the grammar to Java instead, with the following changes:
On lines 297 and 298, from
{_input.Lt(1).Text.Equals("A")}?
to {_input.LT(1).getText().equals("A")}?
On line 571, from
{_input.La(-1) == NEWLINE}?
to {_input.LA(-1) == NEWLINE}?
and then generate the lexer and parser (getting exactly the same warnings) and compile the code with javac, grun parses the example correctly:
grun VBA startRule -tree
Public Sub Module()
X = 0
Y = 1
Z = 2
Test1 = X + Y * Z
Test2 = X - Y * Z
Test3 = Z + X * Y - Y * Z
Test4 = Z + X * Y Mod Z * 2 + 5 * Z
Test5 = Z + X ^ 3 * Y Mod Z * 2 + 5 * Z ^ X
Print #1, Test2 ; " is not a Boolean value"
End Sub
(startRule (module moduleAttributes moduleAttributes moduleAttributes moduleDeclarations moduleAttributes (moduleBody (moduleBodyElement (subStmt (visibility Public) (whiteSpace ) Sub (whiteSpace ) (subroutineName (identifier (untypedIdentifier (identifierValue Module)))) (argList ( )) (endOfStatement (endOfLine \n (whiteSpace ))) (block (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue X)))) (whiteSpace ) = (whiteSpace ) (expression (literalExpression (numberLiteral 0)))))) (endOfStatement (endOfLine \n (whiteSpace ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Y)))) (whiteSpace ) = (whiteSpace ) (expression (literalExpression (numberLiteral 1)))))) (endOfStatement (endOfLine \n (whiteSpace ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Z)))) (whiteSpace ) = (whiteSpace ) (expression (literalExpression (numberLiteral 2)))))) (endOfStatement (endOfLine \n (whiteSpace ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Test1)))) (whiteSpace ) = (whiteSpace ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))) (whiteSpace ) + (whiteSpace ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Y))))) (whiteSpace ) * (whiteSpace ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Z)))))))))) (endOfStatement (endOfLine \n (whiteSpace ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Test2)))) (whiteSpace ) = (whiteSpace ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))) (whiteSpace ) - (whiteSpace ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Y))))) (whiteSpace ) * (whiteSpace ) (expression (lExpression (identifier (untypedIdentifier (identifierValue 
Z)))))))))) (endOfStatement (endOfLine \n (whiteSpace ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Test3)))) (whiteSpace ) = (whiteSpace ) (expression (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace ) + (whiteSpace ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))) (whiteSpace ) * (whiteSpace ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Y))))))) (whiteSpace ) - (whiteSpace ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Y))))) (whiteSpace ) * (whiteSpace ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Z)))))))))) (endOfStatement (endOfLine \n (whiteSpace ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Test4)))) (whiteSpace ) = (whiteSpace ) (expression (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace ) + (whiteSpace ) (expression (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))) (whiteSpace ) * (whiteSpace ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Y)))))) (whiteSpace ) Mod (whiteSpace ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace ) * (whiteSpace ) (expression (literalExpression (numberLiteral 2)))))) (whiteSpace ) + (whiteSpace ) (expression (expression (literalExpression (numberLiteral 5))) (whiteSpace ) * (whiteSpace ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Z)))))))))) (endOfStatement (endOfLine \n (whiteSpace ))) (blockStmt (mainBlockStmt (letStmt (lExpression (identifier (untypedIdentifier (identifierValue Test5)))) (whiteSpace ) = (whiteSpace ) (expression (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue 
Z))))) (whiteSpace ) + (whiteSpace ) (expression (expression (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))) (whiteSpace ) ^ (whiteSpace ) (expression (literalExpression (numberLiteral 3)))) (whiteSpace ) * (whiteSpace ) (expression (lExpression (identifier (untypedIdentifier (identifierValue Y)))))) (whiteSpace ) Mod (whiteSpace ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace ) * (whiteSpace ) (expression (literalExpression (numberLiteral 2)))))) (whiteSpace ) + (whiteSpace ) (expression (expression (literalExpression (numberLiteral 5))) (whiteSpace ) * (whiteSpace ) (expression (expression (lExpression (identifier (untypedIdentifier (identifierValue Z))))) (whiteSpace ) ^ (whiteSpace ) (expression (lExpression (identifier (untypedIdentifier (identifierValue X))))))))))) (endOfStatement (endOfLine \n (whiteSpace ))) (blockStmt (mainBlockStmt (fileStmt (printStmt Print (whiteSpace ) (markedFileNumber # (expression (literalExpression (numberLiteral 1)))) , (whiteSpace ) (outputList (outputItem (outputClause (outputExpression (expression (lExpression (identifier (untypedIdentifier (identifierValue Test2)))))))) (whiteSpace ) (outputItem (charPosition ;)) (whiteSpace ) (outputItem (outputClause (outputExpression (expression (literalExpression " is not a Boolean value")))))))))) (endOfStatement (endOfLine (whiteSpace ) \n))) End Sub)) (endOfStatement (endOfLine \n))) moduleAttributes) <EOF>)
Is this a simple typo in my Python semantic predicates? Or could it be an underlying bug in the ANTLR Python runtime? Thanks in advance for your help!