I wrote a very simple SQL parser, but during fuzz testing I ran into this input:
SELECT 123 + ,
K_SELECT INTEGER T_PLUS T_COMMA
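This is of course a syntax error, but I don't know how to "catch" it.
How does the parser decide between "next_column_expression came too early" and "binary_expression is not finished"? I have worked with ANTLR3 on Java projects, but this is completely different.
Here are the skeleton parser rules: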
/* be more verbose about error messages */
%error-verbose
/* keywords */
%token K_CREATE
%token K_FROM
%token K_INTEGER
%token K_SELECT
%token K_TABLE
%token K_TEXT
%token K_WHERE
%token K_VALUES
%token K_INSERT
%token K_INTO
/* variable tokens */
%token IDENTIFIER
%token INTEGER
/* fixed tokens */
%token T_ASTERISK
%token T_PLUS
%token T_EQUALS
%token T_END ";"
%token T_COMMA
%token T_BRACKET_OPEN
%token T_BRACKET_CLOSE
%token END 0 "end of file"
%%
input:
    statement {
    }
    END
    ;

statement:
    select_statement {
    }
    |
    create_table_statement {
    }
    |
    insert_statement {
    }
    ;

keyword:
    K_CREATE | K_FROM | K_INTEGER | K_SELECT | K_TABLE | K_TEXT | K_WHERE | K_VALUES | K_INSERT | K_INTO
    ;

table_name:
    error {
        // "Expected table name"
    }
    |
    keyword {
        // "You cannot use a keyword for a table name."
    }
    |
    IDENTIFIER {
    }
    ;

select_statement:
    K_SELECT column_expression_list {
        // "Expected FROM after column list."
    }
    error
    |
    K_SELECT error {
        // "Expected column list after SELECT."
    }
    |
    K_SELECT column_expression_list {
    }
    K_FROM table_name {
    }
    ;

column_expression_list:
    column_expression {
    }
    next_column_expression
    ;

column_expression:
    T_ASTERISK {
    }
    |
    expression {
    }
    ;

next_column_expression:
    /* empty */
    |
    T_COMMA column_expression {
    }
    next_column_expression
    ;

binary_expression:
    value {
    }
    operator {
    }
    value {
    }
    ;

expression:
    value
    |
    binary_expression
    ;

operator:
    T_PLUS {
    }
    |
    T_EQUALS {
    }
    ;

value:
    INTEGER {
    }
    |
    IDENTIFIER {
    }
    ;
%%
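
One possible direction, sketched here and not tested against the full grammar: by the time T_COMMA arrives, the parser has already consumed `value operator`, so the only thing it can accept next is a `value`. An `error` alternative on the second operand would therefore attribute the failure to the unfinished binary_expression. The fragment below assumes the empty mid-rule actions are dropped from this rule, since mixing alternatives with and without them after the same prefix can introduce conflicts. The message text is only illustrative.

```yacc
binary_expression:
    value operator value {
    }
    |
    value operator error {
        /* Illustrative message: "Expected a value after the operator." */
        /* yyerrok leaves recovery mode so later errors are reported too. */
        yyerrok;
    }
    ;
```

With this alternative in place, `SELECT 123 + ,` should reach the `error` action with T_COMMA as the lookahead. `yyerror` is still called first with the verbose message, and T_COMMA stays available for next_column_expression afterwards.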