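I am using CUP and JFlex to validate expression syntax. The basic functionality works: I can tell whether an expression is valid.
The next step is to implement simple arithmetic operations, such as "add 1". For example, if my expression is "1 + a", the result should be "2 + a". I need access to the parse tree to do this, because simply identifying a numeric term won't work: adding 1 to "(1 + a) * b" should give "(1 + a) * b + 1", not "(2 + a) * b".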
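To make the intended behavior concrete, here is a minimal sketch of "add 1" operating on a tree rather than on the text. The node classes (`Num`, `Var`, `BinOp`) and the `addOne` method are hypothetical names invented for this illustration; nothing here is generated by CUP, and only `+` and `*` are handled.

```java
public class AddOneSketch {
    // Hypothetical AST node types; none of these come from CUP itself.
    interface Expr {}

    static final class Num implements Expr {
        final int value;
        Num(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    static final class BinOp implements Expr {
        final char op;
        final Expr left, right;
        BinOp(char op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        @Override public String toString() {
            return wrap(left) + " " + op + " " + wrap(right);
        }
        // Parenthesize a sum that appears as an operand of '*'.
        private String wrap(Expr e) {
            boolean needsParens = op == '*' && e instanceof BinOp && ((BinOp) e).op == '+';
            return needsParens ? "(" + e + ")" : e.toString();
        }
    }

    // Fold the 1 into a numeric term only when the ROOT is a sum (or a bare number);
    // otherwise append "+ 1" to the whole expression.
    static Expr addOne(Expr root) {
        if (root instanceof Num) {
            return new Num(((Num) root).value + 1);
        }
        if (root instanceof BinOp && ((BinOp) root).op == '+') {
            BinOp sum = (BinOp) root;
            if (sum.left instanceof Num) {
                return new BinOp('+', new Num(((Num) sum.left).value + 1), sum.right);
            }
            if (sum.right instanceof Num) {
                return new BinOp('+', sum.left, new Num(((Num) sum.right).value + 1));
            }
        }
        return new BinOp('+', root, new Num(1));
    }

    public static void main(String[] args) {
        Expr sum = new BinOp('+', new Num(1), new Var("a"));
        System.out.println(addOne(sum));      // prints "2 + a"
        Expr product = new BinOp('*', sum, new Var("b"));
        System.out.println(addOne(product));  // prints "(1 + a) * b + 1"
    }
}
```

The decision depends on the operator at the root of the tree, which is exactly the information a flat scan for numeric tokens cannot provide.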
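Does anyone have a CUP example that builds a parse tree? I think I could take it from there.
As an added bonus, is there a way to get a list of all the tokens in an expression using JFlex? It seems like a typical use case, but I can't figure out how to do it.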
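For the token list, one approach that might work is to call the scanner's `next_token()` repeatedly until it returns the end-of-file symbol. The sketch below assumes a scanner generated with JFlex's `%cup` option, which implements `java_cup.runtime.Scanner`. So that it compiles without CUP on the classpath, `Symbol` and `Scanner` are reduced stand-ins for the real classes, `fakeScanner` is a hypothetical toy in place of a generated lexer, and `EOF = 0` is assumed to match `sym.EOF` in the CUP-generated `sym` class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TokenListSketch {
    // Assumed to mirror sym.EOF from the CUP-generated sym class.
    static final int EOF = 0;

    // Stand-in for java_cup.runtime.Symbol, reduced to the fields used here.
    static final class Symbol {
        final int sym;
        final Object value;
        Symbol(int sym, Object value) {
            this.sym = sym;
            this.value = value;
        }
    }

    // Stand-in for java_cup.runtime.Scanner.
    interface Scanner {
        Symbol next_token() throws Exception;
    }

    // Drain the scanner, collecting every token before EOF.
    static List<Symbol> allTokens(Scanner scanner) throws Exception {
        List<Symbol> tokens = new ArrayList<>();
        for (Symbol t = scanner.next_token(); t.sym != EOF; t = scanner.next_token()) {
            tokens.add(t);
        }
        return tokens;
    }

    // Hypothetical toy scanner that replays pre-split lexemes, then reports EOF.
    // The symbol id 2 is an arbitrary placeholder for "some terminal".
    static Scanner fakeScanner(String... lexemes) {
        Iterator<String> it = Arrays.asList(lexemes).iterator();
        return () -> it.hasNext() ? new Symbol(2, it.next()) : new Symbol(EOF, null);
    }

    public static void main(String[] args) throws Exception {
        for (Symbol t : allTokens(fakeScanner("1", "+", "a"))) {
            System.out.println(t.sym + " " + t.value);
        }
    }
}
```

Note that draining the scanner consumes its input, so the parser would need a fresh scanner instance (or a replay of the collected list) afterwards.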
Edit: I found a promising lead on Stack Overflow: "Create abstract tree problem from parser".
A discussion of CUP and ASTs:
http://pages.cs.wisc.edu/~fischer/cs536.s08/lectures/Lecture16.4up.pdf
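Specifically, this paragraph:
The Symbol returned by the parser is associated with the grammar's start symbol and contains the AST for the whole source program
That doesn't help. If the Symbol class has no navigation pointers to its children, how do I traverse the tree for a given Symbol instance? In other words, it doesn't look or behave like a tree node: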
package java_cup.runtime;
/**
* Defines the Symbol class, which is used to represent all terminals
* and nonterminals while parsing. The lexer should pass CUP Symbols
* and CUP returns a Symbol.
*
* @version last updated: 7/3/96
* @author Frank Flannery
*/
/* ****************************************************************
Class Symbol
what the parser expects to receive from the lexer.
the token is identified as follows:
sym: the symbol type
parse_state: the parse state.
value: is the lexical value of type Object
left : is the left position in the original input file
right: is the right position in the original input file
******************************************************************/
public class Symbol {
/*******************************
Constructor for l,r values
*******************************/
public Symbol(int id, int l, int r, Object o) {
this(id);
left = l;
right = r;
value = o;
}
/*******************************
Constructor for no l,r values
********************************/
public Symbol(int id, Object o) {
this(id, -1, -1, o);
}
/*****************************
Constructor for no value
***************************/
public Symbol(int id, int l, int r) {
this(id, l, r, null);
}
/***********************************
Constructor for no value or l,r
***********************************/
public Symbol(int sym_num) {
this(sym_num, -1);
left = -1;
right = -1;
value = null;
}
/***********************************
Constructor to give a start state
***********************************/
Symbol(int sym_num, int state)
{
sym = sym_num;
parse_state = state;
}
/*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/
/** The symbol number of the terminal or non terminal being represented */
public int sym;
/*. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .*/
/** The parse state to be recorded on the parse stack with this symbol.
* This field is for the convenience of the parser and shouldn't be
* modified except by the parser.
*/
public int parse_state;
/** This allows us to catch some errors caused by scanners recycling
* symbols. For the use of the parser only. [CSA, 23-Jul-1999] */
boolean used_by_parser = false;
/*******************************
The data passed to parser
*******************************/
public int left, right;
public Object value;
/*****************************
Printing this token out. (Override for pretty-print).
****************************/
public String toString() { return "#"+sym; }
}
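One possible reading of the lecture quote is that the tree is not the `Symbol` itself but the object stored in its `value` field: each grammar action assigns a node to `RESULT`, CUP stores it as the `value` of the Symbol for that nonterminal, and the Symbol returned by `parse()` then carries the root node. The sketch below illustrates that idea under stated assumptions: `Symbol` is a reduced stand-in for `java_cup.runtime.Symbol` so the code compiles without CUP, and `Node` and `walk` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SymbolValueSketch {
    // Stand-in for java_cup.runtime.Symbol, reduced to the fields used here.
    static final class Symbol {
        final int sym;
        final Object value;
        Symbol(int sym, Object value) {
            this.sym = sym;
            this.value = value;
        }
    }

    // Hypothetical tree node that grammar actions would construct, e.g.
    //   expr ::= expr:l PLUS expr:r {: RESULT = new Node("+", l, r); :}
    static final class Node {
        final String label;
        final List<Node> children;
        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    // Pre-order traversal: the navigation pointers live on Node, not on Symbol.
    static void walk(Node node, List<String> out) {
        out.add(node.label);
        for (Node child : node.children) {
            walk(child, out);
        }
    }

    public static void main(String[] args) {
        // Simulates what parser.parse() might return for "(1 + a) * b".
        Node tree = new Node("*", new Node("+", new Node("1"), new Node("a")), new Node("b"));
        Symbol result = new Symbol(3, tree); // 3 is an arbitrary placeholder id

        Node root = (Node) result.value;     // the AST root is carried in value
        List<String> labels = new ArrayList<>();
        walk(root, labels);
        System.out.println(labels);          // prints "[*, +, 1, a, b]"
    }
}
```

With a real CUP parser, the nonterminal would need a declared type (for example `non terminal Node expr;`) for the labeled values `l` and `r` to be typed as `Node` inside the action.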