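We use the Shunting-Yard algorithm to evaluate expressions. We can validate an expression simply by running the algorithm: it fails on missing operands, mismatched parentheses, and similar problems. However, the Shunting-Yard algorithm accepts a larger grammar than human-readable infix. For example,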

1 + 2
+ 1 2
1 2 +

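are all acceptable ways to provide "1+2" as input to the Shunting-Yard algorithm. '+ 1 2' and '1 2 +' are not valid infix, but the standard Shunting-Yard algorithm can handle them. The algorithm does not really care about token order; it applies operators in order of precedence, grabbing the "nearest" operands.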
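To make the problem concrete, here is a minimal sketch of an unvalidated Shunting-Yard conversion in Python. It is not our production code; the `PRECEDENCE` table and the `naive_to_rpn` name are made up for illustration, and it only knows four left-associative binary operators. All three spellings above come out as the same postfix sequence.

```python
# Hypothetical operator table for this sketch: symbol -> precedence.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def naive_to_rpn(tokens):
    """Convert pre-split tokens to postfix without checking token order."""
    output, stack = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # All four operators are left-associative, so pop on >=.
            while stack and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]:
                output.append(stack.pop())
            stack.append(tok)
        else:
            # Anything that is not an operator is treated as an operand.
            output.append(tok)
    while stack:
        output.append(stack.pop())
    return output

for text in ("1 + 2", "+ 1 2", "1 2 +"):
    print(text, "->", naive_to_rpn(text.split()))
```

Because operands go straight to the output and operators wait on the stack, the relative position of an operator and its operands is never checked, which is exactly why the prefix and postfix forms slip through.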

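We would like to restrict our input to valid human-readable infix. I am looking for a way either to modify the Shunting-Yard algorithm so that it fails on non-valid infix, or to validate the infix before running Shunting-Yard.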

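Is anyone aware of any published techniques for doing this? We must support basic operators, custom operators, parentheses, and functions (with multiple arguments). I haven't seen anything online that works with more than the basic operators.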

Thanks

2 Answers

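The solution to my problem was to enhance the algorithm posted on Wikipedia with the state machine recommended by Rici. I am posting the pseudocode here because it may be of use to others.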

Support two states, ExpectOperand and ExpectOperator.

Set State to ExpectOperand
While there are tokens to read:
    If token is a constant (number)
        Error if state is not ExpectOperand.
        Push token to output queue.
        Set state to ExpectOperator.
    If token is a variable.
        Error if state is not ExpectOperand.
        Push token to output queue.
        Set state to ExpectOperator.
    If token is an argument separator (a comma).
        Error if state is not ExpectOperator.
        Until the top of the operator stack is a left parenthesis  (don't pop the left parenthesis).
            Pop the top token off of the operator stack and push it onto the output queue.
        If the stack runs out before a left parenthesis is encountered then error.  Either the separator was misplaced or the parentheses were mismatched.
        Set state to ExpectOperand.
    If token is a unary operator.
        Error if the state is not ExpectOperand.
        Push the token to the operator stack.
        Set the state to ExpectOperand.
    If the token is a binary operator.
        Error if the state is not ExpectOperator.
        While there is an operator token at the top of the operator stack and either the current token is left-associative and of lower than or equal precedence to the operator on the stack, or the current token is right-associative and of lower precedence than the operator on the stack.
            Pop the operator from the operator stack and push it onto the output queue.
        Push the current operator onto the operator stack.
        Set the state to ExpectOperand. 
    If the token is a Function.
        Error if the state is not ExpectOperand.  
        Push the token onto the operator stack.
        Set the state to ExpectOperand.
    If the token is an open parenthesis.
        Error if the state is not ExpectOperand.
        Push the token onto the operator stack.
        Set the state to ExpectOperand.
    If the token is a close parenthesis.
        Error if the state is not ExpectOperator.
        Until the token at the top of the operator stack is a left parenthesis.
            Pop the token off of the operator stack and push it onto the output queue.
        If the stack runs out before a left parenthesis is encountered then error.  There are mismatched parentheses.
        Pop the left parenthesis off of the operator stack and discard.
        If the token at the top of the operator stack is a function then pop it and push it onto the output queue.
        Set the state to ExpectOperator.
At this point you have processed all the input tokens.
Error if the state is not ExpectOperator.  The input was empty or ended where an operand was expected (for example "1 +").
While there are tokens on the operator stack.
    Pop the next token from the operator stack and push it onto the output queue.
    If a parenthesis is encountered then error.  There are mismatched parentheses.
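
For anyone who prefers running code, here is one way the pseudocode above could look in Python. Treat it as a sketch under assumptions, not a reference implementation: the operator table, the `u-` marker for prefix minus, the tokenizer, and the names `tokenize` and `to_rpn` are all invented here. It decides unary versus binary from the current state, it rejects input that ends while an operand is still expected, and it additionally requires a comma to sit directly inside a function call's parentheses. It does not handle zero-argument calls such as `f()`.

```python
import re

# Hypothetical operator table: symbol -> (precedence, associativity).
BINARY = {"+": (1, "L"), "-": (1, "L"), "*": (2, "L"), "/": (2, "L"), "^": (4, "R")}
UNARY = {"-": "u-"}  # prefix minus is stored on the stack as "u-"
PREC = {"u-": 3, **{op: p for op, (p, _) in BINARY.items()}}
TOKEN_RE = re.compile(r"\s*(\d+\.?\d*|[A-Za-z_]\w*|[-+*/^(),])")

def tokenize(text):
    """Split text into numbers, names, operators, parentheses and commas."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def to_rpn(text):
    """Convert infix text to postfix, raising SyntaxError on invalid infix."""
    tokens = tokenize(text)
    output, stack = [], []
    expect_operand = True  # False means the state is ExpectOperator
    for i, tok in enumerate(tokens):
        is_name = tok[0].isalpha() or tok[0] == "_"
        # A name immediately followed by "(" is treated as a function.
        is_func = is_name and i + 1 < len(tokens) and tokens[i + 1] == "("
        if tok[0].isdigit() or (is_name and not is_func):
            if not expect_operand:
                raise SyntaxError(f"unexpected operand {tok!r}")
            output.append(tok)
            expect_operand = False
        elif is_func or tok == "(":
            if not expect_operand:
                raise SyntaxError(f"unexpected {tok!r}")
            stack.append(tok)  # state stays ExpectOperand
        elif tok == ",":
            if expect_operand:
                raise SyntaxError("misplaced separator")
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            # Extra check (my assumption): the "(" must follow a function name.
            if len(stack) < 2 or stack[-2] in PREC or stack[-2] == "(":
                raise SyntaxError("misplaced separator or mismatched parentheses")
            expect_operand = True
        elif tok == ")":
            if expect_operand:
                raise SyntaxError("unexpected ')'")
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise SyntaxError("mismatched parentheses")
            stack.pop()  # discard the "("
            if stack and stack[-1] not in PREC and stack[-1] != "(":
                output.append(stack.pop())  # function name goes to the output
        elif expect_operand:
            # An operator where an operand is expected must be a prefix operator.
            if tok not in UNARY:
                raise SyntaxError(f"unexpected operator {tok!r}")
            stack.append(UNARY[tok])
        else:
            prec, assoc = BINARY[tok]
            while stack and stack[-1] in PREC and (
                PREC[stack[-1]] > prec or (PREC[stack[-1]] == prec and assoc == "L")
            ):
                output.append(stack.pop())
            stack.append(tok)
            expect_operand = True
    if expect_operand:
        raise SyntaxError("input ended where an operand was expected")
    while stack:
        top = stack.pop()
        if top == "(":
            raise SyntaxError("mismatched parentheses")
        output.append(top)
    return output
```

With this sketch, `to_rpn("1 + 2")` returns the postfix tokens, while `to_rpn("+ 1 2")` and `to_rpn("1 2 +")` both raise `SyntaxError`, which is the behaviour the question asked for.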

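You can easily differentiate between unary and binary operators (I'm speaking specifically about the negative prefix and the subtraction operator) by looking at the previous token. If there is no previous token, the previous token is an open parenthesis, or the previous token is an operator, then you have encountered a unary prefix operator; otherwise you have encountered the binary operator.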
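
The previous-token rule can be sketched as a small pre-pass over the token list. The names `tag_minus` and `PREFIX_CONTEXT` and the `u-` marker are made up for this example. I have also counted an argument separator as a prefix context so that `f(1, -2)` works; that is my own assumption and goes beyond the rule as stated.

```python
# Tokens after which a "-" must be a prefix operator (hypothetical set).
# The comma entry is an assumption added for function arguments.
PREFIX_CONTEXT = {"(", ",", "+", "-", "*", "/", "^"}

def tag_minus(tokens):
    """Return a copy of tokens with each unary minus rewritten as 'u-'."""
    tagged, prev = [], None
    for tok in tokens:
        if tok == "-" and (prev is None or prev in PREFIX_CONTEXT):
            tagged.append("u-")  # no previous token, "(", or an operator
        else:
            tagged.append(tok)   # operand, ")" or anything else came before
        prev = tok
    return tagged

print(tag_minus(["-", "1", "-", "(", "-", "2", ")"]))
```

Tagging the tokens up front keeps the main loop simple: it can treat `u-` and `-` as two distinct operators with their own precedence.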

answered 2015-04-15T13:50:43.717

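http://www.engr.mun.ca/~theo/Misc/exp_parsing.htm has a nice discussion of the Shunting-Yard algorithm. The algorithm presented there uses the key idea of the operator stack, but adds some grammar so that it knows what to expect next. It has two main functions: E(), which expects an expression, and P(), which expects a prefix operator, a variable, a number, parentheses, or a function. Prefix operators always bind more tightly than binary operators, so you want to deal with them first.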

If we say P stands for some prefix sequence and B is a binary operator, then any expression will have the form

P B P B P

That is, at any point you are expecting either a prefix sequence or a binary operator. Formally, the grammar is

E -> P (B P)*

and P will be

P -> -P | variable | constant | etc.

This translates to pseudocode as

E() {
    P()
    while next token is a binary op:
         read next op
         push onto stack and do the shunting yard logic
         P()
    if any tokens remain report error
    pop remaining operators off the stack
}

P() {
    if next token is constant or variable:
         add to output
    else if next token is unary minus: 
         push uminus onto operator stack
         P()
    else:
         report error (a constant, variable or prefix operator was expected)
}

You can extend this to handle other unary operators, functions, parentheses, and postfix operators.
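
Here is a rough Python rendering of the E()/P() idea, limited to what the pseudocode covers: constants, variables, unary minus, and four left-associative binary operators. The class name `InfixParser`, the method names, the precedence values, and the `u-` marker are assumptions of this sketch rather than anything from the linked page. Tokens are expected to be pre-split.

```python
# Hypothetical precedence table; "u-" (prefix minus) binds tightest.
BINARY_PREC = {"+": 1, "-": 1, "*": 2, "/": 2}
STACK_PREC = {**BINARY_PREC, "u-": 3}

def is_operand(tok):
    """True for a variable name or a simple decimal constant."""
    return tok.isidentifier() or tok.replace(".", "", 1).isdigit()

class InfixParser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0
        self.output, self.stack = [], []

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expression(self):
        """E -> P (B P)*, followed by end of input."""
        self.prefix()
        while self.peek() in BINARY_PREC:
            op = self.tokens[self.pos]
            self.pos += 1
            # Shunting-yard step for a left-associative binary operator.
            while self.stack and STACK_PREC[self.stack[-1]] >= BINARY_PREC[op]:
                self.output.append(self.stack.pop())
            self.stack.append(op)
            self.prefix()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        while self.stack:
            self.output.append(self.stack.pop())
        return self.output

    def prefix(self):
        """P -> -P | variable | constant."""
        tok = self.peek()
        if tok == "-":
            self.pos += 1
            self.stack.append("u-")
            self.prefix()
        elif tok is not None and is_operand(tok):
            self.pos += 1
            self.output.append(tok)
        else:
            raise SyntaxError(f"operand expected, got {tok!r}")

print(InfixParser("1 - - x * 2".split()).expression())
```

Because `prefix()` is the only place an operand can be consumed and `expression()` is the only place a binary operator can be consumed, '+ 1 2' fails inside `prefix()` and '1 2 +' fails at the leftover-token check.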

answered 2015-04-15T08:13:12.257