3

我有一个想法,可以制作一个简单的程序,这将帮助我在 C 等语言中使用运算符优先级。其中最困难的部分是给表达式加上括号。例如,我想要这个:

*a.x++ = *b.x++

转换为:

((*(((a).(x))++)) = (*(((b).(x))++)))

我在这些步骤中手动完成的:

           *a.x++ = *b.x++
       *(a).(x)++ = *(b).(x)++
     *((a).(x))++ = *((b).(x))++
   *(((a).(x))++) = *(((b).(x))++)
 (*(((a).(x))++)) = (*(((b).(x))++))
((*(((a).(x))++)) = (*(((b).(x))++)))

实现这一目标的最佳方法是什么?是否已经有我可以使用的解决方案?我更喜欢在 PHP、C、C++、Python 或 Ruby 中执行此操作。

(这不是我程序的全部想法,这只是第一步。)

4

10 回答 10

6

您将需要某种能够理解运算符优先级的解析器。C 的常用版本是 Lexx/Yacc 或 flex/bison,最简单的方法是构造解析树。完成此操作后,只需按“预排序”顺序遍历解析树,并在您进入和离开节点时发出括号。

于 2009-02-14T06:39:44.493 回答
4

最可靠的方法是解析表达式(当然要考虑优先规则),然后以自上而下的方式处理生成的 AST(抽象语法树),并在移动时添加括号

于 2009-02-14T06:40:11.200 回答
3

How about converting to postfix and evaluating. Can you try if the following approach works. Lets take *a.x++

Operator    Precedence  Arguments Needed
.           3               2
++          2               1
*           1               1

So now convert the expression to postfix notation. This should give you

a x . ++ *

Now evaluation of postfix is as simple as pushing things into a stack, when you hit an operator, pop the top n (as needed by operator) elements and pass them as arguments, store results back onto the stack. In your case, instead of evaluating, you'd return a textual representation of the operation

        Stack
Read a      a
Read x      x
Read .      (a.x)
Read ++     ((a.x)++)
Read *      (*((a.x)++))

if it helps, you may want to look at:
http://www.spsu.edu/cs/faculty/bbrown/web_lectures/postfix/
Bart de smet's DynCalc series of posts
My attempt at TDDing a similar solution

于 2009-02-14T07:00:19.123 回答
2

您可以从运算符创建二叉表达式树。

我相信网上已经有几种算法可以创建这样的树。

One simple way I could think of, is by sorting the operator by precedence, and then split the string into 2 parts by the operator with the lowest precedence first, then continue recursively splitting the other 2 parts down over and over and then eventually, you'll have the expression in binary tree form.

And then when you have the expression in binary tree form, you can then "parenthesize" up from the tree's leaves up to the root.

And of course, you could compile a full-fledged parser via yacc/bison.

于 2009-02-14T06:41:45.690 回答
2

Just pick up a parser for your selected language, for instance C parser, parse the expression/source code and print the AST back in the way you want.

test.c:

void main(void){
    int c = 2;
}

terminal:

$ python
>>> import pycparser
>>> test = pycparser.parse_file('test.c')
>>> test.show()
FileAST: 
  FuncDef: 
    Decl: main, [], []
      FuncDecl: 
        ParamList: 
          Typename: []
            TypeDecl: None, []
              IdentifierType: ['void']
        TypeDecl: main, []
          IdentifierType: ['void']
    Compound: 
      Decl: c, [], []
        TypeDecl: c, []
          IdentifierType: ['int']
        Constant: int, 2
>>> for node in test.ext:
...     print node
...
<pycparser.c_ast.FuncDef object at 0x7fe1436db750>
>>>
于 2009-02-14T09:05:36.700 回答
1

As a simple example:

Exp = Term | Exp, AddOp, Term 
Term = Factor | Term, MulOp, Factor
Factor = Number | Ident | PreOp, Factor | (, Exp, ) | Factor, PostOp

You can uses the grammar to write the translations:

Exp    = Term                    -> Term
       | Exp, AddOp, Term        -> (, Exp, AddOp, Term, )
Term   = Factor                  -> Factor
       | Term, MulOp, Factor     -> (, Term, MulOp, Factor, )
Factor = Number                  -> Number
       | Ident                   -> Ident
       | PreOp, Factor           -> (, PreOp, Factor, )
       | (, Exp, )               -> (, Exp, )
       | Factor, PostOp          -> (, Factor, PostOp, )

In which case:

a-- + b * (a+b) 

Translates to:

((a--) + (b * ((a+b))))
于 2009-02-14T08:11:16.287 回答
1

Parsing is a huge topic. Since you just want to use it to solve a specific problem, try not to get immersed in all these specific parsing algorithms that people are suggesting. Rather, there are numerous parser generators, such as antler or bison which, given an appropriate grammar, will parse text and allow you to perform programmatic operations on the components -- such as put parentheses around them. Some of these systems come with grammars for C, or have such grammars available.

antlr can generate parsers in any of the languages you mentioned; see http://www.antlr.org/

于 2009-02-14T08:58:58.450 回答
1

You can find "cparen" in the archives of the old net.sources newsgroup.

If you search (Google) for 'cparen', you get too much noise, but if you search for net.sources and 'cparen.c' that narrows the search enough to be useful.

Here is one website:

http://www.megalextoria.com/usenet-archive/news005f3/b14/net/sources/00000360.html

It is not a shell archive, as I would have expected. It looks like a pure ASCII text tar file. There are few enough files that you could just unpack it by hand.

于 2011-11-02T22:59:38.647 回答
1

I wrote a program in Python to parenthesize an expression string.

def pref(op):
    print "called with op", op
    ret = -1
    if op == '+':
        print "matched +"
        ret = 1
    if op == '-':
        print "matched -"
        ret = 2
    if op == '*':
        print "matched *"
        ret = 3
    if op == '/':
        print "matched /"
        ret = 4

    return ret

def evaluate(expr, operand_stack, operator_stack):
    print "**In evaluate**"
    print operator_stack
    print operand_stack

    expr1 = operand_stack.pop()
    expr2 = operand_stack.pop()
    op    = operator_stack.pop()

    # Parenthesize the expression
    expr = "(" + expr2 + op + expr1 + ")"
    print "expr1", expr1
    print "expr2", expr2
    print "expr", expr

    # Push the result back on the stack
    operand_stack.append(expr)

    print operator_stack
    print operand_stack
    print "**Out evaluate**"
    return expr

def looper(str, expr, operator_stack, operand_stack):
    l = 0
    cnt = len(str)

    # Loop over the input string
    while  l < cnt:
        if str[l] in ('+', '-', '*', '/'):
            print "operator found: op, index", str[l], l
            print operator_stack, len(operator_stack)

            x = len(operator_stack) - 1
            if x > 0:
                print "Comparing:", operator_stack[x], str[l]

                # If op on stack has higher preference than the op in question
                if (pref(operator_stack[x]) > pref(str[l])):
                    expr = evaluate(expr, operand_stack, operator_stack)
            operator_stack.append(str[l])
        else:
            # Add the operand to operand stack
            operand_stack.append(str[l])
        l += 1

    print operator_stack
    print operand_stack

    print "Take care of last elements"
    op_cnt = len(operator_stack)
    while op_cnt:
        expr = evaluate(expr, operand_stack, operator_stack)
        op_cnt -= 1

    print operator_stack
    print operand_stack

if __name__ == '__main__':
    str = "a+c*d-e/w*x+a-s"
    cnt = len(str)

    operand_stack  = []
    operator_stack  = []
    expr = ""
    looper(str, expr, operator_stack, operand_stack)

    print "Output=>", operand_stack[0]
于 2012-10-10T16:10:07.047 回答
0

There is a very old (1980's) open source program to do exactly this. It's called "cparen", but I'm damned if I can find it on the net. Only enthusiastic mentions of it, e.g. https://groups.google.com/group/comp.lang.c/tree/browse_frm/month/1990-03/1583b4728a6d94db http://www.language-c.info/re-should-i-capitalize-const-identifiers

If you have more luck than me in finding it, do write

于 2011-03-25T05:48:16.687 回答