context-free-grammar - 将 CFG 转换为 IL

Question

我从任意 IL 构建 CFG，并希望将该 CFG 转换回 IL。CFG中的顶点顺序当然不等于原始IL指令的顺序。

这很好，但会使一些事情过于复杂。想象：

Jump 'B'
'C': Return
'B': Jump 'C'

这将导致这样的流程图： (Jump B) -> (Jump C) -> (Return) 这当然是一个简化的示例，但它显示了转换出 CFG 时的问题。

学术界有没有关于这个话题的信息？我认为自下而上遍历图表会非常优雅，但这在更复杂的情况下不起作用。

一个解决方案可能是自上而下搜索 CF 合并，但在这种情况下，我将无法正确处理循环。因此，获得此权利的唯一方法似乎是搜索可能的 CF 合并（如果发生）。如果没有，我们必须有一个循环，这意味着循环是首选的，并且随后会评估持续的路径。这听起来像是一个可以解决的问题，但它也非常昂贵，并且可能存在更优雅的问题解决方案。此外，在考虑“break”语句时，循环也可能导致 CF 合并。

score 2 · Accepted Answer

将 CFG 转换为 IL：您希望遍历图，将每个顶点恰好发射一次（不可达的除外）。因此，您需要记录已发出哪些顶点：顶点上的标志会做，或者从顶点到真/假的哈希值。

有些顶点会有多个后继者，你只能直接跟随其中一个；所以你想要一种方法来跟踪你想要稍后返回的顶点。队列适合于此。

这或多或少是我用过的。

def needs_label(cfg, v, last):
    if cfg.predecessors(v) > 1:
        # There's more than one way of entering this vertex
        return True
    elif cfg.predecessors(v) == 1 and last != cfg.predecessors(v)[0]:
        # There's only one way, but the last vertex emitted was not that way
        # so it will be entered using a jump.
        return True
    else:
        return False

def emit_label(v):
    print 'label%d' % (v.id)

def emit_vertex(v):
    if v.type == 'branch':
        # Branch to second successor
        print 'br label%d' % cfg.successors(v)[1].id
    else:
        ...

def emit_jump(v):
    print 'jmp label%d' % v.id

def emit_cfg(cfg):
    q = Queue()   # Queue for saving vertices that we want to emit later
    done = {}    # Hash recording which vertices have already been emitted
    q.push(cfg.start())
    while not q.empty():
        v = q.pop()
        last = None
        while v is not None and not done[v]:
            # Emit the vertex, with a prefixed label if necessary
            if needs_label(cfg, v, last):
                emit_label(v)
            emit_vertex(v)
            done[v] = True
            last = v
            # Get the vertex's successors
            succs = cfg.successors(v)
            # If there aren't any, then this path is finished, so go back to
            # the outer loop to pop another saved vertex
            if len(succs) == 0:
                v = None   # Setting this will terminate the inner loop
                continue
            # Stick all the vertices on the stack for later, in case we can't
            # process them all here
            for s in succs:
                q.push(s)
            # Pick a new vertex from the list of successors.  Always pick the first
            # because if it's a branch then the second will have been branched on
            v = succs[0]
            # If it was emitted earlier we need to jump to it
            if done[v]:
                emit_jump(v)
                v = None
            # Otherwise continue the inner loop, walking from the new vertex

分支（具有多个后继的顶点）的处理非常幼稚：通常你想找出哪个更有可能，如果可能的话直接跟随那个。

score 0 · Accepted Answer

这比听起来容易。基本上，可以去掉 CFG 中的任何 Jump 语句，从而得到优化的图形。一旦图形被线性化，跳转语句将被插入回来。这不会保持指令的原始顺序，但会产生具有相同控制流的方法。

context-free-grammar - 将 CFG 转换为 IL

2 回答 2

Related

Reference