5

The ultimate in optimizing compilers would be one that searched among the space of programs for a program equivalent to the original but faster. This has been done in practice for very small basic blocks: https://en.wikipedia.org/wiki/Superoptimization

It sounds like the hard part is the exponential nature of the search space, but actually it's not; the hard part is, supposing you find what you're looking for, how do you prove that the new, faster program is really equivalent to the original?

Last time I looked into it, some progress had been made on proving certain properties of programs in certain contexts, particularly at a very small scale when you are talking about scalar variables or small fixed bit vectors, but not really on proving equivalence of programs at a larger scale when you are talking about complex data structures.

Has anyone figured out a way to do this yet, even 'modulo solving this NP-hard search problem that we don't know how to solve yet'?

Edit: Yes, we all know about the halting problem. It's defined in terms of the general case. Humans are an existence proof that this can be done for many practical cases of interest.

4

1 回答 1

1

你问了一个相当广泛的问题,但让我看看我能不能让你继续前进。

John Regehr 在调查一些关于超级优化器的相关论文方面做得非常好:https ://blog.regehr.org/archives/923

问题是你真的不需要证明这些优化类型的整个程序等价。相反,您只需要证明给定 CPU 处于特定状态,两个代码序列以相同的方式修改 CPU 状态。为了在许多优化(即大规模)中证明这一点,通常您可能首先在两个序列中抛出一些随机输入。如果它们不是等效的代码,那么您可能会很幸运并很快证明这一点(通过矛盾证明)并且您可以继续前进。如果您没有发现矛盾,您现在可以尝试通过计算成本高的 SAT 求解器来证明等价性。(顺便说一句,如果您对超级优化器感兴趣,那么 Regehr 提到的 STOKE 论文特别有趣。)

现在看看整个程序的语义等价,这里的一种方法是 CompCert 编译器使用的方法。本质上,编译器正在证明这个定理:

如果 CompCert 能够将 C 代码 X 转换为汇编代码 Y,那么 X 和 Y 在语义上是等价的。

此外,CompCert 确实应用了一些编译器优化,实际上这些优化往往是传统编译器出错的地方。也许像 CompCert 这样的东西是您所追求的,在这种情况下,编译器通过一系列细化传递来处理事情,在这些传递中它证明如果每次传递成功,结果在语义上等同于前一次传递。

于 2017-08-28T19:06:51.797 回答