optimization - gcc transformation pass for ackermann

Question

gcc (I tried 4.7.2 on Mac and Linux with -O3 flag) optimizes ackermann function to a single call with a big local stack. An example Ackermann code below:

int ack(int m,int n){
  if(m == 0) return n+1;
  if(n == 0) return ack(m-1,1);
  return ack(m-1,ack(m,n-1));
}

When disassembled, there is only one recursive call to ack function, instead of two calls (I couldn't parse what is going on -ack is now transformed by gcc into a function with 8 arguments, and local stack of 49 int and 9 char). I tried to look up what kind of transformation passes gcc did to optimize Ackermann function to a single call, but didn't find anything of interest. I will appreciate pointers on what major transformation passes gcc performed to convert the deeply recursive Ackermann to a single recursive call. LLVM gcc (I tried v4.2 on mac) doesn't reduce it to single recursive call yet, and is 4x slower with -O3 flag. This optimization seems very interesting.

score 7 · Accepted Answer

第一遍是尾调用消除。GCC 在大多数优化级别都会这样做。本质上，尾部位置的所有函数调用都转换为 goto，如下所示：

int ack(int m, int n) {
begin:
  if (m == 0) return n + 1;
  if (n == 0) { m -= 1; n = 1; goto begin; }
  n = ack(m, n-1); m -= 1; goto begin;
}

现在只剩下一个递归调用，而 GCC 仅在 -O3 级别将其内联了几次迭代。导致你看到的巨大怪物。

optimization - gcc transformation pass for ackermann

1 回答 1

Related

Reference