In class we were given a simple loop that was supposed to vectorize. That went fine, but we ran into something odd. Consider this code:
#include<stdio.h>

void func(int N, double *a, double *b, double *c, double *d) {
  int i;
  #pragma ivdep
  for ( i=0; i<N; i++ ) {
    d[i] = c[i+1];
  }
  #pragma ivdep
  for ( i=0; i<N; i++ ) {
    a[i] = b[i];
    c[i] = a[i] + b[i];
  }
}
Here is the output from ICC (command icc -O2 -vec-report3 -c example.c, version 13.0.1):
example.c(6): (col. 3) remark: LOOP WAS VECTORIZED.
example.c(6): (col. 3) remark: loop was not vectorized: not inner loop.
example.c(10): (col. 3) remark: LOOP WAS VECTORIZED.
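I am not fluent enough in assembly to read the -S dump, so I don't know what the compiler actually did; but since I see no reason not to vectorize the first loop, I assume it did.
What is the reason for these contradictory messages?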
On the open-source side, GCC 4.5.4 (command gcc -O3 -ftree-vectorizer-verbose=1 -c example.c) vectorizes both loops. GCC 4.6.4, on the other hand, prints:
example.c:10: note: created 3 versioning for alias checks.
example.c:10: note: LOOP VECTORIZED.
example.c:3: note: vectorized 1 loops in function.
GCC 4.8.0 is more verbose:
Analyzing loop at example.c:10
Vectorizing loop at example.c:10
example.c:10: note: create runtime check for data references *_24 and *_21
example.c:10: note: create runtime check for data references *_24 and *_27
example.c:10: note: create runtime check for data references *_21 and *_27
example.c:10: note: created 3 versioning for alias checks.
example.c:10: note: === vect_do_peeling_for_loop_bound ===Setting upper bound of nb iterations for epilogue loop to 0
example.c:10: note: LOOP VECTORIZED.
Analyzing loop at example.c:6
example.c:3: note: vectorized 1 loops in function.
example.c:10: note: Turned loop into non-loop; it never loops.
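Neither version reports anything about the first loop, but 4.8.0 seems to contradict itself on the second loop: it says the loop was vectorized and also that it "never loops".
What is going on here?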
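For readers unfamiliar with the "versioning for alias checks" notes in the GCC output: as far as I understand it, the compiler keeps two copies of the loop and picks one at run time depending on whether the arrays overlap. The following is a hand-written sketch of that idea for the second loop, not what GCC actually emits. The names disjoint and second_loop_versioned are hypothetical, and I am only assuming that the three data references in the log correspond to a, b, and c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the n-element ranges starting at
   p and q do not overlap in memory, 0 otherwise. */
static int disjoint(const double *p, const double *q, int n) {
    uintptr_t lo_p = (uintptr_t)p;
    uintptr_t lo_q = (uintptr_t)q;
    uintptr_t bytes = (uintptr_t)n * sizeof(double);
    return lo_p + bytes <= lo_q || lo_q + bytes <= lo_p;
}

/* Conceptual picture of the second loop after versioning.
   Returns 1 if the alias-free copy ran, 0 if the fallback copy ran.
   The return value exists only to make the choice observable here. */
static int second_loop_versioned(int n, double *a, double *b, double *c) {
    int i;
    assert(n >= 0);
    /* Three pairwise run-time checks, one per pair of arrays. */
    if (disjoint(a, b, n) && disjoint(a, c, n) && disjoint(b, c, n)) {
        /* No overlap: this copy is the one a compiler could vectorize. */
        for (i = 0; i < n; i++) {
            a[i] = b[i];
            c[i] = a[i] + b[i];
        }
        return 1;
    }
    /* Possible overlap: keep the original element-by-element order. */
    for (i = 0; i < n; i++) {
        a[i] = b[i];
        c[i] = a[i] + b[i];
    }
    return 0;
}
```

Calling second_loop_versioned with three separate arrays takes the first branch; passing overlapping pointers into one buffer takes the fallback.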