When I compile a given file with ICC using the -opt-report or -vec-report options, I get the following messages:
foo.c(226:7-226:7):VEC:function_foo: loop was not vectorized: subscript too complex
foo.c(226): (col. 7) warning #13379: loop was not vectorized with "simd"
vectorization support: call to function absorbing_apply cannot be vectorized
loop was not vectorized: not inner loop
loop was not vectorized: unsupported loop structure
loop was not vectorized: subscript too complex
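I know what these messages mean. What worries me is that foo.c:226 contains no loop at all. In fact, all that line contains is a call to another function. That function does contain some loops that run over a volume, and icc reports that they are vectorized correctly. However, every call to that function produces the same kind of messages as the ones I pasted.
Is icc getting confused and emitting vectorization messages for a place with no loops at all? Or have I misunderstood something?
Edit: I have partially reproduced the problem. This time the compiler says it vectorized a line of code that only calls another function (the original case was the other way around: it said it could not). Here is the code: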
 1  #include <stdlib.h>  /* malloc, free */
 2
 3  void foo(float *a, float *b, float *c, int n1, int n2, int n3, int ini3, int end3 ) {
 4      int i, j, k;
 5
 6      for( i = ini3; i < end3; i++ ) {
 7          for( j = 0; j < n2; j++ ) {
 8              #pragma simd
 9              #pragma ivdep
10              for( k = 0; k < 4; k ++ ) {
11                  int index = k + j*n1 + i*n1*n2;
12                  a[index] = b[index] + 2* c[index];
13              }
14          }
15      }
16
17      for( i = ini3; i < end3; i++ ) {
18          for( j = 0; j < n2; j++ ) {
19              #pragma simd
20              #pragma ivdep
21              for( k = n1-4; k < n1; k ++ ) {
22                  int index = k + j*n1 + i*n1*n2;
23                  a[index] = b[index] + 2* c[index];
24              }
25          }
26      }
27
28      return;
29  }
30  int main(void){
31      int n1, n2, n3;
32      int ini3 = 20;
33      int end3 = 30;
34      n1 = n2 = n3 = 200;
35
36      float *a = malloc( n1 * n2 * n3 * sizeof(float ));
37      float *b = malloc( n1 * n2 * n3 * sizeof(float ));
38      float *c = malloc( n1 * n2 * n3 * sizeof(float ));
39
40      foo( a,b,c, n1, n2, n3, ini3, end3 );
41
42      ini3 += 50;
43      end3 += 50;
44
45      foo( a,b,c, n1, n2, n3, ini3, end3 );
46
47      free(a); free(b); free(c);
48
49      return 0;
50  }
51
And here is the optimization report, in which ICC says it vectorized lines 40 and 45:
foo.c(40:4-40:4):VEC:main: LOOP WAS VECTORIZED
loop was not vectorized: not inner loop
loop was not vectorized: not inner loop
LOOP WAS VECTORIZED
loop was not vectorized: not inner loop
loop was not vectorized: not inner loop
foo.c(45:4-45:4):VEC:main: LOOP WAS VECTORIZED
loop was not vectorized: not inner loop
loop was not vectorized: not inner loop
Is this normal?
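One possible explanation, which I have not confirmed, is that the compiler inlines foo into main, so the messages for the loops inside foo get attributed to the call sites at lines 40 and 45. A minimal sketch to test that guess is below. It assumes the compiler accepts the GCC-style `__attribute__((noinline))`; the name `foo_noinline` is hypothetical, and the function only reproduces the first loop nest of foo with the unused n3 parameter dropped.

```c
#include <stddef.h>

/* Assumption: the compiler honors the GCC-style noinline attribute.
 * foo_noinline is a hypothetical name used only for this experiment. */
__attribute__((noinline))
void foo_noinline(float *a, const float *b, const float *c,
                  int n1, int n2, int ini3, int end3)
{
    int i, j, k;

    /* Same first loop nest as foo: update the first 4 elements of each row. */
    for (i = ini3; i < end3; i++) {
        for (j = 0; j < n2; j++) {
            for (k = 0; k < 4; k++) {
                int index = k + j * n1 + i * n1 * n2;
                a[index] = b[index] + 2 * c[index];
            }
        }
    }
}
```

If the guess is right, calling this variant from main should make the vectorization messages show up at the loop lines inside the function rather than at the call lines. If they still appear at the call lines, inlining is not the cause.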