In matrix multiplication, we do something like this:
/* Fill A and B with random values (row-major, N x N). */
for (i = 0; i < N; i = i + 1)
    for (j = 0; j < N; j = j + 1)
        A[i*N + j] = (double) random() / SOME_NUMBER;
for (i = 0; i < N; i = i + 1)
    for (j = 0; j < N; j = j + 1)
        B[i*N + j] = (double) random() / SOME_NUMBER;
/* C = C + A*B; C is assumed to start out zeroed. */
for (i = 0; i < N; i = i + 1)
    for (j = 0; j < N; j = j + 1)
        for (k = 0; k < N; k = k + 1)
            C[i*N + j] = C[i*N + j] + A[i*N + k]*B[k*N + j];
How can we increase data locality to optimize the multiplication loop?
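
For reference, two common ways to attack this are loop interchange and loop tiling (blocking). In the i-j-k order above, the inner loop reads B[k*N + j] with a stride of N doubles, so nearly every access to B touches a different cache line. The sketch below is one possible approach, not a definitive answer: the function names matmul_ikj and matmul_tiled and the tile size BLOCK are hypothetical, BLOCK = 32 is a guess that would need tuning for a real cache, and both functions assume the caller has zeroed C, as the original loop does.

```c
#include <stddef.h>

/* Hypothetical tile edge (an assumption): pick it so that three
   BLOCK x BLOCK tiles of doubles fit in the cache level you target. */
#define BLOCK 32

/* Loop interchange (i-k-j order). The inner loop now walks both
   B[k*n + j] and C[i*n + j] with stride 1, and A[i*n + k] is
   invariant in the inner loop. C must be zeroed by the caller. */
void matmul_ikj(size_t n, const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t k = 0; k < n; k++) {
            double a = A[i*n + k];
            for (size_t j = 0; j < n; j++)
                C[i*n + j] += a * B[k*n + j];
        }
}

/* Loop tiling on top of the i-k-j order. Each (ii, kk, jj) step works
   on one tile of A, B and C, so the data is reused while it is still
   in cache. Handles n that is not a multiple of BLOCK.
   C must be zeroed by the caller. */
void matmul_tiled(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = (ii + BLOCK < n) ? ii + BLOCK : n;
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t k_end = (kk + BLOCK < n) ? kk + BLOCK : n;
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t j_end = (jj + BLOCK < n) ? jj + BLOCK : n;
                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        double a = A[i*n + k];
                        for (size_t j = jj; j < j_end; j++)
                            C[i*n + j] += a * B[k*n + j];
                    }
            }
        }
    }
}
```

Either function would be called as matmul_ikj(N, A, B, C) in place of the triple loop. Interchange alone removes the strided access to B; tiling mainly pays off once the matrices no longer fit in cache. Both reorder the floating-point additions, so results may differ from the original loop in the last bits.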