我使用以下代码进行就地向前向后 FIR 过滤:
lena = len(a)
lenb = len(b)
convol = zeros(a.shape)
code = """
// Forward convolution
int pad = (lenb-1)/2;
int i, j;
for (i=pad; i<pad+lena; i++)
{
int kmin, kmax, k;
// Reverse indexing for the next pass
j = lena-1-i+pad;
convol(j) = 0;
kmin = (i >= lenb - 1) ? i - (lenb - 1) : 0;
kmax = (i < lena - 1) ? i : lena - 1;
for (k = kmin; k <= kmax; k++)
{
convol(j) += a(k)*b(i - k);
}
}
// Backward convolution (the signal in convol has been
// reversed using reversed indexes)
for (i=pad; i<pad+lena; i++)
{
int kmin, kmax, k;
// Reverse indexing for reordering the output vector
j = lena-1-i+pad;
a(j) = 0;
kmin = (i >= lenb - 1) ? i - (lenb - 1) : 0;
kmax = (i < lena - 1) ? i : lena - 1;
for (k = kmin; k <= kmax; k++)
{
a(j) += convol(k)*b(i - k);
}
}
return_val = 1;
"""
weave.inline(code, [ 'a', 'b', 'lena', 'lenb', 'convol'],
type_converters=converters.blitz, compiler = 'g++')
当然,“convol”变量不需要在 C/C++ 范围之外看到,我需要空间和处理优化。因此,将这段代码替换为以下代码是有意义的(至少对我而言):
lena = len(a)
lenb = len(b)
code = """
// Forward convolution
int pad = (lenb-1)/2;
int i, j;
float* convol = (float*) calloc(lena, sizeof(float));
for (i=pad; i<pad+lena; i++)
{
int kmin, kmax, k;
// Reverse indexing for the next pass
j = lena-1-i+pad;
convol[j] = 0;
kmin = (i >= lenb - 1) ? i - (lenb - 1) : 0;
kmax = (i < lena - 1) ? i : lena - 1;
for (k = kmin; k <= kmax; k++)
{
convol[j] += a(k)*b(i - k);
}
}
// Backward convolution (the signal in convol has been
// reversed using reversed indexes)
for (i=pad; i<pad+lena; i++)
{
int kmin, kmax, k;
// Reverse indexing for reordering the output vector
j = lena-1-i+pad;
a(j) = 0;
kmin = (i >= lenb - 1) ? i - (lenb - 1) : 0;
kmax = (i < lena - 1) ? i : lena - 1;
for (k = kmin; k <= kmax; k++)
{
a(j) += convol[k]*b(i - k);
}
}
free(convol);
return_val = 1;
"""
weave.inline(code, [ 'a', 'b', 'lena', 'lenb'],
type_converters=converters.blitz, compiler = 'g++')
唯一的区别是我没有使用 numpy 数组,而是直接使用了 C 浮点数组。问题是第二个代码的处理时间是第一个代码的 2 倍……这是为什么呢?第二个版本有问题吗?应该更快……!