I am implementing a filter and I need to optimise the implementation as much as possible. I have noticed that one statement takes a lot of cycles, and I do not understand why:

bool filters_apply(...)
{
   short sSample;
   double dSample;
   ...
   ...
   sSample = (short) dSample;   //needs a lot of cycles to execute
   ...
   ...
}

I am using the GCC options -mcpu=arm926ej-s -mfloat-abi=softfp -mfpu=vfp. I have tried compiling with the "hard" float ABI to see whether it makes a difference, but the compiler does not support it.

Could anyone explain why that statement needs so many cycles?

Thanks a lot!!

1 Answer

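Going only by the information you have provided, this is probably caused by a stall when the data is transferred from a floating-point register to an ARM register.

This Debian page on ARM floating-point modes claims that such an operation can take around 20 cycles.

Try to use floating-point variables as much as possible, for example by making sSample a float. Your arm926ej-s (VFPv2) should provide 32 single-precision (16 double-precision) registers.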
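
As a rough illustration of that advice, here is a minimal sketch that keeps every intermediate value in float and converts to short only once, when the final result is stored. The names lp_stage, lp_step, to_short_sat and filter_block are hypothetical, and the one-pole low-pass stage is just a stand-in for whatever the real filter does. It also assumes single precision is accurate enough for the filter. Whether this actually removes the stall depends on the compiler and the surrounding code, so it is worth measuring.

```c
#include <stddef.h>

/* Hypothetical one-pole low-pass stage; state and samples stay in float. */
typedef struct {
    float a;   /* smoothing coefficient, 0 < a <= 1 */
    float y;   /* previous output (filter state)    */
} lp_stage;

static float lp_step(lp_stage *s, float x)
{
    s->y += s->a * (x - s->y);
    return s->y;
}

/* Convert to short with saturation. Casting an out-of-range
   floating-point value to short is undefined behaviour in C,
   so the value is clamped first. The cast truncates toward zero. */
static short to_short_sat(float x)
{
    if (x >= 32767.0f)  return 32767;
    if (x <= -32768.0f) return -32768;
    return (short)x;
}

/* Run two stages over a block of samples. The only float-to-integer
   conversion per sample happens when the final result is stored. */
void filter_block(lp_stage *s1, lp_stage *s2,
                  const float *in, short *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        float v = lp_step(s1, in[i]);
        v = lp_step(s2, v);
        out[i] = to_short_sat(v);
    }
}
```

The point of the sketch is only the data flow: no intermediate result is stored in a short (or any integer type) between stages, so the value does not have to leave the VFP registers until the very end.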

answered 2013-06-13T12:19:54.577