c - Linking hard float to softfp: poor performance

Question

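I'm writing C++ code to run on an ARM Cortex-A9 CPU. My code links against closed-source third-party libraries that were compiled with soft float.

I noticed that if I compile my code with the gcc flag -mfloat-abi=softfp, it runs significantly faster than if I compile it with -mfloat-abi=hard.

I thought hard float should always be faster. Does this make sense?

How can I optimize these library calls?

Thanks!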

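Some notes:

The library interface consists only of integers, strings, and pointers, and it works correctly.
The speedup is roughly 8x in favor of softfp.
Platform-related readelf information for the binaries:

Third-party library: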

readelf -hA libXXX.so
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x13780
  Start of program headers:          52 (bytes into file)
  Start of section headers:          1617724 (bytes into file)
  Flags:                             0x4000002, has entry point, Version4 EABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         7
  Size of section headers:           40 (bytes)
  Number of section headers:         28
  Section header string table index: 27
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "ARM9TDMI"
  Tag_CPU_arch: v4T
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-1
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align8_needed: Yes
  Tag_ABI_align8_preserved: Yes, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_optimization_goals: Aggressive Speed

My binary:

readelf -hA XXX
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x1b0d4
  Start of program headers:          52 (bytes into file)
  Start of section headers:          1392964 (bytes into file)
  Flags:                             0x5000002, has entry point, Version5 EABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         8
  Size of section headers:           40 (bytes)
  Number of section headers:         38
  Section header string table index: 35
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "Cortex-A9"
  Tag_CPU_arch: v7
  Tag_CPU_arch_profile: Application
  Tag_ARM_ISA_use: Yes
  Tag_THUMB_ISA_use: Thumb-2
  Tag_VFP_arch: VFPv3
  Tag_NEON_arch: NEONv1
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Needed
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align8_needed: Yes
  Tag_ABI_align8_preserved: Yes, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_HardFP_use: SP and DP
  Tag_ABI_VFP_args: VFP registers
  Tag_unknown_34: 1 (0x1)
  Tag_unknown_42: 1 (0x1)
  Tag_unknown_44: 1 (0x1)
  Tag_unknown_68: 1 (0x1)

score 0 · Accepted Answer

The two ABIs selected by -mfloat-abi=softfp and -mfloat-abi=hard are not compatible. You can't mix and match.

Normally, you can't even run a softfp process on a hardfp system unless all the libraries are duplicated in a separate lib directory (i.e. "multiarch").

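If your code happens not to use float or double types in function parameters, you might find that it does actually work, but you still shouldn't do it; you're playing with fire.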
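To make the point about parameter types concrete, here is a minimal sketch. The function names are hypothetical and are not taken from the library in the question. Under the ARM procedure call standard, integer and pointer arguments travel in core registers (r0-r3) for both soft/softfp and hard, while a float argument travels in a core register under soft/softfp but in a VFP register (s0) under hard. The same applies to float and double return values.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical library-style entry points, for illustration only. */

/* Integer and pointer arguments: passed in core registers (r0-r3) under
 * both soft/softfp and hard, so caller and callee agree on the layout. */
int count_bytes(const char *text, int limit)
{
    int n = (int)strlen(text);
    return n < limit ? n : limit;
}

/* Float arguments and return value: core registers under soft/softfp,
 * VFP registers (s0, s1) under hard. A caller and a callee built with
 * different -mfloat-abi values would look in different places. */
float scale_value(float value, float factor)
{
    return value * factor;
}
```

When both sides are built with the same ABI, `count_bytes("hello", 16)` and `scale_value(2.0f, 4.0f)` behave as expected. Only the second kind of signature is at risk across an ABI mismatch, which is presumably why an interface made of integers, strings, and pointers appears to work.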

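In any case, if your code is entirely integer-based, these options have no effect on the generated code, so the performance change must come from somewhere else. It's possible that the compiler you are using inadvertently selects a different multilib, or a different CPU, when you specify the -mfloat-abi option (GCC, in particular, has a habit of switching back to the default multilib). Possibly you're accidentally turning on NEON, or switching the tuning between A8 and A9?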
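One way to check what each build actually selected is to compile a small probe with exactly the same flags as the real code and compare the results. The sketch below assumes GCC-style predefined macros (`__ARM_PCS_VFP`, `__SOFTFP__`, `__ARM_NEON`, `__ARM_ARCH`); the function names are my own, and other compilers may define different macros.

```c
#include <stdio.h>

/* Label for the float ABI this translation unit was built with.
 * Assumption: GCC-style predefined macros on a 32-bit ARM target. */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not a 32-bit ARM target";
#elif defined(__ARM_PCS_VFP)
    return "hard";      /* float arguments passed in VFP registers */
#elif defined(__SOFTFP__)
    return "soft";      /* no FP instructions, library emulation */
#else
    return "softfp";    /* FP instructions, soft-float calling convention */
#endif
}

/* 1 if NEON code generation is enabled for this build, else 0. */
int neon_enabled(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Print the configuration so two builds can be compared side by side. */
void print_build_config(void)
{
    printf("float ABI: %s\n", float_abi_name());
    printf("NEON:      %s\n", neon_enabled() ? "yes" : "no");
#ifdef __ARM_ARCH
    printf("ARM arch:  %d\n", __ARM_ARCH);
#endif
}
```

Calling `print_build_config()` from a test program built once with -mfloat-abi=softfp and once with -mfloat-abi=hard should show whether anything besides the float ABI differs. With GCC, running the compiler with the same flags plus `-print-multi-directory`, or `-dM -E - < /dev/null` to dump all predefined macros, gives a similar view without writing any code.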
