我正在编写 c++ 代码以在 ARM cortex a9 CPU 上运行。我的代码链接到使用软浮点编译的封闭源 3rd 方库。我正在运行一个 cortex-a9 ARM cpu。
我注意到如果我使用 gcc 标志-mfloat-abi=softfp 编译我的代码,它的运行速度要比使用 -mfloat-abi=hard 编译快得多。
我认为硬浮动应该总是更快。是否有意义?
如何优化这些库调用?
谢谢!
一些注意事项:
- 库接口仅由整数、字符串和指针构成,并且工作正常。
- 加速大约是 x8,有利于 softfp。
- 关于二进制文件的 readelf 平台相关信息:
第三方库:
readelf -hA libXXX.so
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: ARM
Version: 0x1
Entry point address: 0x13780
Start of program headers: 52 (bytes into file)
Start of section headers: 1617724 (bytes into file)
Flags: 0x4000002, has entry point, Version4 EABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 7
Size of section headers: 40 (bytes)
Number of section headers: 28
Section header string table index: 27
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "ARM9TDMI"
Tag_CPU_arch: v4T
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-1
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align8_needed: Yes
Tag_ABI_align8_preserved: Yes, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_optimization_goals: Aggressive Speed
我的二进制文件:
readelf -hA XXX
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0x1b0d4
Start of program headers: 52 (bytes into file)
Start of section headers: 1392964 (bytes into file)
Flags: 0x5000002, has entry point, Version5 EABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 8
Size of section headers: 40 (bytes)
Number of section headers: 38
Section header string table index: 35
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "Cortex-A9"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_VFP_arch: VFPv3
Tag_NEON_arch: NEONv1
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align8_needed: Yes
Tag_ABI_align8_preserved: Yes, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_HardFP_use: SP and DP
Tag_ABI_VFP_args: VFP registers
Tag_unknown_34: 1 (0x1)
Tag_unknown_42: 1 (0x1)
Tag_unknown_44: 1 (0x1)
Tag_unknown_68: 1 (0x1)