c - In an embedded MCU application, is it better to use uint_fast16_t or size_t in a for loop?

Question

I want to write portable code for an application that will run on different MCUs (16-bit, 32-bit, or 64-bit based):

MSP-430
nRF52 (32-bit)
PIC (16-bit)
C51 (8-bit)

Let's consider this snippet:

events = 0;
for (size_t i = 0; i < sizeof(array) / sizeof(array[0]); i++) {
    if (array[i] > threshold) 
        events++;
}

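My question concerns the type of the loop counter variable, here size_t.

Typically size_t should be large enough to address all the memory of my system. So using size_t might hurt the performance of my code on some architectures, because this variable is wider than the length of my array requires.

Under that assumption I would be better off using uint_fast16_t, because I know that my array has fewer than 65k elements.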
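One way to see how the two types compare on a given target is to look at their storage widths. The following is a minimal sketch, assuming a C99 toolchain that provides <stdint.h>; the helper names size_t_bits and fast16_bits are illustrative and not part of the original snippet.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint_fast16_t */

/* Storage width in bits of size_t on the current target. */
unsigned size_t_bits(void) {
    return (unsigned)(sizeof(size_t) * CHAR_BIT);
}

/* Storage width in bits of uint_fast16_t; the standard guarantees
   that this type can hold at least 16 bits. */
unsigned fast16_bits(void) {
    return (unsigned)(sizeof(uint_fast16_t) * CHAR_BIT);
}
```

If the two widths are equal on a target, the choice between the types cannot affect the generated loop there. The interesting targets are those where size_t is wider than uint_fast16_t.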

Does it make sense to care about this, or is my compiler smart enough to optimize it?

I think of this as uint_fast16_t versus size_t.

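To be more specific about my question:

Do I improve the portability of my code by systematically using the proper type for my loop counter (uint_fast8_t, uint_fast16_t, ...), or should I prefer size_t because in most cases it won't make any difference in performance?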

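Edit

Following your comments and remarks, it is clear that most of the time the compiler keeps the loop counter in a register, so choosing between size_t and uint_fast8_t does not matter.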

https://godbolt.org/g/pbPCrf

main: # @main
  mov rax, -80
  mov ecx, dword ptr [rip + threshold]
.LBB0_1: # =>This Inner Loop Header: Depth=1
  [....]
.LBB0_5: # in Loop: Header=BB0_1 Depth=1
  add rax, 8     # <----------- Kept in a register
  jne .LBB0_1
  jmp .LBB0_6
.LBB0_2: # in Loop: Header=BB0_1 Depth=1
  [....]
.LBB0_6:
  xor eax, eax
  ret

This could become a real issue if the loop count exceeds what an internal CPU register can hold, for example running 512 iterations on an 8-bit microcontroller.
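To make the 512-iteration case concrete: an 8-bit counter wraps at 256, so counting to 512 needs at least 9 bits, which an 8-bit core has to spread over more than one register. Here is a small sketch of the wraparound, assuming uint8_t exists on the target; wrap8 is an illustrative helper, not code from the project.

```c
#include <stdint.h>

/* Returns the value an 8-bit counter holds after `steps` increments.
   Unsigned arithmetic wraps modulo 256, so the result is steps % 256. */
uint8_t wrap8(unsigned steps) {
    uint8_t i = 0;
    while (steps--)
        i++;
    return i;
}
```

Since wrap8(512) comes back to 0, a loop written as `for (uint8_t i = 0; i < 512; i++)` would never terminate. The counter has to be a 16-bit type even on an 8-bit core, and that is where the extra cost appears.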

score 1 · Accepted Answer

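For portable code, use size_t.

For fast code... well, it depends on your compiler and processor. If you use a 16-bit type, it may run fastest on a 16-bit processor, but actually be slower than size_t on a 64-bit processor. You shouldn't assume anything before measuring performance.

I would use size_t and only consider further optimization if there is a demonstrable performance problem.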
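A minimal sketch of such a measurement, assuming a hosted C99 environment where clock() is available (on a bare-metal MCU you would read a hardware timer or cycle counter instead). The names DEFINE_COUNT, count_size_t, count_fast16 and time_it are illustrative, not from the question.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Generates the question's counting loop for a given index type.
   The caller must make sure n fits in INDEX_T, otherwise the loop
   would never terminate. */
#define DEFINE_COUNT(NAME, INDEX_T)                              \
    unsigned NAME(const int *a, size_t n, int threshold) {       \
        unsigned events = 0;                                     \
        for (INDEX_T i = 0; i < n; i++) {                        \
            if (a[i] > threshold)                                \
                events++;                                        \
        }                                                        \
        return events;                                           \
    }

DEFINE_COUNT(count_size_t, size_t)
DEFINE_COUNT(count_fast16, uint_fast16_t)

/* CPU time, in clock() ticks, spent on `reps` calls of f. */
clock_t time_it(unsigned (*f)(const int *, size_t, int),
                const int *a, size_t n, int threshold, unsigned reps) {
    volatile unsigned sink = 0;  /* keeps the calls from being optimized away */
    clock_t start = clock();
    while (reps--)
        sink += f(a, n, threshold);
    return clock() - start;
}
```

Calling time_it once with count_size_t and once with count_fast16 on the same data gives two numbers to compare. If they are indistinguishable on your targets, there is nothing to optimize.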

score 0 · Accepted Answer

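For MCUs, use the smallest type that you know will fit the array size. If you know that the array may be larger than 256 bytes, but never larger than 65536, then uint_fast16_t is indeed the most appropriate type.

Your main problem is 16-bit MCUs with extended flash (>64 KB), which gives 24- or 32-bit address widths. This is standard on many 16-bitters. On such systems, size_t will be 32 bits and therefore slow to use.

If portability to 8- or 16-bitters is not a concern, then I would use size_t.

In my experience, many embedded compilers are not smart enough to optimize code to use a smaller type than the one stated, even when they could deduce the array size at compile time.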
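If you do pick a narrower counter, it can help to have the build fail when the array outgrows it. Below is a minimal sketch, assuming a compiler with C11 _Static_assert (many older embedded toolchains lack it and need a macro substitute); ARRAY_LEN, samples and count_above are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static int samples[300];  /* hypothetical buffer */

/* uint_fast16_t is only guaranteed to cover 0..65535, so refuse to
   compile if the element count ever exceeds that range. */
_Static_assert(ARRAY_LEN(samples) <= 65535u,
               "samples has too many elements for a 16-bit index");

/* Counts the elements of samples that are greater than threshold. */
unsigned count_above(int threshold) {
    unsigned events = 0;
    for (uint_fast16_t i = 0; i < ARRAY_LEN(samples); i++) {
        if (samples[i] > threshold)
            events++;
    }
    return events;
}
```

The assertion costs nothing at run time and documents the assumption behind the narrower type right next to the array it applies to.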

score 0 · Accepted Answer

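As with any optimization, first write simple code for the portable common case (using size_t). Then look at the assembly on your platform with other types. If one of the types works faster or produces significantly smaller code, you can define a special index type for those accesses. For example, you could use the concept of near, far, and huge pointers (and corresponding indexes), but use fixed-width types for clarity.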

#include <stddef.h> /* size_t, ptrdiff_t */
#include <stdint.h> /* uint16_t, int16_t, ... */

/* The compiler for my16bitmcu cannot detect ranges to use appropriate types */
#if defined __MY16BITMCU__ /* replace with architecture's predefined macro */
  typedef uint16_t size8t, size16t; /* use uint8_t for size8t on 8bit */
  typedef uint32_t size32t;
  typedef uint64_t size64t;
  typedef int16_t ptrdiff8t, ptrdiff16t; /* use int8_t for ptrdiff8t on 8bit */
  typedef int32_t ptrdiff32t;
  typedef int64_t ptrdiff64t;
#else
  typedef size_t size8t, size16t, size32t, size64t;
  typedef ptrdiff_t ptrdiff8t, ptrdiff16t, ptrdiff32t, ptrdiff64t;
#endif

/** example usage: sum the total of an array
 ** using size8t for count reduces complexity on some 8/16 bit systems
 ** on other systems, size8t is the same as size_t
 **/
int sum_numbers(int *numbers, size8t count){
  int total = 0;
  while(count--) total += numbers[count];
  return total;
}
