c - Speed of printf()

Question

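I was having some fun in C with the time.h library, trying to measure the number of clock ticks taken by some basic functions, just to get a feel for how fast they actually are. I used the clock() function. In this case I was measuring printf().

Here is my program: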

#include <stdio.h>
#include <time.h>

int main(void)
{
    const int LIMIT = 2000;
    const int LOOP = 20;
    int results[LOOP];

    for(int i=0; i<LOOP; i++)
    {
        int j;
        clock_t time01 = clock();

        for(j=1; j<LIMIT; j++)
        {
            printf("a");
        }

        clock_t time02 = clock();
        results[i] = (int) (time02 - time01);
    }

    for(int i=0; i<LOOP; i++)
    {
        printf("\nCLOCK TIME: %d.", results[i]);        
    }
    getchar();
}

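The program basically just counts the clock ticks taken by 2000 calls to printf("a"), and repeats that measurement 20 times.

The strange thing I don't understand is the results. Most of the time, even when running other tests, I randomly get only two distinct values: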

CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.
CLOCK TIME: 47.
CLOCK TIME: 31.

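I don't understand how exactly the compiler handles this function. I guess there is some check for the % character, but that shouldn't make a difference. It looks more like the compiler is doing something in memory... (?) Does anyone know exactly what happens when this code is compiled, or why the differences shown above appear? Or at least have some links that would help me?

Thank you.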

score 2 · Accepted Answer

I can think of at least two possible causes:

Your clock has limited resolution.
printf occasionally flushes its buffer.

score 1 · Accepted Answer

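Some compilers (in particular recent versions of gcc on recent Linux distributions, when optimizing with -O2) are able to optimize printf("a") into something very similar to putchar().

But most of the time is spent in the kernel, performing the write system call.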

score -1 · Accepted Answer

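The man page for clock says that it returns

an approximation of processor time used by the program

This approximation is based on the well-known Time Stamp Counter. As Wikipedia puts it:

It counts the number of cycles since reset

Sadly, nowadays this counter can differ from one core to another.

There is no promise that the timestamp counters of multiple CPUs on a single motherboard will be synchronized.

So be careful to pin your code to a single CPU, otherwise you will keep getting strange results. And since you seem to be after precise results, you can use this code instead of the clock call: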

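  #include <stdint.h>

  /* Reads the x86-64 Time Stamp Counter (GCC-style inline assembly). */
  uint64_t rdtsc(void) {
    uint32_t lo, hi;
    /* cpuid serializes execution so earlier instructions finish before rdtsc */
    __asm__ __volatile__ (      // serialize
    "xorl %%eax,%%eax \n        cpuid"
    ::: "%rax", "%rbx", "%rcx", "%rdx");
    /* We cannot use "=A", since this would use %rax on x86_64 and return only the lower 32bits of the TSC */
    __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
    /* Combine the two 32-bit halves into the full 64-bit counter value */
    return (uint64_t)hi << 32 | lo;
  }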
