c - How can I quickly fill a buffer with a custom string?

Question

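My requirement is to write a file filled with a given pattern (3 bytes of char). The file can be any size; for example, if the file size is 10 bytes and the given pattern is "abc", the content will be "abcabcabca".

Now, memset does not work for multiple characters. My question is:

What is the fastest way to fill a buffer with such a string and then hand it to the write system call?

I can think of the following steps:

Open the file.
Fill a buffer with the pattern using a for loop (at least 1024 chars if no size is specified).
Write the buffer in a loop until the file size is reached.

I am not sure how to fill the buffer quickly with the given pattern.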

score 3 · Accepted Answer

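OP's algorithm is sound; it just needs to be implemented.

Any time spent writing to the buffer, whether with a loop, memcpy(), etc., is swamped by the file I/O time. So forming the buffer only needs modest optimization.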
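For reference, one common memcpy()-based way to form the buffer is to seed it with one copy of the pattern and then repeatedly double the filled region. The following is a minimal sketch, not something taken from the original post: fill_pattern is a hypothetical helper name, and any gain over a plain loop is likely to be small next to the I/O cost.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (name and signature are assumptions for illustration).
   Fills buf[0..size) with pat[0..plen) repeated; the final copy may be cut short. */
static void fill_pattern(char *buf, size_t size, const char *pat, size_t plen) {
  if (size == 0 || plen == 0) return;

  /* Seed the buffer with one (possibly truncated) copy of the pattern. */
  size_t filled = plen < size ? plen : size;
  memcpy(buf, pat, filled);

  /* Each pass copies the already-filled prefix behind itself, doubling it.
     filled stays a multiple of plen, so the pattern phase is preserved.
     Source and destination never overlap because n <= filled. */
  while (filled < size) {
    size_t remaining = size - filled;
    size_t n = filled < remaining ? filled : remaining;
    memcpy(buf + filled, buf, n);
    filled += n;
  }
}
```

Calling fill_pattern(buf, 10, "abc", 3) on a 10-byte buffer leaves it holding the bytes of "abcabcabca" (no terminating null character is written).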

This is actually a good opportunity to profile your code. Try timing two implementations like the two below and see how much the times differ.

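#include <stdio.h>

// Baseline: emit the pattern one character at a time through stdio.
int Fill_Basic(FILE *outf, size_t Length, char a, char b, char c) {
  while (Length > 0) {
    if (Length > 0) {
      Length--;
      fputc(a, outf);  // fputc() takes the character first, then the stream
      }
    if (Length > 0) {
      Length--;
      fputc(b, outf);
      }
    if (Length > 0) {
      Length--;
      fputc(c, outf);
      }
    }
  return ferror(outf);
  }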

int Fill_Faster(FILE *outf, size_t Length, char a, char b, char c) {
  // A trick is to provide fwrite() with a "nice" buffer size.
  // This is often a power of 2 and is highly platform dependent, but a reasonable assumption.
  // Profiling would help assess this.
  // Let's assume 1024
  size_t bsize = Length < 1024 ? Length : 1024;
  char buf[bsize + 2];  // Allocate (3-1) extra so a shifted start still has bsize bytes.
  const char pat[3] = {a, b, c};
  for (size_t i = 0; i < bsize + 2; i++) {
    buf[i] = pat[i % 3];
    }
  // Optimization: each time through the loop, provide the buffer shifted as needed
  //  1st time "abcabc..."
  //  2nd time "bcabca..."
  //  3rd time "cabcab..."
  //  4th time "abcabc..."
  size_t Offset = 0;
  while (Length > 0) {
    size_t t = Length < bsize ? Length : bsize;
    if (t != fwrite(&buf[Offset], 1, t, outf)) return 1;  // short write: report failure
    Length -= t;
    Offset = (Offset + t) % 3;  // phase of the pattern where the next chunk starts
    }
  return ferror(outf);
  }
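
To run the suggested comparison, something like the following could serve as a starting point. It is only a sketch under assumptions: time_fill and fill_bytewise are hypothetical names, fill_bytewise is a stand-in so the block compiles on its own, and the output goes to a tmpfile() rather than a real target file. In practice one would pass Fill_Basic and Fill_Faster instead.

```c
#include <stdio.h>
#include <time.h>

/* Signature shared by the fill functions discussed above. */
typedef int (*fill_fn)(FILE *outf, size_t length, char a, char b, char c);

/* Hypothetical stand-in filler so this sketch is self-contained. */
static int fill_bytewise(FILE *outf, size_t length, char a, char b, char c) {
  const char pat[3] = {a, b, c};
  for (size_t i = 0; i < length; i++) {
    fputc(pat[i % 3], outf);
  }
  return ferror(outf);
}

/* Runs fn once against a temporary file.
   Returns the processor time used in seconds, or -1.0 on any failure. */
static double time_fill(fill_fn fn, size_t length) {
  FILE *f = tmpfile();
  if (f == NULL) return -1.0;

  clock_t start = clock();
  int err = fn(f, length, 'a', 'b', 'c');
  if (fflush(f) != 0) err = 1;  /* include flushing stdio's buffer in the timing */
  clock_t stop = clock();

  fclose(f);
  return err ? -1.0 : (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling time_fill(Fill_Basic, n) and time_fill(Fill_Faster, n) for a large n, several times each, gives a rough comparison. Note that clock() reports processor time, so time spent waiting on the disk is not counted; a wall-clock timer such as POSIX clock_gettime() may be more representative for I/O-bound work.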
