
It is said that, for concatenation, char[] performs better than StringBuilder, and StringBuilder performs better than string.

In my tests, there is no significant difference between using StringBuilder and string inside the loop. In fact, char[] is the slowest.

I am testing against the same table, which has 44 columns and 130,000 rows; the query is select * from test

Can someone help me see whether I am doing something wrong?

Here is the code, with the measured duration of each call:

//fetchByString(rd, fldCnt, delimiter, sw);            // duration: 3 seconds

//fetchByBuilder(rd, fldCnt, delimiter, sw, rsize);    // duration: 3 seconds

//fetchByCharArray(rd, fldCnt, delimiter, sw, rsize);  // duration: 7 seconds

private void fetchByString(OracleDataReader pReader, int pFldCnt, string pDelimiter, StreamWriter pWriter)
{
  while (pReader.Read())
  {
    string[] s = new string[pFldCnt];
    for (Int32 j = 0; j < pFldCnt; j++)
    {
      if (pReader.IsDBNull(j))
      {
        s[j] = "";
      }
      else
      {
        s[j] = pReader.GetValue(j).ToString();          // correct value
      }
    }
    pWriter.WriteLine(string.Join(pDelimiter, s));      
  }
}
private void fetchByBuilder(OracleDataReader pReader, int pFldCnt, string pDelimiter, StreamWriter pWriter, int pRowSzie)
{
  StringBuilder sb = new StringBuilder(pRowSzie);
  while (pReader.Read())
  {
    for (Int32 j = 0; j < pFldCnt; j++)
    {
      if (pReader.IsDBNull(j))
      {
        //sb.Append("");
        sb.Append(pDelimiter);
      }
      else
      {
        sb.Append(pReader.GetValue(j).ToString());          // correct value
        sb.Append(pDelimiter);
      }
    }
    pWriter.WriteLine(sb.ToString());
    sb.Clear();
  }
}
private void fetchByCharArray(OracleDataReader pReader, int pFldCnt, string pDelimiter, StreamWriter pWriter, int pRowSzie)
{
  char[] rowArray;
  int sofar; 
  while (pReader.Read())
  {
    rowArray = new char[pRowSzie];
    sofar = 0;
    for (Int32 j = 0; j < pFldCnt; j++)
    {
      if (pReader.IsDBNull(j))
      {
        pDelimiter.CopyTo(0, rowArray, sofar, pDelimiter.Length);
        sofar += pDelimiter.Length;
      }
      else
      {
        pReader.GetValue(j).ToString().CopyTo(0, rowArray, sofar, pReader.GetValue(j).ToString().Length);
        sofar += pReader.GetValue(j).ToString().Length;
        pDelimiter.CopyTo(0, rowArray, sofar, pDelimiter.Length);
        sofar += pDelimiter.Length;
      }
    }
    string a = new string(rowArray).TrimEnd('\0');
    pWriter.WriteLine(a);
  }
}

2 Answers


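StringBuilder is preferred over string concatenation because concatenation often has to allocate a temporary intermediate copy of the data for every + operator, which quickly eats a lot of memory and requires copying the data multiple times. StringBuilder.Append() is optimized internally to avoid copying or allocating the sub-segments multiple times. All of the work happens in StringBuilder.ToString, when the final size of the output string is known and can be allocated in one call.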

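Your test case does not use string concatenation. You assign a bunch of string fragments to a string array and then call String.Join. That is basically what StringBuilder does internally. Even if you eliminated the data I/O overhead that probably dominates your benchmark time, I would still expect String.Join() and StringBuilder.ToString() to yield similar performance.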
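To make the distinction concrete, here is a minimal sketch of the three ways of building one delimited row. The class and method names (`RowFormatting`, `ByConcat`, `ByBuilder`, `ByJoin`) are made up for illustration and do not appear in the question; only `ByJoin` corresponds to what `fetchByString` actually does.

```csharp
using System;
using System.Text;

public static class RowFormatting
{
    // Naive concatenation: every += allocates a new string and
    // copies everything accumulated so far into it.
    public static string ByConcat(string[] fields, string delimiter)
    {
        string row = "";
        for (int i = 0; i < fields.Length; i++)
        {
            if (i > 0) row += delimiter;
            row += fields[i];
        }
        return row;
    }

    // StringBuilder: appends are written into the builder's internal
    // buffer, and the final string is created once in ToString().
    public static string ByBuilder(string[] fields, string delimiter)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < fields.Length; i++)
        {
            if (i > 0) sb.Append(delimiter);
            sb.Append(fields[i]);
        }
        return sb.ToString();
    }

    // String.Join: the approach used by fetchByString in the question.
    public static string ByJoin(string[] fields, string delimiter)
    {
        return string.Join(delimiter, fields);
    }
}
```

All three return the same text for the same input. Note that `fetchByBuilder` in the question appends the delimiter after every field, so its rows end with a trailing delimiter that the `String.Join` version does not produce.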

answered 2012-11-02T17:33:58.980

I'm not familiar with this claim, but there seem to be far more conversions going on in the char[] version, the way you've written it.

pReader.GetValue().ToString(), besides putting the value in a format that's not what you're working in (string instead of char[]), is called 3 times per field in the char[] version as opposed to just once in the others. For the comparison to be valid, you should probably find some way to convert your 'true value' directly to a char[]. Otherwise, from a benchmarking perspective, you could be pulling down performance by introducing slowness from something else. I'm not asserting that's what's happening, but procedurally it's considered important. Even if you can't do that, I think you might still see a small performance boost if you put in var stringRep = pReader.GetValue().ToString() and used stringRep instead of each repeated GetValue/ToString call.
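As a rough sketch of the `stringRep` suggestion, the helper below converts the value once and reuses the result for both the copy and the length. The names `CharArrayRow` and `AppendField` are hypothetical, and the value is passed in as a plain `object` so the example runs without an Oracle connection.

```csharp
using System;

public static class CharArrayRow
{
    // Copies one field followed by the delimiter into rowArray,
    // starting at offset, and returns the offset after the delimiter.
    public static int AppendField(object value, string delimiter, char[] rowArray, int offset)
    {
        // Convert once and reuse, instead of calling ToString() three times.
        string stringRep = value.ToString();

        stringRep.CopyTo(0, rowArray, offset, stringRep.Length);
        offset += stringRep.Length;

        delimiter.CopyTo(0, rowArray, offset, delimiter.Length);
        return offset + delimiter.Length;
    }
}
```

Because the final offset is known, the row can be materialized with `new string(rowArray, 0, offset)`, which would also avoid the `TrimEnd('\0')` pass in the original code.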

Incidentally, I'm not sure how you're timing this, but if you're not using the Stopwatch class you might look into it, just to be sure your timing is sound as well. It's basically made with this sort of benchmarking in mind. It would also let you isolate what you're actually trying to benchmark (the concatenation operations) without getting all the overhead of the Oracle reader in there as well.
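One possible way to set that up is sketched below: time the formatting alone against an in-memory row, so that neither the reader nor the file write is measured. `ConcatBenchmark`, `TimeIt`, `MakeRow`, and `Run` are illustrative names, and the synthetic field values are an assumption, not the question's data.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ConcatBenchmark
{
    // Runs action once as a warm-up (so JIT compilation is not timed),
    // then times `iterations` runs and returns the elapsed milliseconds.
    public static long TimeIt(Action action, int iterations)
    {
        action();
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Builds a synthetic row of fieldCount short strings.
    public static string[] MakeRow(int fieldCount)
    {
        var row = new string[fieldCount];
        for (int i = 0; i < fieldCount; i++) row[i] = "value" + i;
        return row;
    }

    // Compares String.Join and StringBuilder on the same in-memory row.
    public static void Run(int fieldCount, int rows)
    {
        string[] row = MakeRow(fieldCount);
        var sb = new StringBuilder();

        long joinMs = TimeIt(() => string.Join(",", row), rows);
        long builderMs = TimeIt(() =>
        {
            sb.Clear();
            foreach (string field in row) sb.Append(field).Append(",");
            sb.ToString();
        }, rows);

        Console.WriteLine("Join: {0} ms, StringBuilder: {1} ms", joinMs, builderMs);
    }
}
```

Calling `ConcatBenchmark.Run(44, 130000)` would mirror the table dimensions from the question. The numbers will vary by machine and runtime, so treat them as relative rather than absolute.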

answered 2012-11-02T17:24:37.757