c++ - Optimal iteration over a char string

Question

This stems from another question. Suppose we have:

const std::string& S = ...;
int freq[CHAR_MAX-CHAR_MIN+1]={0};

Are the following four loops equivalent? Which one would you prefer?

for (int           c: S) ++freq[c-CHAR_MIN];  // (1)
for (char          c: S) ++freq[c-CHAR_MIN];  // (2)
for (unsigned      c: S) ++freq[c];           // (3) <-- BAD!
for (unsigned char c: S) ++freq[c];           // (4)

score 3 · Accepted Answer

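(2) is the best option, because it makes clear what you intend to use each character for (quite simply: as a character). That meaning is lost in (1), (3) and (4). As Rapptz mentioned, you can also use for (auto c : S) if your compiler supports it (C++11).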

Moreover, there is no point in storing a char in an int (1), an unsigned int (3) or an unsigned char (4), since those types can store values larger than a char can.
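As a minimal sketch of loop (2) written with auto, the counting could be wrapped in a small function like the one below. The names char_frequencies and kCharRange are made up for this illustration and are not from the question; the table layout (slot i holds the count for the char with value i + CHAR_MIN) follows the freq array in the question.

```cpp
#include <array>
#include <climits>
#include <string>

// Number of distinct values a char can take on this platform.
// (kCharRange is a hypothetical name used only in this sketch.)
constexpr int kCharRange = CHAR_MAX - CHAR_MIN + 1;

// Counts how often each char value occurs, using loop (2) written with auto.
// Slot i holds the count for the char whose value is i + CHAR_MIN.
std::array<int, kCharRange> char_frequencies(const std::string& s)
{
    std::array<int, kCharRange> freq{};   // value-initialized: all zeros
    for (auto c : s)                      // c is deduced as char
        ++freq[c - CHAR_MIN];             // c is promoted to int, so the index is never negative
    return freq;
}
```

Subtracting CHAR_MIN keeps the index in range whether char is signed or unsigned on the target platform.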

score 1 · Accepted Answer

Making it properly generic:

#include <limits>
#include <vector>

template <typename C, typename T = typename C::value_type>
  std::vector<unsigned> histogram(C const& container)
{
    // One slot per representable value of T, hence the "+ 1".
    std::vector<unsigned> result(std::numeric_limits<T>::max() - std::numeric_limits<T>::min() + 1);
    for(auto& el : container)
        result[el - std::numeric_limits<T>::min()]++;

    return result;
}

Now, for large element types T, this produces a uselessly large result vector (regardless of the input length). Consider using a map instead:

// for very large char types, consider
#include <map>

template <typename C, typename T = typename C::value_type>
  std::map<T, unsigned> histogram_ex(C const& container)
{
    std::map<T, unsigned> result;

    for(auto& el : container)
        result[el]++;

    return result;
}

A short usage demo:

#include <numeric>   // std::accumulate
#include <string>
#include <iostream>

int main()
{
     auto v = histogram   (std::string ("hello world"));
     auto m = histogram_ex(std::wstring(L"hello world"));

     std::wcout << L"Sum of frequencies: " << std::accumulate(v.begin(), v.end(), 0) << "\n";

     for (auto p : m)
         std::wcout << L"'" << p.first << L"': " << p.second << L'\n';
}

Prints:

Sum of frequencies: 11
' ': 1
'd': 1
'e': 1
'h': 1
'l': 3
'o': 2
'r': 1
'w': 1

See it all live on Coliru

score 0 · Accepted Answer

I found the answer myself, and nobody else has given a correct one so far, so I am answering my own question.

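The loops are not equivalent. In (3), if char is signed and holds the value -1, the conversion to unsigned sign-extends it to 4294967295 (with a 32-bit unsigned), which is far outside the bounds of freq.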
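The conversion described above can be checked at compile time. The sketch below uses signed char explicitly so that it behaves the same whether plain char is signed or not; the function names are invented for this illustration. UINT_MAX equals 4294967295 when unsigned is 32 bits wide.

```cpp
#include <climits>

// What loop (3) does: the (signed) character is converted straight to unsigned.
// Negative values wrap around modulo UINT_MAX + 1.
constexpr unsigned index_via_unsigned(signed char c)
{
    return static_cast<unsigned>(c);
}

// What loop (4) does: the character is first converted to unsigned char,
// so the result always lies in [0, UCHAR_MAX].
constexpr unsigned index_via_unsigned_char(signed char c)
{
    return static_cast<unsigned char>(c);
}

static_assert(index_via_unsigned(-1) == UINT_MAX,
              "loop (3): -1 becomes the largest unsigned value");
static_assert(index_via_unsigned_char(-1) == UCHAR_MAX,
              "loop (4): -1 becomes the largest unsigned char value");
```

An index of UINT_MAX is out of bounds for freq, whereas UCHAR_MAX is its last valid slot when CHAR_MAX - CHAR_MIN equals UCHAR_MAX.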

As for which loop is better as a matter of personal preference, I prefer (4), because it does not depend on <limits.h>.

EDIT
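(3) may not work correctly on systems that do not use two's complement. So (1) and (2) are better. Converting a char to int (or to size_t) should not incur any performance overhead.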
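To compare the remaining loops side by side, here is a rough sketch that wraps (1), (2) and (4) in functions. The function names and kSlots are hypothetical. Loops (1) and (2) fill the table identically, while (4) produces the same counts in a different layout whenever char is signed: a char with value -1 lands in slot -1 - CHAR_MIN for (1) and (2), but in slot UCHAR_MAX for (4).

```cpp
#include <climits>
#include <string>
#include <vector>

// One slot per possible char value (kSlots is a name made up for this sketch).
constexpr int kSlots = CHAR_MAX - CHAR_MIN + 1;

// Loop (1): each char is converted to int, then shifted by CHAR_MIN.
std::vector<int> count_with_int(const std::string& s)
{
    std::vector<int> freq(kSlots, 0);
    for (int c : s) ++freq[c - CHAR_MIN];
    return freq;
}

// Loop (2): each char stays a char; the subtraction promotes it to int.
std::vector<int> count_with_char(const std::string& s)
{
    std::vector<int> freq(kSlots, 0);
    for (char c : s) ++freq[c - CHAR_MIN];
    return freq;
}

// Loop (4): each char is converted to unsigned char and used as the index directly.
std::vector<int> count_with_uchar(const std::string& s)
{
    std::vector<int> freq(kSlots, 0);
    for (unsigned char c : s) ++freq[c];
    return freq;
}
```

Whichever variant is used, the code that reads the table has to use the same indexing convention as the code that filled it.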
