c++ - How can I iterate more efficiently over a char array that stores {int, short, ushort, ...}?

Question

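I have a char data[len] filled with decompressed data read from a binary file. I know that data can only hold one of these types: char, uchar, short, ushort, int, uint, float, double. I also know the exact number of bits needed to represent each element (elesize = {8, 16, 32, 64}).

I just want to walk through the data and, say, find the max(), the min(), or the number of occurrences of a given number. I want to do this without creating another array, because memory is tight.

I came up with the following, but it is slow, for example for len == 34560000.

So I was wondering whether anyone has a "one-liner" or a more efficient way to do this (C or C++).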

char data[len];
double mymax = -std::numeric_limits<double>::max();
for (size_t i=0; i<len; i += elesize)
{
    double x;
    char *r = data+i;
    if (elementtype == "char")
        x = static_cast<double>(*r);
    else if (elementtype == "uchar")
        x = static_cast<double>(*((unsigned char *)r));
    else if (elementtype == "short")
        x = static_cast<double>(*((int16_t *)r));
    else if (elementtype == "ushort")
        x = static_cast<double>(*((uint16_t *)r));
    else if (elementtype == "int")
        x = static_cast<double>(*((int32_t *)r));
    else if (elementtype == "uint")
        x = static_cast<double>(*((uint32_t *)r));
    else if (elementtype == "float")
        x = static_cast<double>(*((float *)r));
    else if (elementtype == "double")
        x = *((double *)r);
    if (x > mymax)
        mymax = x;
}

score 1 · Accepted Answer

Templates should do nicely:

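#include <algorithm>

// Copies sizeof(T) bytes starting at p into a T, then advances p past them.
// Copying the bytes avoids misaligned reads through a cast pointer.
template <typename T>
T read_and_advance(const unsigned char * & p)
{
  T x;
  unsigned char * const px = reinterpret_cast<unsigned char *>(&x);

  std::copy(p, p + sizeof(T), px);
  p += sizeof(T);

  return x;
}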

Usage:

const unsigned char * p = the_data;
unsigned int max = 0;

while (p != the_data + data_length)
{
  max = std::max(max, read_and_advance<unsigned int>(p));
}

Scrap this; I originally thought the question was about C.

Here is a macro:

#define READ_TYPE(T, buf, res) do { memcpy(&res, buf, sizeof(T)); buf += sizeof(T); } while (0)

Usage:

unsigned int max = 0;
unsigned char * p = data;
while (p != data + len)  /* assumes len is a multiple of sizeof(unsigned int) */
{
  unsigned int res;
  READ_TYPE(unsigned int, p, res);
  if (res > max) max = res;
}

You don't really get around specifying the type, though. In C++ this can be done more elegantly.

Alternatively, you can wrap it all in one macro:

#define READ_TYPE_AND_MAX(T, buf, max) \
  do { T x; memcpy(&x, buf, sizeof(T)); \
       buf += sizeof(T); \
       if (max < x) max = x; \
  } while (0)

// Usage:
unsigned int max = 0;
unsigned char * p = data;
while (p != data + len) { READ_TYPE_AND_MAX(unsigned int, p, max); }

score 0 · Accepted Answer

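Given that elementtype is loop-invariant, you would be better off doing the comparison only once, outside the for loop. As an aside, I hope elementtype is a std::string or something else that compares meaningfully against string literals.

Ultimately, I would write a template function that performs the entire processing loop, and then call it with the appropriate template argument according to elementtype.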
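A minimal sketch of that idea, assuming the buffer length is a whole number of elements and that elementtype is a std::string. The names max_in_buffer and max_by_type are hypothetical, not taken from the question. The string comparison happens once per call, and the inner loop is compiled separately for each type:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper: treats the buffer as packed elements of type T
// and returns the largest one, converted to double only at the end.
template <typename T>
double max_in_buffer(const char* data, std::size_t len)
{
    assert(data != nullptr || len == 0);
    T best = std::numeric_limits<T>::lowest();  // returned as-is for an empty buffer
    for (std::size_t i = 0; i + sizeof(T) <= len; i += sizeof(T)) {
        T x;
        std::memcpy(&x, data + i, sizeof(T));   // safe for unaligned data
        if (x > best) best = x;
    }
    return static_cast<double>(best);
}

// Hypothetical dispatcher: the type name is inspected once, outside the loop.
double max_by_type(const std::string& elementtype, const char* data, std::size_t len)
{
    if (elementtype == "char")   return max_in_buffer<char>(data, len);
    if (elementtype == "uchar")  return max_in_buffer<unsigned char>(data, len);
    if (elementtype == "short")  return max_in_buffer<std::int16_t>(data, len);
    if (elementtype == "ushort") return max_in_buffer<std::uint16_t>(data, len);
    if (elementtype == "int")    return max_in_buffer<std::int32_t>(data, len);
    if (elementtype == "uint")   return max_in_buffer<std::uint32_t>(data, len);
    if (elementtype == "float")  return max_in_buffer<float>(data, len);
    if (elementtype == "double") return max_in_buffer<double>(data, len);
    throw std::invalid_argument("unknown element type: " + elementtype);
}
```

A call would look like max_by_type(elementtype, data, len). Whether this is actually faster than the original loop should be measured on the real data; the sketch only removes the per-element string comparisons.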

score 0 · Accepted Answer

Put the conditional code outside the loop so that the loop itself runs fast. Try something like this:

char data[len];
double mymax = -std::numeric_limits<double>::max();
double x;
if (elementtype == "char") {
  for (size_t i=0; i<len; i += elesize) {
    char *r = data+i;
    x = static_cast<double>(*r);
    if (x > mymax)  mymax = x;
  }
}else if (elementtype == "uchar") {
  for (size_t i=0; i<len; i += elesize) {
    char *r = data+i;
    x = static_cast<double>(*((unsigned char *)r));
    if (x > mymax)  mymax = x;
  }
}else if (elementtype == "short")

..etc..etc

score 0 · Accepted Answer

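As others have pointed out, you should check the type only once, and then call the appropriate sub-function that handles just that one type. You also should not convert to double in order to compare against my_max when elementtype is not double. Otherwise you pay for a needless conversion to double and a double comparison on every element. If elementtype is uint, nothing should ever be converted to double; just compare against a my_max variable that is itself a uint.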
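One way to read this advice is to keep every running value in the element's own type. The sketch below is an illustration under that assumption: Stats and scan are hypothetical names, and the buffer is assumed to hold at least one complete element. It tracks the minimum, the maximum, and the number of occurrences of a target value in a single pass, with no double involved unless T is double:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical result type: everything stays in the native element type T.
template <typename T>
struct Stats {
    T min;
    T max;
    std::size_t count;  // number of elements equal to the target
};

// Hypothetical single-pass scan over a buffer of packed T values.
template <typename T>
Stats<T> scan(const char* data, std::size_t len, T target)
{
    assert(data != nullptr && len >= sizeof(T));  // at least one element
    T first;
    std::memcpy(&first, data, sizeof(T));
    Stats<T> s{first, first, 0};
    for (std::size_t i = 0; i + sizeof(T) <= len; i += sizeof(T)) {
        T x;
        std::memcpy(&x, data + i, sizeof(T));  // no cast through a misaligned pointer
        if (x < s.min) s.min = x;
        if (x > s.max) s.max = x;
        if (x == target) ++s.count;
    }
    return s;
}
```

For uint data this would be called as scan<std::uint32_t>(data, len, value), and the comparisons are then plain unsigned integer comparisons.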
