c++ - How do I convert a boost multiprecision integer from little-endian to big-endian?

Question

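As you can see, the data of the PUSH instructions is in the wrong order, and Ethereum is a big-endian machine (addresses are represented correctly because they use a smaller type).
An alternative way to see this is to run porosity.exe --code '0x61004b60026319e44e32' --disassm

The u256 type is defined as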

using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;

Here is a minimal example that reproduces the bug:

#include <sstream>
#include <iostream>
#include <iomanip>
#include <boost/multiprecision/cpp_int.hpp>

using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;

int main() {
    std::stringstream stream;
    u256 data=0xFEDEFA;
    for (int i = 0; i<5; ++i) { // print only the first 5 bytes
        uint8_t dataByte = int(data & 0xFF);
        data >>= 8;
        stream << std::setfill('0') << std::setw(sizeof(char) * 2) << std::hex << int(dataByte) << "  ";
    }
    std::cout << stream.str();
}

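So the number is converted to a string with a space between each byte (and only the first few bytes are printed).

But then I ran into an endianness problem: the bytes are printed in reverse order. I mean, for example, 31722 is written as 8a 02 02 on my machine, and as 02 02 8a when compiling for a big-endian target.

So, since I don't know which boost function to call, I modified the code: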

#include <sstream>
#include <iostream>
#include <iomanip>
#include <boost/multiprecision/cpp_int.hpp>

using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;

int main() {
    std::stringstream stream;
    u256 data=0xFEDEFA;
    for (int i = 0; i<5; ++i) {
        uint8_t dataByte = int(data >> ((32 - i - 1) * 8));
        stream << std::setfill('0') << std::setw(sizeof(char) * 2) << std::hex << int(dataByte) << "  ";
    }
    std::cout << stream.str();
}

Now, why is my 256-bit integer mostly printed as a series of 00 00 00 00 00?

score 1 · Accepted Answer

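By the way, this is not an endianness problem; you are not doing byte access to the object representation. You are manipulating it as a 256-bit integer and simply asking for the low 8 bits at a time with data & 0xFF.

If you did know the endianness of the target C++ implementation and the data layout of the boost object, you could access its bytes directly through an unsigned char*.

You are bringing in the idea of endianness only because it is associated with byte reversal, which is what you are trying to do. But that is really inefficient; just loop over the bytes of your bigint the other way.

I hesitate to recommend a specific solution because I don't know what will compile efficiently. But you might want something like this, instead of byte-reversing ahead of time: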

for (int c = 0; c < 4; c++) {    // outer loop: four 64-bit chunks, highest first
    uint64_t chunk = static_cast<uint64_t>(data >> (64*3));  // grab the highest 64-bit chunk
    data <<= 64;   // and shift everything up
    // alternative: maybe keep a shift-count in a variable instead of modifying `data`

    // Then pick apart the chunk into its component bytes, in MSB first order
    for (int i = 0 ; i<8 ; i++) {
        unsigned tmp = (chunk >> 56) & 0xFF;
        // do something with it
        chunk <<= 8;                   // bring the next byte to the top
    }
}

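In the inner loop, more efficient than using two shifts would be to use a rotate to bring the high byte to the bottom (for & 0xFF) while moving the low bytes up. See "Best practices for circular shift (rotate) operations in C++".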
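A minimal sketch of that rotate-based inner loop, in plain standard C++ with no boost dependency. The names rotl64 and bytes_msb_first are made up for illustration, and whether a given compiler turns the shift-or idiom into a single rotate instruction is an assumption you would want to check in the generated asm.

```cpp
#include <array>
#include <cstdint>

// Rotate a 64-bit value left by n bits. Masking both counts keeps every shift
// in the range 0..63, so there is no undefined behaviour even for n == 0.
inline std::uint64_t rotl64(std::uint64_t x, unsigned n) {
    return (x << (n & 63)) | (x >> ((64 - n) & 63));
}

// Hypothetical helper: split one 64-bit chunk into bytes, most significant first.
inline std::array<std::uint8_t, 8> bytes_msb_first(std::uint64_t chunk) {
    std::array<std::uint8_t, 8> out{};
    for (int i = 0; i < 8; ++i) {
        chunk = rotl64(chunk, 8);  // the old top byte is now the low byte
        out[i] = static_cast<std::uint8_t>(chunk & 0xFF);
    }
    return out;
}
```

After eight rotations by 8 bits the chunk is back to its original value, so the loop does not destroy its input the way chunk <<= 8 does.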

In the outer loop, IDK if boost::multiprecision::number has any built-in API for efficiently indexing chunks; if so, using that would probably be more efficient.

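I used nested loops because I assume that data <<= 8 doesn't compile very efficiently, and neither does (data >> (256-8)) & 0xFF. But that is how you would grab bytes from the top instead of the bottom.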

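Another option is the standard trick for converting numbers to strings: store the characters into a buffer in descending address order. A 256-bit (32-byte) number takes 64 hex digits, and you want another 32 bytes for the spaces between them.

For example: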

  // 32 * 2 hex digits + 32 spaces = 96 chars (one of the spaces is an extra leading one),
  // plus 1 byte for the C string terminator = 97; buf has 1 byte to spare
  char buf[98];            // small enough to use automatic storage
  char *outp = buf+96;     // pointer to the end
  *outp = 0;               // terminator
  const char *hex_lut = "0123456789abcdef";

  for (int i=0 ; i<32 ; i++) {
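      unsigned byte = static_cast<unsigned>(data & 0xFF);  // conversion from u256 must be explicit
      *--outp = hex_lut[byte & 0xF];   // storing backwards, so the low nibble goes in first
      *--outp = hex_lut[byte >> 4];    // and the high nibble lands in front of it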
      *--outp = ' ';
      data >>= 8;
  }
  // outp points at an extra ' '
  outp++;
  // outp points at the first byte of a string like  "12 ab cd"
  stream << outp;

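You can also do this in chunks if you want to put in a line break.

If you are interested in efficiently converting 8, 16, or 32 bytes of data to hex at once, see "How to convert a number to hex?" for some x86 SIMD ways. The asm should port easily to C++ intrinsics. (You can use SIMD shuffles to put the bytes into MSB-first printing order after loading from little-endian integers.)

You could also use a SIMD shuffle to space-separate your pairs of hex digits before storing to memory, as you apparently want here.

Bug in the code you added:

So I added this code before the loop above: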
  for(unsigned int i=0,data,_data;i<33;++i)

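unsigned i, data, _data declares new variables of type unsigned int, which shadow the previous declarations of data and _data. That loop has zero effect on the data or _data outside the loop's scope. (And it contains UB, because you read _data and data without initializing them.)

If those vars are actually still the u256 outer-scope vars, I don't see an obvious problem other than efficiency, but maybe I'm missing the obvious, too. I didn't look very hard, because using 64x 256-bit shifts and 32x ORs seems like a horrible idea. It's possible that it could optimize away entirely, or to bswap byte-reverse instructions on ISAs that have them, but I doubt it. Especially not through the extra complication of the boost::multiprecision::number wrapper functions.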
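The shadowing can be reproduced in isolation with plain integers. This is a minimal sketch with a made-up function name. Unlike the snippet quoted above, the loop-local data is initialized here, so the sketch itself contains no UB.

```cpp
// Returns the outer `data` after a loop whose init-statement declares its own
// `data`. The outer variable is never touched by the loop body.
inline unsigned long long outer_value_after_shadowing_loop() {
    unsigned long long data = 0xFEDEFA;  // outer variable
    for (unsigned int i = 0, data = 0; i < 4; ++i) {
        data |= 0xFFu << (8 * i);  // modifies only the loop-local `data`
    }
    return data;  // refers to the outer variable again, still 0xFEDEFA
}
```

Compiling with -Wshadow on GCC or Clang flags this kind of declaration.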
