49

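What is the best way to convert a variable-length hex string, e.g. "01A1", to a byte array containing that data?

i.e. converting this:

std::string hex = "01A1";

into this:

char* hexArray;
int hexLength;

or this:

std::vector<char> hexArray;

so that when I write it to a file and run hexdump -C on it, I get binary data containing 01A1.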

4

20 Answers

41

This should work:

#include <stdexcept>

int char2int(char input)
{
  if(input >= '0' && input <= '9')
    return input - '0';
  if(input >= 'A' && input <= 'F')
    return input - 'A' + 10;
  if(input >= 'a' && input <= 'f')
    return input - 'a' + 10;
  throw std::invalid_argument("Invalid input string");
}

// This function assumes src to be a zero terminated sanitized string with
// an even number of [0-9a-f] characters, and target to be sufficiently large
void hex2bin(const char* src, char* target)
{
  while(*src && src[1])
  {
    *(target++) = char2int(*src)*16 + char2int(src[1]);
    src += 2;
  }
}

Depending on your specific platform, there is probably also a standard implementation.
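
The hex2bin comment above assumes sanitized, even-length input and a large enough target. A minimal sketch of how those assumptions could be checked instead is shown here; `hexDigitValue` and `hexToBytes` are hypothetical names chosen for illustration, and throwing on odd length is an assumed policy, not something the answer prescribes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: maps one hex digit to its value, throws otherwise.
inline int hexDigitValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    throw std::invalid_argument("not a hex digit");
}

// Hypothetical wrapper: rejects odd-length input instead of assuming it away,
// and sizes the output itself so the caller cannot under-allocate.
inline std::vector<unsigned char> hexToBytes(const std::string& hex)
{
    if (hex.size() % 2 != 0)
        throw std::invalid_argument("hex string must have an even length");

    std::vector<unsigned char> bytes;
    bytes.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        bytes.push_back(static_cast<unsigned char>(
            hexDigitValue(hex[i]) * 16 + hexDigitValue(hex[i + 1])));
    }
    return bytes;
}
```

Calling `hexToBytes("01A1")` would give a two-element vector that can be written to a file with `std::ofstream::write`.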

answered 2013-06-23T14:49:54.353
39

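This implementation uses the built-in strtol function to handle the actual conversion from text to bytes, but will work for any even-length hex string.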

std::vector<char> HexToBytes(const std::string& hex) {
  std::vector<char> bytes;

  for (unsigned int i = 0; i < hex.length(); i += 2) {
    std::string byteString = hex.substr(i, 2);
    char byte = (char) strtol(byteString.c_str(), NULL, 16);
    bytes.push_back(byte);
  }

  return bytes;
}
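
One thing to be aware of: strtol returns 0 for input it cannot convert, so a pair such as "zz" silently becomes a zero byte. If that matters, a checked variant could look roughly like the sketch below. It assumes a C++17 compiler for `std::from_chars`; the name `HexToBytesChecked` and the choice to throw are illustrative assumptions.

```cpp
#include <charconv>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical checked variant of HexToBytes; requires C++17 for <charconv>.
inline std::vector<unsigned char> HexToBytesChecked(const std::string& hex)
{
    if (hex.size() % 2 != 0)
        throw std::invalid_argument("odd number of hex digits");

    std::vector<unsigned char> bytes;
    bytes.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const char* first = hex.data() + i;
        const char* last = first + 2;
        unsigned int value = 0;
        // from_chars stops at the first non-digit, so both the error code
        // and the end pointer are checked to confirm two digits were read.
        const std::from_chars_result res = std::from_chars(first, last, value, 16);
        if (res.ec != std::errc() || res.ptr != last)
            throw std::invalid_argument("invalid hex digit");
        bytes.push_back(static_cast<unsigned char>(value));
    }
    return bytes;
}
```

Unlike strtol, `std::from_chars` does not skip leading whitespace or accept a "0x" prefix, which makes it a stricter fit for fixed two-digit slices.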
answered 2015-06-02T21:07:49.570
10

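So for fun, I was curious whether I could do this kind of conversion at compile time. It doesn't have a lot of error checking and was done in VS2015, which doesn't yet support C++14 constexpr functions (hence how HexCharToInt looks). It takes a C-string array, converts pairs of characters into a single byte, and expands those bytes into a uniform initialization list used to initialize the type T provided as a template parameter. T could be replaced with something like std::array to automatically return an array.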

#include <array>
#include <cstdint>
#include <initializer_list>
#include <stdexcept>
#include <utility>

/* Quick and dirty conversion from a single character to its hex equivalent */
constexpr std::uint8_t HexCharToInt(char Input)
{
    return
    ((Input >= 'a') && (Input <= 'f'))
    ? (Input - 87)
    : ((Input >= 'A') && (Input <= 'F'))
    ? (Input - 55)
    : ((Input >= '0') && (Input <= '9'))
    ? (Input - 48)
    : throw std::exception{};
}

/* Position the characters into the appropriate nibble */
constexpr std::uint8_t HexChar(char High, char Low)
{
    return (HexCharToInt(High) << 4) | (HexCharToInt(Low));
}

/* Adapter that converts each pair of characters into a single byte and combines the results into a uniform initialization list used to initialize T */
template <typename T, std::size_t Length, std::size_t ... Index>
constexpr T HexString(const char (&Input)[Length], const std::index_sequence<Index...>&)
{
    return T{HexChar(Input[(Index * 2)], Input[((Index * 2) + 1)])...};
}

/* Entry function */
template <typename T, std::size_t Length>
constexpr T HexString(const char (&Input)[Length])
{
    return HexString<T>(Input, std::make_index_sequence<(Length / 2)>{});
}

constexpr auto Y = HexString<std::array<std::uint8_t, 3>>("ABCDEF");
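
For comparison, on a compiler with C++17 support the same compile-time idea can be written with ordinary loops. This is only a sketch under that assumption; `hexNibble`, `hexLiteral` and `kBytes` are hypothetical names, not part of the code above.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical C++17 helper: value of one hex digit.
constexpr std::uint8_t hexNibble(char c)
{
    if (c >= '0' && c <= '9') return static_cast<std::uint8_t>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<std::uint8_t>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<std::uint8_t>(c - 'A' + 10);
    // Reaching this during constant evaluation is a compile error.
    throw std::invalid_argument("not a hex digit");
}

// Length counts the terminating '\0', so (Length - 1) / 2 bytes are produced.
template <std::size_t Length>
constexpr std::array<std::uint8_t, (Length - 1) / 2> hexLiteral(const char (&input)[Length])
{
    static_assert(Length % 2 == 1, "expected an even number of hex digits");
    std::array<std::uint8_t, (Length - 1) / 2> out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(
            (hexNibble(input[2 * i]) << 4) | hexNibble(input[2 * i + 1]));
    }
    return out;
}

// The conversion happens entirely at compile time.
constexpr auto kBytes = hexLiteral("ABCDEF");
static_assert(kBytes.size() == 3, "three bytes expected");
static_assert(kBytes[0] == 0xAB && kBytes[1] == 0xCD && kBytes[2] == 0xEF,
              "unexpected byte values");
```

Here the array size is deduced from the literal, so the caller does not have to spell out the `3`.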
answered 2016-02-11T19:18:09.563
8

You can use Boost:

#include <boost/algorithm/hex.hpp>

char bytes[60] = {0}; 
std::string hash = boost::algorithm::unhex(std::string("313233343536373839")); 
std::copy(hash.begin(), hash.end(), bytes);
answered 2019-11-29T11:06:13.260
5

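You said "variable length". Just how variable do you mean?

For hex strings that fit into an unsigned long I have always liked the C function strtoul. To make it convert hex, pass 16 as the radix value.

Code might look like: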

#include <cstdlib>
std::string str = "01a1";
unsigned long val = strtoul(str.c_str(), 0, 16);
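
If the string might be too long for an unsigned long or contain stray characters, the end pointer and errno can be checked. A rough sketch, where `parseHexValue` is a hypothetical helper name:

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns true and stores the value only if the whole
// string was consumed and the result fits in an unsigned long.
// Note: strtoul itself still accepts leading whitespace, a sign, or "0x".
inline bool parseHexValue(const std::string& str, unsigned long& out)
{
    if (str.empty())
        return false;
    errno = 0;
    char* end = nullptr;
    const unsigned long val = std::strtoul(str.c_str(), &end, 16);
    // ERANGE signals overflow; a short end pointer signals a bad character.
    if (errno == ERANGE || end != str.c_str() + str.size())
        return false;
    out = val;
    return true;
}
```

With this, `parseHexValue("01a1", v)` succeeds and stores 0x01a1, while "01g1" is rejected.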
answered 2014-06-21T01:24:44.013
5

If you want to use OpenSSL to do it, there is a nifty trick I found:

BIGNUM *input = BN_new();
int input_length = BN_hex2bn(&input, argv[2]);
input_length = (input_length + 1) / 2; // BN_hex2bn() returns number of hex digits
unsigned char *input_buffer = (unsigned char*)malloc(input_length);
int retval = BN_bn2bin(input, input_buffer); // number of bytes written

Just be sure to strip off any leading "0x" from the string.

answered 2015-02-03T19:10:50.260
4

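This can be done with a stringstream; you just need to store the value in an intermediate numeric type such as an int:

  // requires <sstream>, <string> and <vector>
  std::string test = "01A1"; // assuming this is an even length string
  std::vector<char> bytes(test.length() / 2); // a variable-length array is not standard C++
  std::stringstream converter;
  for(std::size_t i = 0; i < test.length(); i+=2)
  {
      converter << std::hex << test.substr(i,2);
      int byte = 0;
      converter >> byte;
      bytes[i/2] = byte & 0xFF;
      converter.str(std::string());
      converter.clear();
  }
answered 2018-02-23T19:37:06.047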
2

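I would use a standard function like sscanf to read the string into an unsigned integer, and then you already have the bytes you need in memory. If you were on a big-endian machine you could just write out (memcpy) the memory of the integer from the first non-zero byte. However, you can't safely assume this in general, so you can use some bit masking and shifting to get the bytes out.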

const char* src = "01A1";
char hexArray[256] = {0};
int hexLength = 0;

// read in the string
unsigned int hex = 0;
sscanf(src, "%x", &hex);

// write it out
for (unsigned int mask = 0xff000000, bitPos=24; mask; mask>>=8, bitPos-=8) {
    unsigned int currByte = hex & mask;
    if (currByte || hexLength) {
        hexArray[hexLength++] = currByte>>bitPos;
    }
}
answered 2013-06-23T20:44:17.920
2

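If your goal is speed, I have an AVX2 SIMD implementation of an encoder and decoder here: https://github.com/zbjornson/fast-hex. These benchmark about 12x faster than the fastest scalar implementations.

answered 2017-12-28T18:13:11.657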
2

C++11 variant (with gcc 4.7 - little endian format):

    #include <cstdint>
    #include <string>
    #include <vector>

    std::vector<uint8_t> decodeHex(const std::string & source)
    {
        if ( std::string::npos != source.find_first_not_of("0123456789ABCDEFabcdef") )
        {
            // you can throw exception here
            return {};
        }

        union
        {
            uint64_t binary;
            char byte[8];
        } value{};

        auto size = source.size(), offset = (size % 16);
        std::vector<uint8_t> binary{};
        binary.reserve((size + 1) / 2);

        if ( offset )
        {
            value.binary = std::stoull(source.substr(0, offset), nullptr, 16);

            for ( auto index = (offset + 1) / 2; index--; )
            {
                binary.emplace_back(value.byte[index]);
            }
        }

        for ( ; offset < size; offset += 16 )
        {
            value.binary = std::stoull(source.substr(offset, 16), nullptr, 16);
            for ( auto index = 8; index--; )
            {
                binary.emplace_back(value.byte[index]);
            }
        }

        return binary;
    }

Crypto++ variant (with gcc 4.7):

#include <string>
#include <vector>

#include <crypto++/filters.h>
#include <crypto++/hex.h>

std::vector<unsigned char> decodeHex(const std::string & source)
{
    std::string hexCode;
    CryptoPP::StringSource(
              source, true,
              new CryptoPP::HexDecoder(new CryptoPP::StringSink(hexCode)));

    return std::vector<unsigned char>(hexCode.begin(), hexCode.end());
}

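Note that the first variant is about two times faster than the second one and at the same time works with both odd and even numbers of nibbles (the result of "a56ac" is {0x0a, 0x56, 0xac}). Crypto++ discards the last nibble if there is an odd number of them (the result of "a56ac" is {0xa5, 0x6a}) and silently skips invalid hex characters (the result of "a5sac" is {0xa5, 0xac}).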
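
To make the odd-nibble behaviour of the first variant easier to see in isolation, here is a simplified sketch of the "pad on the left" policy. It is not the code above (it converts one byte at a time instead of 16 digits per step), and `decodeHexPadded` is a hypothetical name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper illustrating the "pad on the left" policy for odd input.
inline std::vector<unsigned char> decodeHexPadded(std::string source)
{
    // Validate up front, since stoul would also accept signs and whitespace.
    if (source.find_first_not_of("0123456789ABCDEFabcdef") != std::string::npos)
        throw std::invalid_argument("invalid hex digit");

    if (source.size() % 2 != 0)
        source.insert(source.begin(), '0'); // "a56ac" becomes "0a56ac"

    std::vector<unsigned char> out;
    out.reserve(source.size() / 2);
    for (std::size_t i = 0; i < source.size(); i += 2) {
        // Each two-character slice holds exactly one byte in base 16.
        const unsigned long value = std::stoul(source.substr(i, 2), nullptr, 16);
        out.push_back(static_cast<unsigned char>(value));
    }
    return out;
}
```

For "a56ac" this yields {0x0a, 0x56, 0xac}, matching the first variant rather than the Crypto++ result.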

answered 2016-04-06T12:54:42.603
2
#include <cstdio>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int main() {
    std::string s("313233");
    char delim = ',';
    int len = s.size();
    for(int i = 2; i < len; i += 3, ++len) s.insert(i, 1, delim);
    std::istringstream is(s);
    std::ostringstream os;
    is >> std::hex;
    int n;
    while (is >> n) {
        char c = (char)n;
        os << std::string(&c, 1);
        if(is.peek() == delim) is.ignore();
    }

    // std::string form
    std::string byte_string = os.str();
    std::cout << byte_string << std::endl;
    printf("%s\n", byte_string.c_str());

    // std::vector form
    std::vector<char> byte_vector(byte_string.begin(), byte_string.end());
    byte_vector.push_back('\0'); // needed for a c-string
    printf("%s\n", byte_vector.data());
}

The output is

123
123
123

'1' == 0x31, etc.

answered 2016-06-17T20:41:33.493
2

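Somebody mentioned using sscanf to do this, but didn't say how. This is how. It's useful because it also works in ancient versions of C and C++, and even in most versions of embedded C or C++ for microcontrollers.

When converted to bytes, the hex string in this example resolves to the ASCII text "Hello there!", which is then printed.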

#include <stdio.h>
int main ()
{
    char hexdata[] = "48656c6c6f20746865726521";
    char bytedata[20]{};
    for(int j = 0; j < sizeof(hexdata) / 2; j++) {
        sscanf(hexdata + j * 2, "%02hhX", bytedata + j);
    }
    printf ("%s -> %s\n", hexdata, bytedata);
    return 0;
}
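
The loop above ignores the return value of sscanf and relies on bytedata being big enough. A sketch of the same approach with those two things checked follows; `hexToBytesScanf` is a hypothetical name, and returning -1 on error is an assumed convention.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: returns the number of bytes written, or -1 on error.
inline int hexToBytesScanf(const char* hex, unsigned char* out, std::size_t outSize)
{
    const std::size_t len = std::strlen(hex);
    if (len % 2 != 0 || len / 2 > outSize)
        return -1;

    for (std::size_t j = 0; j < len / 2; ++j) {
        const unsigned char hi = static_cast<unsigned char>(hex[2 * j]);
        const unsigned char lo = static_cast<unsigned char>(hex[2 * j + 1]);
        // %x alone would accept a pair like "4g" by stopping after the '4',
        // so both characters are verified before scanning.
        if (!std::isxdigit(hi) || !std::isxdigit(lo))
            return -1;
        unsigned int value = 0;
        // "%2x" reads at most two hex digits; sscanf returns the number of
        // fields converted, so anything other than 1 means the parse failed.
        if (std::sscanf(hex + 2 * j, "%2x", &value) != 1)
            return -1;
        out[j] = static_cast<unsigned char>(value);
    }
    return static_cast<int>(len / 2);
}
```

Reading into an unsigned int and narrowing afterwards also avoids depending on the hh length modifier, which older C libraries may not support.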
answered 2019-11-03T20:54:26.433
1

How I do this at compile time:

#pragma once

#include <memory>
#include <iostream>
#include <string>
#include <array>
#include <cstdio>   // printf

#define DELIMITING_WILDCARD ' '

//  @sean :)
constexpr int _char_to_int( char ch )
{
    if( ch >= '0' && ch <= '9' )
        return ch - '0';

    if( ch >= 'A' && ch <= 'F' )
        return ch - 'A' + 10;

    return ch - 'a' + 10;
};

template <char wildcard, typename T, size_t N = sizeof( T )>
constexpr size_t _count_wildcard( T &&str )
{
    size_t count = 1u;
    for( const auto &character : str )
    {
        if( character == wildcard )
        {
            ++count;
        }
    }

    return count;
}

//  construct a base16 hex and emplace it at make_count
//  change 16 to 256 if u want the result to be when:
//  sig[0] == 0xA && sig[1] == 0xB = 0xA0B
//  or leave as is for the scenario to return 0xAB
#define CONCATE_HEX_FACTOR 16
#define CONCATE_HEX(a, b) ( CONCATE_HEX_FACTOR * ( a ) + ( b ) )

template
<   char skip_wildcard,
    //  How many occurrences of a delimiting wildcard do we find in sig
    size_t delimiter_count,
    typename T, size_t N = sizeof( T )>
    constexpr auto _make_array( T &&sig )
{
    static_assert( delimiter_count > 0, "this is a logical error, delimiter count can't be of size 0" );
    static_assert( N > 1, "sig length must be bigger than 1" );

    //  Resulting byte array, for delimiter_count skips we should have delimiter_count integers
    std::array<int, delimiter_count> ret{};

    //  List of skips that point to the position of the delimiter wildcard in skip
    std::array<size_t, delimiter_count> skips{};

    //  Current skip
    size_t skip_count = 0u;

    //  Character count, traversed for skip
    size_t skip_traversed_character_count = 0u;
    for( size_t i = 0u; i < N; ++i )
    {
        if( sig[i] == DELIMITING_WILDCARD )
        {
            skips[skip_count] = skip_traversed_character_count;
            ++skip_count;
        }

        ++skip_traversed_character_count;
    }

    //  Finally traversed character count
    size_t traversed_character_count = 0u;

    //  Make count (we will supposedly have at least an instance in our return array)
    size_t make_count = 1u;

    //  Traverse signature
    for( size_t i = 0u; i < N; ++i )
    {
        //  Read before
        if( i == 0u )
        {
            //  We don't care about this, and we don't want to use 0
            if( sig[0u] == skip_wildcard )
            {
                ret[0u] = -1;
                continue;
            }

            ret[0u] = CONCATE_HEX( _char_to_int( sig[0u] ), _char_to_int( sig[1u] ) );
            continue;
        }

        //  Make result by skip data
        for( const auto &skip : skips )
        {
            if( ( skip == i ) && skip < N - 1u )
            {
                //  We don't care about this, and we don't want to use 0
                if( sig[i + 1u] == skip_wildcard )
                {
                    ret[make_count] = -1;
                    ++make_count;
                    continue;
                }

                ret[make_count] = CONCATE_HEX( _char_to_int( sig[i + 1u] ), _char_to_int( sig[i + 2u] ) );
                ++make_count;
            }
        }
    }

    return ret;
}

#define SKIP_WILDCARD '?'
#define BUILD_ARRAY(a) _make_array<SKIP_WILDCARD, _count_wildcard<DELIMITING_WILDCARD>( a )>( a )
#define BUILD_ARRAY_MV(a) _make_array<SKIP_WILDCARD, _count_wildcard<DELIMITING_WILDCARD>( std::move( a ) )>( std::move( a ) )

//  -----
//  usage
//  -----
template <int n>
constexpr int combine_two()
{
    constexpr auto numbers = BUILD_ARRAY( "55 8B EC 83 E4 F8 8B 4D 08 BA ? ? ? ? E8 ? ? ? ? 85 C0 75 12 ?" );
    constexpr int number = numbers[0];
    constexpr int number_now = n + number;
    return number_now;
}

int main()
{
    constexpr auto bytes = BUILD_ARRAY( "?? AA BB CC DD ? ? ? 02 31 32" );
    for( const auto &hex : bytes )
    {
        printf( "%x ", hex );
    }

    combine_two<3>();
    constexpr auto saaahhah = combine_two<3>();
    static_assert( combine_two<3>() == 88 );
    static_assert( combine_two<3>() == saaahhah );
    printf( "\n%d", saaahhah );
}

The approach can be used at run time too, but for that you would probably prefer something else that is faster.

answered 2021-05-05T12:21:46.653
1

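I modified TheoretiCAL's code:

// requires <cstdint>, <sstream> and <string>
uint8_t buf[32] = {};
std::string hex = "0123";
if (hex.length() % 2)
    hex = "0" + hex;

// operator>> into a uint8_t reads a single character, so each two-digit
// pair is parsed into an unsigned int first and then narrowed.
for (size_t i = 0; i < sizeof(buf) && 2 * i < hex.length(); i++) {
    std::stringstream stream(hex.substr(2 * i, 2));
    unsigned int value = 0;
    stream >> std::hex >> value;
    buf[i] = static_cast<uint8_t>(value);
}
answered 2019-04-21T13:17:40.110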
1
#include <iostream>

using byte = unsigned char;

static int charToInt(char c) {
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    if (c >= 'A' && c <= 'F') {
        return c - 'A' + 10;
    }
    if (c >= 'a' && c <= 'f') {
        return c - 'a' + 10;
    }
    return -1;
}

// Decodes specified HEX string to bytes array. Specified nBytes is length of bytes
// array. Returns -1 if fails to decode any of bytes. Returns number of bytes decoded
// on success. Maximum number of bytes decoded will be equal to nBytes. It is assumed
// that specified string is '\0' terminated.
int hexStringToBytes(const char* str, byte* bytes, int nBytes) {
    int nDecoded {0};
    for (int i {0}; str[i] != '\0' && nDecoded < nBytes; i += 2, nDecoded += 1) {
        if (str[i + 1] != '\0') {
            int m {charToInt(str[i])};
            int n {charToInt(str[i + 1])};
            if (m != -1 && n != -1) {
                bytes[nDecoded] = (m << 4) | n;
            } else {
                return -1;
            }
        } else {
            return -1;
        }
    }
    return nDecoded;
}

int main(int argc, char* argv[]) {
    if (argc < 2) {
        return 1;
    }

    byte bytes[0x100];
    int ret {hexStringToBytes(argv[1], bytes, 0x100)};
    if (ret < 0) {
        return 1;
    }
    std::cout << "number of bytes: " << ret << "\n" << std::hex;
    for (int i {0}; i < ret; ++i) {
        if (bytes[i] < 0x10) {
            std::cout << "0";
        }
        std::cout << (bytes[i] & 0xff);
    }
    std::cout << "\n";

    return 0;
}
answered 2018-07-18T11:51:14.433
0

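The difficulty in a hex to char conversion is that the hex digits work pairwise, e.g. 3132 or A0FF. So an even number of hex digits is assumed. However, it could be perfectly valid to have an odd number of digits, like 332 and AFF, which should be understood as 0332 and 0AFF.

I suggest an improvement to Niels Keurentjes' hex2bin() function. First we count the number of valid hex digits. As we have to count anyway, let's also check the buffer size: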

void hex2bin(const char* src, char* target, size_t size_target)
{
    size_t countdgts=0;    // count hex digits
    for (const char *p=src; *p && isxdigit(static_cast<unsigned char>(*p)); p++)
        countdgts++;
    if ((countdgts+1)/2+1>size_target)
        throw std::length_error("Risk of buffer overflow"); // declared in <stdexcept>

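By the way, to use isxdigit() you have to #include <cctype>.
Once we know how many digits there are, we can determine whether the first digit is the high one (only pairs) or not (the first digit is not part of a pair).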

bool ishi = !(countdgts%2);         

We can then loop digit by digit, combining each pair using the binary shift << and binary or, and toggling the "high" indicator at each iteration:

    for (*target=0; *src; ishi = !ishi)  {    
        char tmp = char2int(*src++);    // hex digit on 4 lower bits
        if (ishi)
            *target = (tmp << 4);   // high:  shift by 4
        else *target++ |= tmp;      // low:  complete previous  
    } 
  *target=0;    // null terminated target (if desired)
}
answered 2014-06-21T00:16:46.970
0

If you can make your data look like this, e.g. an array of "0x01", "0xA1", then you can iterate over your array and use sscanf to create the array of values:

unsigned int result;
sscanf(data, "%x", &result);         
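
Spelled out as a loop, the idea might look like the sketch below. The container types and the name `tokensToBytes` are assumptions for illustration, and silently skipping tokens that fail to parse is just one possible policy.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: converts tokens such as "0x01" into byte values.
inline std::vector<unsigned char> tokensToBytes(const std::vector<std::string>& tokens)
{
    std::vector<unsigned char> bytes;
    bytes.reserve(tokens.size());
    for (const std::string& token : tokens) {
        unsigned int result = 0;
        // %x accepts an optional "0x" prefix; a return value other than 1
        // means nothing could be converted, so such tokens are skipped here.
        // Values above 0xFF do not fit in one byte and are skipped as well.
        if (std::sscanf(token.c_str(), "%x", &result) == 1 && result <= 0xFF)
            bytes.push_back(static_cast<unsigned char>(result));
    }
    return bytes;
}
```

Passing `{"0x01", "0xA1"}` gives the two bytes 0x01 and 0xA1.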
answered 2013-06-24T03:40:58.363
0

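I found this question, but the accepted answer didn't look like a C++ way of solving the task to me (this doesn't mean it's a bad answer or anything, it just explains the motivation behind adding this one). I recalled this nice answer and decided to implement something similar. Here is the complete code of what I ended up with (it also works for std::wstring):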

#include <cctype>
#include <cstdlib>

#include <algorithm>
#include <iostream>
#include <iterator>
#include <ostream>
#include <stdexcept>
#include <string>
#include <vector>

template <typename OutputIt>
class hex_ostream_iterator :
    public std::iterator<std::output_iterator_tag, void, void, void, void>
{
    OutputIt out;
    int digitCount;
    int number;

public:
    hex_ostream_iterator(OutputIt out) : out(out), digitCount(0), number(0)
    {
    }

    hex_ostream_iterator<OutputIt> &
    operator=(char c)
    {
        number = (number << 4) | char2int(c);
        digitCount++;

        if (digitCount == 2) {
            digitCount = 0;
            *out++ = number;
            number = 0;
        }
        return *this;
    }

    hex_ostream_iterator<OutputIt> &
    operator*()
    {
        return *this;
    }

    hex_ostream_iterator<OutputIt> &
    operator++()
    {
        return *this;
    }

    hex_ostream_iterator<OutputIt> &
    operator++(int)
    {
        return *this;
    }

private:
    int
    char2int(char c)
    {
        static const std::string HEX_CHARS = "0123456789abcdef";

        const char lowerC = std::tolower(c);
        const std::string::size_type pos = HEX_CHARS.find_first_of(lowerC);
        if (pos == std::string::npos) {
            throw std::runtime_error(std::string("Not a hex digit: ") + c);
        }
        return pos;
    }
};

template <typename OutputIt>
hex_ostream_iterator<OutputIt>
hex_iterator(OutputIt out)
{
    return hex_ostream_iterator<OutputIt>(out);
}

template <typename InputIt, typename OutputIt>
hex_ostream_iterator<OutputIt>
from_hex_string(InputIt first, InputIt last, OutputIt out)
{
    if (std::distance(first, last) % 2 == 1) {
        *out = '0';
        ++out;
    }
    return std::copy(first, last, out);
}

int
main(int argc, char *argv[])
{
    if (argc != 2) {
        std::cout << "Usage: " << argv[0] << " hexstring" << std::endl;
        return EXIT_FAILURE;
    }

    const std::string input = argv[1];
    std::vector<unsigned char> bytes;
    from_hex_string(input.begin(), input.end(),
                    hex_iterator(std::back_inserter(bytes)));

    typedef std::ostream_iterator<unsigned char> osit;
    std::copy(bytes.begin(), bytes.end(), osit(std::cout));

    return EXIT_SUCCESS;
}

And the output of ./hex2bytes 61a062a063 | hexdump -C:

00000000  61 a0 62 a0 63                                    |a.b.c|
00000005

And of ./hex2bytes 6a062a063 | hexdump -C (note the odd number of characters):

00000000  06 a0 62 a0 63                                    |..b.c|
00000005
answered 2014-01-11T20:37:38.963
0

Input: "303132", Output: "012". The input string can be of odd or even length.

char char2int(char input)
{
    if (input >= '0' && input <= '9')
        return input - '0';
    if (input >= 'A' && input <= 'F')
        return input - 'A' + 10;
    if (input >= 'a' && input <= 'f')
        return input - 'a' + 10;

    throw std::runtime_error("Incorrect symbol in hex string");
};

string hex2str(string &hex)
{
    string out;
    out.resize(hex.size() / 2 + hex.size() % 2);

    string::iterator it = hex.begin();
    string::iterator out_it = out.begin();
    if (hex.size() % 2 != 0) {
        *out_it++ = char(char2int(*it++));
    }

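    // Read both digits through explicit indices: the original *it++ ... *it
    // depended on the order in which the two operands are evaluated.
    for (; hex.end() - it >= 2; it += 2) {
        *out_it++ = char(char2int(it[0]) << 4 | char2int(it[1]));
    }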

    return out;
}
answered 2017-09-25T10:30:22.610
0

Very similar to some of the other answers here, this is what I went with:

typedef uint8_t BYTE;

BYTE* ByteUtils::HexStringToBytes(BYTE* HexString, int ArrayLength)
{
  BYTE* returnBytes;
  returnBytes = (BYTE*) malloc(ArrayLength/2);
  int j=0;

  for(int i = 0; i < ArrayLength; i++)
  {
    if(i % 2 == 0)
    {
      int valueHigh = (int)(*(HexString+i));
      int valueLow =  (int)(*(HexString+i+1));

      valueHigh = ByteUtils::HexAsciiToDec(valueHigh);
      valueLow =  ByteUtils::HexAsciiToDec(valueLow);

      valueHigh *= 16;
      int total = valueHigh + valueLow;
      *(returnBytes+j++) = (BYTE)total;
    }
  }
  return returnBytes;
}

int ByteUtils::HexAsciiToDec(int value)
{
  if(value > 47 && value < 59)
  {
    value -= 48;
  }
  else if(value > 96 && value < 103)
  {
    value -= 97;
    value += 10;
  }
  else if(value > 64 && value < 71)
  {
    value -= 65;
    value += 10;
  }
  else
  {
    value = 0;
  }
  return value;
}
answered 2018-08-21T12:12:09.887