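I need to compute the Hamming distance between bitsets represented as char arrays. This is a core operation, so it must be as fast as possible. I have something like this: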
const int N = 32; // 32 always

// returns the number of bits that are ones in a char
int countOnes_uchar8(unsigned char v);

// pa and pb point to arrays of N items
int hamming(const unsigned char *pa, const unsigned char *pb)
{
    int ret = 0;
    for(int i = 0; i < N; ++i, ++pa, ++pb)
    {
        ret += countOnes_uchar8(*pa ^ *pb);
    }
    return ret;
}
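The snippet above only declares countOnes_uchar8. For readers who want something to compile against, here is a minimal sketch of one possible definition. It assumes a simple bit-clearing loop (Kernighan's method), which is not necessarily what the real code uses; a lookup table or a compiler intrinsic would be equally plausible.

```cpp
// Hypothetical definition of countOnes_uchar8; the question only declares it.
// Counts set bits by repeatedly clearing the lowest set bit.
int countOnes_uchar8(unsigned char v)
{
    int count = 0;
    while (v != 0)
    {
        // v - 1 flips the lowest set bit and every bit below it,
        // so ANDing with v clears exactly that lowest set bit.
        v &= static_cast<unsigned char>(v - 1);
        ++count;
    }
    return count;
}
```

The loop runs once per set bit, so it takes at most 8 iterations for a char.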
After profiling, I noticed that operating on ints is faster, so I wrote:
const int N = 32; // 32 always

// returns the number of bits that are ones in an int of 32 bits
int countOnes_int32(unsigned int v);

// pa and pb point to arrays of N items
int hamming(const unsigned char *pa, const unsigned char *pb)
{
    const unsigned int *qa = reinterpret_cast<const unsigned int*>(pa);
    const unsigned int *qb = reinterpret_cast<const unsigned int*>(pb);

    int ret = 0;
    for(int i = 0; i < N / sizeof(unsigned int); ++i, ++qa, ++qb)
    {
        ret += countOnes_int32(*qa ^ *qb);
    }
    return ret;
}
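Questions

1) Is the cast from unsigned char * to unsigned int * safe?

2) I work on a 32-bit machine, but I would like the code to work on a 64-bit machine as well. Does sizeof(unsigned int) return 4 on both machines, or does it return 8 on a 64-bit one?

3) If sizeof(unsigned int) returns 4 on a 64-bit machine, how would I operate on a 64-bit type? With long long?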
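To make questions 1) and 3) concrete, here is a minimal sketch of what a 64-bit variant might look like. It rests on a few assumptions that are not in the code above: it uses std::uint64_t from <cstdint> so the word size does not depend on what sizeof(unsigned int) happens to be on a given platform (it is typically 4 on mainstream 64-bit targets), and it copies bytes with std::memcpy instead of casting the pointer, which sidesteps any alignment or aliasing concerns. The names hamming_u64 and countOnes_uint64 are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

const std::size_t N = 32; // 32 always, as in the question

// Hypothetical helper: number of set bits in a 64-bit word.
// Clears the lowest set bit on each iteration.
int countOnes_uint64(std::uint64_t v)
{
    int count = 0;
    while (v != 0)
    {
        v &= v - 1;
        ++count;
    }
    return count;
}

// pa and pb point to arrays of N bytes; no alignment is assumed.
int hamming_u64(const unsigned char *pa, const unsigned char *pb)
{
    static_assert(N % sizeof(std::uint64_t) == 0,
                  "N must be a multiple of the word size");

    int ret = 0;
    for (std::size_t i = 0; i < N; i += sizeof(std::uint64_t))
    {
        std::uint64_t a, b;
        // Copy 8 bytes into properly typed words instead of
        // dereferencing a reinterpret_cast'ed pointer.
        std::memcpy(&a, pa + i, sizeof a);
        std::memcpy(&b, pb + i, sizeof b);
        ret += countOnes_uint64(a ^ b);
    }
    return ret;
}
```

Compilers usually turn a fixed-size memcpy like this into a plain load, so the copy is unlikely to cost anything, but that is worth confirming with the profiler rather than taking on faith.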