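Performance-oriented solution
Because the following statements hold in .NET:
sizeof(bool) == 1
*(byte*)&someBool == 1 when someBool is true
*(byte*)&someBool == 0 when someBool is false
you can fall back to unsafe code and pointer casts (C# does not allow a direct cast from bool to byte or int).
Your code would then look something like this: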
if (*(byte*)&bool1 + *(byte*)&bool2 + *(byte*)&bool3 > 1)
{
    // do stuff
}
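The benefit here is that the conversion adds no extra branches, which can make it faster than the obvious myBool ? 1 : 0. The drawback is the use of unsafe code and pointers, which is usually not a welcome solution in the managed .NET world. The assumption sizeof(bool) == 1 can also be questioned, since it does not hold in every language, but it does hold in C# on .NET.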
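If you would rather check these assumptions on your own runtime than take them on faith, a minimal sketch along the following lines may help. The class and method names (BoolLayoutCheck, RawByte, AssumptionsHold) are made up for illustration and are not part of any library. The code needs to be compiled with unsafe blocks enabled.

```csharp
public static class BoolLayoutCheck
{
    // Hypothetical helper: reads the raw byte stored in a bool,
    // using the same pointer cast as the snippet above.
    public static unsafe byte RawByte(bool value) => *(byte*)&value;

    // Returns true when a bool is one byte wide and true/false
    // are stored as the byte values 1 and 0 respectively.
    public static bool AssumptionsHold() =>
        sizeof(bool) == 1 && RawByte(true) == 1 && RawByte(false) == 0;
}
```

Calling BoolLayoutCheck.AssumptionsHold() once at startup (or in a unit test) is a cheap way to guard the pointer tricks that follow.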
If the pointer code is too much of an eyesore for you, you can always hide it in an extension method:
using System.Runtime.CompilerServices;
// ...
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static unsafe int ToInt(this bool b) => *(byte*)&b;
Your code then becomes more readable:
if (bool1.ToInt() + bool2.ToInt() + bool3.ToInt() > 1)
{
    // do stuff
}
Naturally, you can also combine this with LINQ:
if (myBools.Sum(b => b.ToInt()) > 1)
{
    // do stuff
}
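For reference, the extension method and the LINQ call can be put together into one self-contained unit, roughly as sketched here. The names BoolExtensionsSketch and MoreThanOneTrue are illustrative assumptions, not something the snippets above define, and the code again requires unsafe blocks to be enabled.

```csharp
using System.Linq;
using System.Runtime.CompilerServices;

public static class BoolExtensionsSketch
{
    // Same pointer cast as above: reinterpret the bool's single byte as a number.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static unsafe int ToInt(this bool b) => *(byte*)&b;

    // Hypothetical convenience wrapper: true when at least two flags are set.
    public static bool MoreThanOneTrue(params bool[] flags) =>
        flags.Sum(b => b.ToInt()) > 1;
}
```

A call such as BoolExtensionsSketch.MoreThanOneTrue(bool1, bool2, bool3) then replaces the hand-written sum in the if condition.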
Or, if you value performance above everything else, this is probably faster:
bool[] myBools = ...
fixed (bool* boolPtr = myBools)
{
    byte* bytePtr = (byte*)boolPtr;
    int numberOfTrueBools = 0;
    // count all true booleans in the array
    for (int i = 0; i < myBools.Length; i++)
    {
        numberOfTrueBools += bytePtr[i];
    }
    // do something with your numberOfTrueBools ...
}
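The fragment above can be wrapped into a reusable method, roughly as in this sketch. BoolCounting, CountTrue and AtLeastTwo are hypothetical names chosen for illustration. The code assumes the same byte layout (true stored as 1) and must be compiled with unsafe blocks enabled.

```csharp
public static class BoolCounting
{
    // Counts the true entries by summing the raw bytes of the array.
    public static unsafe int CountTrue(bool[] values)
    {
        int count = 0;
        // Pin the array so the garbage collector cannot move it while we hold a pointer.
        fixed (bool* boolPtr = values)
        {
            byte* bytePtr = (byte*)boolPtr;
            for (int i = 0; i < values.Length; i++)
            {
                count += bytePtr[i];
            }
        }
        return count;
    }

    // Hypothetical wrapper for the "more than one is true" check used earlier.
    public static bool AtLeastTwo(bool[] values) => CountTrue(values) > 1;
}
```

For an empty array the loop simply does not run, so CountTrue returns 0.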
Or, if you have a huge input array, you could even go for a hardware-accelerated SIMD solution:
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;
// ...
[MethodImpl(MethodImplOptions.AggressiveOptimization)]
public static unsafe int CountTrueBytesSIMD(this bool[] myBools)
{
    // we need a pointer to the bool array to do our magic
    fixed (bool* ptr = myBools)
    {
        // reinterpret all booleans as bytes
        byte* bytePtr = (byte*)ptr;
        // calculate the number of 32-bit integers that fit into the array
        int dwordLength = myBools.Length >> 2;
        // for SIMD, allocate a result vector
        Vector128<int> result = Vector128<int>.Zero;
        // loop variable (counts 32-bit integers, not bytes)
        int i = 0;
        // it could be that SSSE3 isn't supported...
        if (Ssse3.IsSupported)
        {
            // remember: we're assuming little endian!
            // we need this mask to convert the byte vectors to valid int vectors
            Vector128<int> cleanupMask = Vector128.Create(0x000000FF);
            // iterate over the array processing 16 bytes at once
            // TODO: you could even go to 32 byte chunks if AVX-2 is supported...
            for (; i <= dwordLength - Vector128<int>.Count; i += Vector128<int>.Count)
            {
                // load 16 bools / bytes from memory
                Vector128<byte> v = Sse2.LoadVector128((byte*)((int*)bytePtr + i));
                // now count the number of "true" bytes in each 32-bit integer
                // 1. shift by 0, 1, 2 and 3 bytes so every byte lands in the low byte of its lane
                Vector128<int> v0 = v.As<byte, int>();
                Vector128<int> v1 = Sse2.ShiftRightLogical128BitLane(v, 1).As<byte, int>();
                Vector128<int> v2 = Sse2.ShiftRightLogical128BitLane(v, 2).As<byte, int>();
                Vector128<int> v3 = Sse2.ShiftRightLogical128BitLane(v, 3).As<byte, int>();
                // 2. clean up the invalid upper bytes
                v0 = Sse2.And(v0, cleanupMask);
                v1 = Sse2.And(v1, cleanupMask);
                v2 = Sse2.And(v2, cleanupMask);
                v3 = Sse2.And(v3, cleanupMask);
                // 3. add them together. We now have a vector of ints holding the number
                //    of "true" booleans / 0x01 bytes in their 32-bit memory region
                Vector128<int> roundResult = Sse2.Add(Sse2.Add(Sse2.Add(v0, v1), v2), v3);
                // 4. now add everything to the result
                result = Sse2.Add(result, roundResult);
            }
            // reduce the result vector to a scalar by horizontally adding log_2(n) times,
            // where n is the number of words in our vector
            result = Ssse3.HorizontalAdd(result, result);
            result = Ssse3.HorizontalAdd(result, result);
        }
        int totalNumberOfTrueBools = result.ToScalar();
        // now add all remaining booleans together
        // (if the input array wasn't a multiple of 16 bytes or SSSE3 wasn't supported)
        i <<= 2; // convert the 32-bit integer index into a byte index
        for (; i < myBools.Length; i++)
        {
            totalNumberOfTrueBools += bytePtr[i];
        }
        return totalNumberOfTrueBools;
    }
}
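The shift, mask and add steps inside the SIMD loop may be easier to follow on a single 32-bit lane. The sketch below is a plain scalar emulation of what happens in each lane, not the vectorized code itself. LaneSumSketch and CountTrueBytesInDword are made-up names, and the example assumes every byte is either 0x00 or 0x01.

```csharp
public static class LaneSumSketch
{
    // Scalar emulation of one 32-bit SIMD lane: four booleans packed into one dword.
    public static int CountTrueBytesInDword(uint dword)
    {
        // 1. shift each byte down to the lowest position
        // 2. mask away everything above the lowest byte
        uint b0 = dword & 0xFF;
        uint b1 = (dword >> 8) & 0xFF;
        uint b2 = (dword >> 16) & 0xFF;
        uint b3 = (dword >> 24) & 0xFF;
        // 3. add the four isolated bytes together
        return (int)(b0 + b1 + b2 + b3);
    }
}
```

For example, the dword 0x01000101 holds the bytes 01, 01, 00, 01 in little-endian memory order, so the method returns 3.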