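Just saw this question. You may have already solved it, but I am writing up the logic anyway for other programmers who may need to handle this situation.
The solution is given below in Intel ASM format. It consists of three steps: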
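Step 0: Convert the 8-bit mask into a 64-bit mask, with each set bit in the original mask represented as 8 set bits (one 0xFF byte) in the expanded mask.
Step 1: Use this expanded mask to extract the relevant bits from the source data.
Step 2: Since you need the data left packed, we shift the output by the appropriate number of bits.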
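To make Step 0 concrete, here is a minimal C sketch of the mask expansion. It is only a model for checking expected values, not part of the solution itself, and the name `expand_mask` is my own (hypothetical).

```c
#include <stdint.h>

/* Hypothetical helper, not part of the original solution:
   bit i of the 8-bit mask becomes byte i of the result,
   0xFF if the bit is set and 0x00 if it is clear. */
static uint64_t expand_mask(uint8_t mask)
{
    uint64_t expanded = 0;
    for (int i = 0; i < 8; i++) {
        if (mask & (1u << i)) {
            expanded |= (uint64_t)0xFF << (8 * i);
        }
    }
    return expanded;
}
```

For example, `expand_mask(0x05)` (bits 0 and 2 set) returns `0x0000000000FF00FF`.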
The assembly code is as follows:
; Step 0 : convert the 8 bit mask into a 64 bit mask
xor r8,r8
movzx rax,byte ptr mask_pattern
mov r9,rax ; save a copy of the mask - avoids a memory read in Step 2
mov rcx,8 ; size of mask in bit count
outer_loop :
shr al,1 ; shift the least significant mask bit into the carry flag (CF)
setnc dl ; DL = 0 if CF=1, else DL = 1
dec dl ; DL = 0FFh if the mask bit was 1, else DL = 00h
shrd r8,rdx,8 ; shift R8 right by 8 bits, filling its top byte from DL
loop outer_loop
; R8 now holds the bytewise mask : each set bit of the original mask has become a 0FFh byte
; Step 1 : extract the selected bytes, packed into the low-order bits
mov rax,qword ptr data_pattern
pext rax,rax,r8
; Step 2 : PEXT leaves the result in the low-order bytes; SHL moves it up to the high-order bytes of RAX
popcnt r9,r9 ; count the bits set in the original mask
mov rcx,8
sub cl,r9b ; 8 - (number of set mask bits) = number of unselected bytes
shl cl,3 ; convert the byte count to a bit count (multiply by 8)
shl rax,cl ; shift left by that many bits
; The required data is in RAX
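If you want to check the assembly against known inputs, a portable C model of Steps 1 and 2 may help. This is only a sketch: `pext_model` and `extract_and_shift` are names I made up, the function takes the already expanded 64-bit mask, and it assumes that mask consists of whole 0xFF / 0x00 bytes. On BMI2 hardware, `_pext_u64` from `<immintrin.h>` is the intrinsic counterpart of PEXT.

```c
#include <stdint.h>

/* Software model of PEXT: gather the bits of src selected by mask
   into the low-order bits of the result, preserving their order. */
static uint64_t pext_model(uint64_t src, uint64_t mask)
{
    uint64_t out = 0;
    int k = 0; /* next free bit position in out */
    for (int i = 0; i < 64; i++) {
        if ((mask >> i) & 1) {
            out |= ((src >> i) & 1) << k;
            k++;
        }
    }
    return out;
}

/* Hypothetical helper modelling Steps 1 and 2.
   Assumption: byte_mask is made of whole 0xFF / 0x00 bytes. */
static uint64_t extract_and_shift(uint64_t data, uint64_t byte_mask)
{
    /* Step 1: pack the selected bytes into the low-order bytes. */
    uint64_t packed = pext_model(data, byte_mask);

    /* Count the set bits; every selected byte contributes 8 of them. */
    int bits = 0;
    for (uint64_t m = byte_mask; m != 0; m &= m - 1) {
        bits++;
    }
    int selected = bits / 8;

    /* Step 2: shift left by 8 bits per unselected byte.
       A 64-bit shift is undefined in C, so the empty mask is handled
       separately; the packed value is 0 in that case anyway. */
    int shift = (8 - selected) * 8;
    return (shift < 64) ? (packed << shift) : 0;
}
```

With data `0x8877665544332211` and expanded mask `0x0000000000FF00FF`, the packed value is `0x3311` and the shifted result is `0x3311000000000000`.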
I hope this helps.