I have been working with the example from this presentation (slide 41).
As far as I understand, it performs alpha blending.
MOVQ mm0, alpha // 4 16-b zero-padding α
MOVD mm1, A //move 4 pixels of image A
MOVD mm2, B //move 4 pixels of image B
PXOR mm3, mm3 //clear mm3 to all zeroes
//unpack 4 pixels to 4 words
PUNPCKLBW mm1, mm3 // Because B-A could be
PUNPCKLBW mm2, mm3 // negative, need 16 bits
PSUBW mm1, mm2 //(B-A)
PMULHW mm1, mm0 //(B-A)*fade/256
PADDW mm1, mm2 //(B-A)*fade + B
//pack four words back to four bytes
PACKUSWB mm1, mm3
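For reference, the arithmetic that the listing's comments describe can be sketched per channel in plain C. This is only a scalar sketch, not a bit-exact emulation of the MMX instructions: it assumes the intended formula is B + (A - B) * alpha / 256, following the operand order of the instructions (destination first), whereas the slide's comments write B - A. The names blend_channel and blend_pixel are hypothetical and do not come from the presentation.

```c
#include <stdint.h>

/* Blend one 8-bit channel: b + (a - b) * alpha / 256.
 * The difference can be negative, which is why the MMX version
 * widens every byte to a 16-bit word before subtracting. */
static uint8_t blend_channel(uint8_t a, uint8_t b, uint8_t alpha)
{
    int diff = (int)a - (int)b;          /* range -255..255 */
    int scaled = diff * (int)alpha / 256; /* truncates toward zero */
    return (uint8_t)((int)b + scaled);   /* always stays within 0..255 */
}

/* Blend a 32-bit pixel by treating it as four independent 8-bit channels. */
static uint32_t blend_pixel(uint32_t a, uint32_t b, uint8_t alpha)
{
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint8_t ca = (uint8_t)(a >> shift);
        uint8_t cb = (uint8_t)(b >> shift);
        out |= (uint32_t)blend_channel(ca, cb, alpha) << shift;
    }
    return out;
}
```

A scalar version like this can serve as a baseline for checking the output of an MMX implementation on a few pixels.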
I want to rewrite it in C using inline assembly.
Right now I have something like this:
void fade_mmx(SDL_Surface* im1,SDL_Surface* im2,Uint8 alpha, SDL_Surface* imOut)
{
    int pixelsCount = imOut->w * im1->h;
    Uint32 *A = (Uint32*) im1->pixels;
    Uint32 *B = (Uint32*) im2->pixels;
    Uint32 *out = (Uint32*) imOut->pixels;
    Uint32 *end = out + pixelsCount;
    __asm__ __volatile__ (
        "\n\t movd (%0), %%mm0"
        "\n\t movd (%1), %%mm1"
        "\n\t movd (%2), %%mm2"
        "\n\t pxor %%mm3, %%mm3"
        "\n\t punpcklbw %%mm3, %%mm1"
        "\n\t punpcklbw %%mm3, %%mm2"
        "\n\t psubw %%mm2, %%mm1"
        "\n\t pmulhw %%mm0, %%mm1"
        "\n\t paddw %%mm2, %%mm1"
        "\n\t packuswb %%mm3, %%mm1"
        : : "r" (alpha), "r" (A), "r" (B), "r" (out), "r" (end)
    );
    __asm__("emms" : : );
}
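When compiling, I get this message: Error: (%dl) is not a valid base/index expression
It refers to the first line of the assembly. I suspect this is because alpha is a Uint8. I tried casting it, but then I got a segmentation fault. The example mentions "4 16-b zero-padding α", and I do not quite understand what that means.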
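One plausible reading of "4 16-b zero-padding α" is that the 64-bit register holds four 16-bit words, each containing the 8-bit α zero-extended to 16 bits, so that every unpacked channel is multiplied by the same factor. A minimal sketch of building such a value in plain C follows; the helper name replicate_alpha is hypothetical. This reading would also fit the symptoms above: with "r" (alpha) the operand (%0) is treated as a memory address, so a Uint8 produces an 8-bit register that cannot be used as a base, and after a cast the value of alpha itself is dereferenced as a pointer.

```c
#include <stdint.h>

/* Replicate an 8-bit alpha into four zero-extended 16-bit words,
 * e.g. 0x80 becomes 0x0080008000800080. */
static uint64_t replicate_alpha(uint8_t alpha)
{
    uint64_t w = alpha; /* one word:   0x00000000000000AA */
    w |= w << 16;       /* two words:  0x0000000000AA00AA */
    w |= w << 32;       /* four words: 0x00AA00AA00AA00AA */
    return w;
}
```

A value built this way could then be stored in a variable and handed to the inline assembly through a memory operand, rather than passing alpha itself where an address is expected.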