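I'm writing a program and I need it to be efficient. I'm on the Haswell microarchitecture (64-bit) and compiling with g++. The goal is to use a single ADC instruction per iteration until the loop ends.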
// I removed all the carry handlers from this preview, to keep it simple
size_t anum = ap[0], bnum = bp[0];
unsigned carry;
// The carry flag is set here by an ordinary addition
anum += bnum;
cnum[0] = anum;
carry = check_Carry(anum, bnum);
for (int i = 1; i < n; i++) {
    anum = ap[i];
    bnum = bp[i];
    // I want to remove these two lines and insert the __asm__ block
    anum += (bnum + carry);
    carry = check_Carry(anum, bnum);
    // This block is not working
    __asm__(
        "movq -64(%rbp), %rcx;"
        "adcq %rdx, %rcx;"
        "movq %rsi, -88(%rbp);"
    );
    cnum[i] = anum;
}
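Is CF set only by the first addition, or is it set every time I execute an ADC instruction?
I think the problem is that CF is lost each time the loop iterates. If that is the problem, how can I solve it?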
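
For context on the question above: every ADC both reads and writes CF, but so do most other arithmetic instructions (ADD, SUB, CMP), so the compiler-generated loop counter update and comparison can overwrite CF between two separate __asm__ statements. A basic __asm__ block with hard-coded registers and %rbp offsets also gives the compiler no information about what it reads or modifies. One possible way to avoid relying on CF surviving is to let the compiler carry the bit in a variable. The following is only a minimal sketch under assumptions: add_limbs is a hypothetical helper name that is not in the original code, the limbs are assumed to be 64-bit, and __builtin_add_overflow is a GCC/Clang builtin. Whether g++ turns this into exactly one ADC per limb depends on the compiler version and optimization flags, so the generated assembly should be checked.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the original post): adds the n-limb
// numbers ap and bp into cp and returns the final carry-out (0 or 1).
inline unsigned add_limbs(const std::uint64_t* ap, const std::uint64_t* bp,
                          std::uint64_t* cp, std::size_t n) {
    unsigned carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t sum;
        // Add the two limbs, then the incoming carry. At most one of the
        // two additions can overflow, so OR-ing the two flags is safe.
        bool c1 = __builtin_add_overflow(ap[i], bp[i], &sum);
        bool c2 = __builtin_add_overflow(sum, static_cast<std::uint64_t>(carry), &sum);
        cp[i] = sum;
        carry = static_cast<unsigned>(c1 | c2);
    }
    return carry;
}
```

With this shape the carry lives in an ordinary variable, so nothing depends on the flags register being preserved across loop iterations; the optimizer is free to keep it in CF when it can.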