3

假设您有一个机器指令 udive,它通过取(32 位被除数 << 32)/32 位除数来执行特殊情况 64 x 32 无符号除法,我们可以使用以下方法进行完整的 64 x 32 除法:

// assume: a / b guaranteed not to overflow
a = 64bit dividend, a.h & a.l are hi & lo 32bits respectively
b = 32bit divisor

q1 = udive(a.h, b)  // (a.h << 32) / b
r1 = -(q1 * b)      // remainder of the above, shortcut since a.h & 0xffffffff == 0
q2 = a.l / b        // a.l / b using regular unsigned division
r2 = a.l - (q2 * b) // remainder of the above
q = q1 + q2
r = r1 + r2

// r < r2, r overflowed and is >32bits, implies r > b since b is 32bits
// r >= b, quotient too small by 1, adjust
if (r < r2) or (r >= b)
    q = q + 1
return q

然而,签署的案例给我带来了问题。假设一个等效的 sdive 指令执行 udive 的签名版本,我无法完全弄清楚如何处理余数等等。

4

3 回答 3

0

thanks for answering. I've looked at Hacker's Delight. If you look at the divdi3 function in HD there's a call to DIVS, the 64/32->32 instruction, when the divisor is 32-bit and the result is known not to overflow. The machine I'm on doesn't have a general purpose 64/32->32 instruction, it has the special case mentioned above. The above function for 64/32->32 is my implementation for DIVU in the unsigned case, so I'm trying to work out something similar for DIVS.

I could just forget about the DIVS path, but that's the overwhelmingly common case and I want to hit it as much as possible, it's just a tricky implementation.

Sorry for the unclear pseudo code, but everything is unsigned and mostly 32bit.

DIVU(uint64 a, uint32 b) {
// assume: a / b guaranteed not to overflow
// a = 64bit dividend, a.h & a.l are hi & lo 32bits respectively
// b = 32bit divisor

uint32 q1 = udive(a.h, b)      // (a.h << 32) / b
uint32 r1 = -(q1 * b)          // remainder of the above, shortcut for (a.h << 32) - (q1 * b) since (a.h << 32) & 0xffffffff == 0
uint32 q2 = a.l / b            // a.l / b using regular unsigned division
uint32 r2 = a.l - (q2 * b)     // remainder of the above
uint32 q = q1 + q2
uint32 r = r1 + r2

// r < r2, r overflowed and is >32bits, implies r > b since b is 32bits
// r >= b, quotient too small by 1, adjust
if (r < r2) or (r >= b)
        q = q + 1
return q
}
于 2009-07-07T01:29:15.790 回答
0

可以忽略divdi3调用div的特殊优化案例;我想提请注意的是,当 divdi3 需要进行全强度除法时,它通过调用 udivdi3 而不是通过等效于 udivdi3 算法的有符号除法来实现。

在 Knuth vol 2 中,我看到他也基本上说你通过取绝对值的过程进行有符号除法,进行无符号除法,然后修复符号位。这对我来说很直观,因为有符号的 2s 补码没有 a == (ah * 2^32) + al 无符号数所具有的便利属性,因此通过操作来组装 64 位操作并不容易两半分开。

前后摆弄不应该是无符号除法上的那么多额外循环......

PS:这到底是什么奇怪的CPU?:-)

于 2009-07-07T18:47:49.433 回答
0

我认为,如果明确说明哪些变量是 32 位、哪些是 64 位以及比较是有符号还是无符号,那么您的无符号代码会更容易阅读。

这本书Hacker's Delight通常适合这种低级算术的东西。目前我手头没有副本,但它的代码在 64/32->32 的情况下执行 64/64->64 在线:http://www.hackersdelight.org/HDcode/newCode/divDouble。 C

通过简单地获取输入的绝对值,进行无符号除法,然后如果输入具有不同的符号,则翻转结果位的符号来完成有符号的情况。这向我表明这可能是最好的方法(证明正确肯定比替代方法更容易)。如果它不只是在洗涤中掉出来,您可能需要将股息作为最小可能整数的特殊情况。

于 2009-07-06T22:26:34.390 回答