我目前正在为 16 位定制微处理器编写软件。我需要能够处理大于 16 位的数字。指令集非常有限,但有左移、右移、AND、OR 和 NOT。我将如何实现这种类型的算术?
2 回答
对于加法和减法,您确实将溢出从一个词带到另一个词。对于乘法,http ://en.wikipedia.org/wiki/Binary_multiplier中描述的技术相当容易实现,并且比串行加法更好。除法通常更复杂,但一个很好的起点是http://en.wikipedia.org/wiki/Division_algorithm#Integer_division_.28unsigned.29_with_remainder。
该除法算法实际上只是一个整数“长除法”。不是特别快,但很准确。这篇文章有更多涉及的方法,但它们……嗯,涉及更多。可能先做一个简单的,然后看看你是否需要优化它。
另请查看http://en.wikipedia.org/wiki/Multiplication_algorithm。http://x86asm.net/articles/working-with-big-numbers-using-x86-instructions/有一些关于 x86 处理器上的多字算术的很好的信息,其中大部分应该适用于其他 CPU。
我已经很久没有玩过这些东西了。在 Z80 上完成了 32 位算术(很多很多年前),我知道你在这里遇到了什么。
如果有人想要代码来执行此操作,我现在正在编写它。截至上周,我有 64 位乘法运算,所以现在我正在研究 64 位除法运算。该代码将在https://github.com/drtonyr上或给我发送电子邮件。联系我是最好的方法,因为下面的代码是非标准的(我打算编写/构建所有东西,包括焊接晶体管),但如果它有帮助,那么它看起来像这样(在类似 Forth 的语言中):
: 4mul pick(7) pick(4) # grab the sign words
xor mvDC # compute final sign and put on call stack
4abs 4swap 4abs 4maxmin # abs swap so that second is smallest
callocC(4) # zero space on call stack for result
allocC(4) # space for smallest multiplier
storeC(1) storeC(2) storeC(3) storeC(4) # smallest to C in reverse order
C add(8) D sub(2) # set up pointers for main _4mul_word calcs
mvCD _4mul_word _4mul_word_asl # mul X0
mvCD _4mul_word _4mul_word_asl # mul X1
mvCD _4mul_word _4mul_word_asl # mul X2
mvCD _4mul_word # mul X3
ndrop(7) # don't need X3, pointers or shift space anymore
4mvCD # move result to data stack
mvCD <0 if? 4negate # set correct sign
; # and exit happy
其中大部分是数据移动,但想法是得到一个 64 位的结果和最大的数字,每个周期向上移动一个,然后挑选最小数字的位,如果该位被设置,则添加到累加器中,这是通过 _4mul_word 实现的(_4mul_word_asl 只是在最高位为零时完成移位)。有一天我会进一步优化,因为最小的数字可能在顶部有很多零,一旦你处理了最小数字的最高位,整个事情就会停止。_4mul_word 在下面,但这里的主要内容是(a)你可以做到,(b)它需要做很多工作并且运行缓慢。
_4mul_word: # top of stack contains the top address of accumulator (D),
# the top address of partial shifted input (A) and
# the word to multiply (B).
# This runs over all non-zero bits.
*C = E, C -= I # store E
*C = D, C -= I # store D - we will leave stack state unchanged
B = *D, D -= I # pop the main word we are going to work with
A = *D, D -= I # Address of partially shifted
D = *D, D -= Z # Address of accumulator
_4mul_word_loop:
Z = B & I # test a single bit
eq0 P = *P, P += I # skip if bit is zero
:_4mul_word_bitzero
*C = A, C -= I # store A as we are going to change it
*C = D, C -= I # store D as we are going to change it
*C = B, C -= Z # store B as we are going to change it
## add in the shifted value (A) to accumulator (D)
B = *D, D -= Z # read X0
E = *A, A -= I # read Y0
B = B + E # compute Z0
*D = B, D -= I # and store it
B = *D, D -= Z # read X1
E = *A, A -= I # read Y1
cin B = B + E # compute Z1
*D = B, D -= I # and store it
B = *D, D -= Z # read X2
E = *A, A -= I # read Y2
cin B = B + E # compute Z2
*D = B, D -= I # and store it
*D = B, D -= I # and store it
B = *D, D -= Z # read X3
E = *A, A -= I # read Y3
cin B = B + E # compute Z3
*D = B, D -= I # and store it
B = *C, C += I # restore B
D = *C, C += I # restore D
A = *C, C += Z # restore A
_4mul_word_bitzero:
## shift up the contents of A
*C = A, C -= Z # store A as we are going to change it
E = *A, A -= Z # read X0
E = E + E # compute Z0
*A = E, A -= I # and store it
E = *A, A -= Z # read X1
cin E = E + E # compute Z1
*A = E, A -= I # and store it
E = *A, A -= Z # read X2
cin E = E + E # compute Z2
*A = E, A -= I # and store it
E = *A, A -= Z # read X3
cin E = E + E # compute Z3
*A = E, A -= I # and store it
A = *C, C += Z # restore A
B = B >> I # shift down B
ne0 P = *P, P += I # loop if we haven't yet shifted down B to zero
:_4mul_word_loop
C = C + I
D = *C, C += I # restore D
E = *C, C += Z # restore E
P = *E, E += I # execute the NEXT code