math - 64 位算术，16 位字长（跨 4 个字存储 64 位数字）

Question

我目前正在为 16 位定制微处理器编写软件。我需要能够处理大于 16 位的数字。指令集非常有限，但有左移、右移、AND、OR 和 NOT。我将如何实现这种类型的算术？

score 2 · Accepted Answer

对于加法和减法，您确实将溢出从一个词带到另一个词。对于乘法，http ://en.wikipedia.org/wiki/Binary_multiplier中描述的技术相当容易实现，并且比串行加法更好。除法通常更复杂，但一个很好的起点是http://en.wikipedia.org/wiki/Division_algorithm#Integer_division_.28unsigned.29_with_remainder。

该除法算法实际上只是一个整数“长除法”。不是特别快，但很准确。这篇文章有更多涉及的方法，但它们……嗯，涉及更多。可能先做一个简单的，然后看看你是否需要优化它。

另请查看http://en.wikipedia.org/wiki/Multiplication_algorithm。http://x86asm.net/articles/working-with-big-numbers-using-x86-instructions/有一些关于 x86 处理器上的多字算术的很好的信息，其中大部分应该适用于其他 CPU。

我已经很久没有玩过这些东西了。在 Z80 上完成了 32 位算术（很多很多年前），我知道你在这里遇到了什么。

score 0 · Accepted Answer

如果有人想要代码来执行此操作，我现在正在编写它。截至上周，我有 64 位乘法运算，所以现在我正在研究 64 位除法运算。该代码将在https://github.com/drtonyr上或给我发送电子邮件。联系我是最好的方法，因为下面的代码是非标准的（我打算编写/构建所有东西，包括焊接晶体管），但如果它有帮助，那么它看起来像这样（在类似 Forth 的语言中）：

: 4mul pick(7) pick(4)        # grab the sign words
  xor mvDC                    # compute final sign and put on call stack
  4abs 4swap 4abs 4maxmin     # abs swap so that second is smallest
  callocC(4)                  # zero space on call stack for result
  allocC(4)                   # space for smallest multiplier
  storeC(1) storeC(2) storeC(3) storeC(4) # smallest to C in reverse order
  C add(8) D sub(2)              # set up pointers for main _4mul_word calcs
  mvCD _4mul_word _4mul_word_asl # mul X0
  mvCD _4mul_word _4mul_word_asl # mul X1
  mvCD _4mul_word _4mul_word_asl # mul X2
  mvCD _4mul_word                # mul X3
  ndrop(7)                    # don't need X3, pointers or shift space anymore
  4mvCD                       # move result to data stack
  mvCD <0 if? 4negate         # set correct sign
  ;                           # and exit happy

其中大部分是数据移动，但想法是得到一个 64 位的结果和最大的数字，每个周期向上移动一个，然后挑选最小数字的位，如果该位被设置，则添加到累加器中，这是通过 _4mul_word 实现的（_4mul_word_asl 只是在最高位为零时完成移位）。有一天我会进一步优化，因为最小的数字可能在顶部有很多零，一旦你处理了最小数字的最高位，整个事情就会停止。_4mul_word 在下面，但这里的主要内容是（a）你可以做到，（b）它需要做很多工作并且运行缓慢。

_4mul_word:  # top of stack contains the top address of accumulator (D),
             # the top address of partial shifted input (A) and
             # the word to multiply (B).
             # This runs over all non-zero bits.
  *C = E, C -= I  # store E
  *C = D, C -= I  # store D - we will leave stack state unchanged

  B = *D, D -= I  # pop the main word we are going to work with
  A = *D, D -= I  # Address of partially shifted
  D = *D, D -= Z  # Address of accumulator

  _4mul_word_loop:
    Z = B & I         # test a single bit
    eq0 P = *P, P += I  # skip if bit is zero
    :_4mul_word_bitzero
      *C = A, C -= I  # store A as we are going to change it
      *C = D, C -= I  # store D as we are going to change it
      *C = B, C -= Z  # store B as we are going to change it

                      ## add in the shifted value (A) to accumulator (D)
      B = *D, D -= Z  # read X0
      E = *A, A -= I  # read Y0
      B = B + E       # compute Z0
      *D = B, D -= I  # and store it
      B = *D, D -= Z  # read X1
      E = *A, A -= I  # read Y1
      cin B = B + E   # compute Z1
      *D = B, D -= I  # and store it
      B = *D, D -= Z  # read X2
      E = *A, A -= I  # read Y2
      cin B = B + E   # compute Z2
      *D = B, D -= I  # and store it
      *D = B, D -= I  # and store it
      B = *D, D -= Z  # read X3
      E = *A, A -= I  # read Y3
      cin B = B + E   # compute Z3
      *D = B, D -= I  # and store it

      B = *C, C += I  # restore B
      D = *C, C += I  # restore D
      A = *C, C += Z  # restore A
    _4mul_word_bitzero:

                    ## shift up the contents of A
    *C = A, C -= Z  # store A as we are going to change it
    E = *A, A -= Z  # read X0
    E = E + E       # compute Z0
    *A = E, A -= I  # and store it
    E = *A, A -= Z  # read X1
    cin E = E + E   # compute Z1
    *A = E, A -= I  # and store it
    E = *A, A -= Z  # read X2
    cin E = E + E   # compute Z2
    *A = E, A -= I  # and store it
    E = *A, A -= Z  # read X3
    cin E = E + E   # compute Z3
    *A = E, A -= I  # and store it
    A = *C, C += Z  # restore A

    B = B >> I      # shift down B
    ne0 P = *P, P += I  # loop if we haven't yet shifted down B to zero
    :_4mul_word_loop

  C = C + I
  D = *C, C += I  # restore D
  E = *C, C += Z  # restore E
  P = *E, E += I  # execute the NEXT code

math - 64 位算术，16 位字长（跨 4 个字存储 64 位数字）

2 回答 2

Related

Reference