computer-science - 创建最短的图灵完备解释器

Question

我刚刚尝试创建最小的语言解释器。你想加入并尝试吗？

游戏规则：

您应该指定您正在解释的编程语言。如果它是您发明的一种语言，它应该在评论中附带一个命令列表。
您的代码应该从分配给您的代码和数据变量的示例程序和数据开始。
您的代码应以结果输出结束。最好在每个中间步骤都有调试语句。
您的代码应该可以按照编写的方式运行。
您可以假设数据是 0 和 1（整数、字符串或布尔值，您可以选择）并且输出是单个位。
该语言应该是图灵完备的，因为对于在标准模型上编写的任何算法，例如图灵机、马尔可夫链或您选择的类似算法，如何编写执行后的程序是相当明显（或解释）的由您的解释器执行算法。
代码长度定义为去掉输入部分、输出部分、调试语句和不必要的空格后的代码长度。请将生成的代码及其长度添加到帖子中。
您不能使用使编译器为您执行代码的函数，例如eval()或exec()类似的。

这是一个社区 Wiki，这意味着问题和答案都不会从投票中获得声誉积分。但无论如何都要投票！

score 15 · Accepted Answer

Python，250 字节。

这个比 ilya n. 的 Python 解决方案要长一些，但是语言更容易掌握。这种语言中的每个命令（我称之为Kaputt，德语中的“破碎”）都是一个字节。定义了以下 6 个命令：

0：将零压入堆栈
1: 将一个 1 压入堆栈
I：（如果）从堆栈中弹出一个值（必须为零或一）。i如果它是一个，则运行以下代码块（直到“ ”）；如果它是零，则跳过该块。
i: (endif) 结束 if 块，如果块没有运行则压入 1（因为 " I" 弹出一个零）。有关后者的解释，请参见下文。
D: (def) 将要定义的函数的名称从堆栈中弹出（见下文）并将以下块（直到“ d”）绑定到该名称。
d: (enddef) 结束函数定义。
任何其他字节：检查此（一个字节；但请参阅下面的编辑）名称是否存在函数。如果是，运行这个函数；如果不是，则将此字节压入堆栈以供D.

嵌套 if 是合法的；嵌套函数定义不是。i当且仅当相应的 if 块未运行时 (endif) 推送一个这一事实允许以下类似于 if/else/endif 结构的习语：

# [code that left a one or zero on the stack]
I    # abbreviated "(" below
# [code to be run if it was a one]
0iI  # abbreviated "/" below
# [code to be run if it was a zero]
1iIi # abbreviated ")" below
# [continuing...]

请注意，实际上不允许使用注释、换行符、空格等。

以下是一些常用函数的示例。这些使用了( / )上面提到的缩写。

<D(/)d

定义了<从堆栈中弹出一个值而不将其用于任何操作的函数 (pop)。

&D((1/0)/<0)d

定义了函数&（and），它弹出堆栈的两个值，如果两个值都是 1，则压入 1，否则压入 0。

~D((11/10)/(01/00))d

定义~交换栈顶两个值的函数（swap）。

RD(R/<)d

定义函数R(remove) 递归地从堆栈顶部删除所有值，然后再删除两个值（其中顶部的值应该为零）。

以下解释器函数需要程序 p，它是一个字符串（或任何其他可迭代的字节），以及输入堆栈 S，它是一个（可能为空的）字节列表。解释器运行后，此列表包含输出堆栈。

def i(p,S,F=0):
 A=S.append
 F=F or{}
 C=0
 k=[]
 for c in p:
  h=0in k
  if c=="d":C=0
  elif C!=0:C+=[c]
  elif c=="I":k+=[int(h or S.pop())]
  elif c=="i":k.pop()or A("1")
  elif 1-h:
   if c=="D":C=F[S.pop()]=[]
   else:i(F.get(c)or A(c)or"",S,F)

显然，没有错误检查的余地，因此i()需要合法的 Kaputt代码。测试用例：

script = "<D(/)d"                 # < = pop
script += "&D((1/0)/<0)d"         # & = and
script += "~D((11/10)/(01/00))d"  # ~ = swap
script += "RD(R/<)d"              # R = remove
script += "|D(<1/(1/0))d"         # | = or
script=script.replace("(","I")   
script=script.replace("/","0iI")
script=script.replace(")","1iIi") # ( and / and ) are no real commands of the language.

S=[]
i(script+"1111010111111111R",S)
assert S == ["1","1","1","1","0"]

S=[]
i(script+"00&",S)
assert S == ["0"]

S=[]
i(script+"01~",S)
assert S == ["1","0"]

S=[]
i(script+"00|",S)
assert S == ["0"]

S=[]
i(script+"01|",S)
assert S == ["1"]

快乐编码:-)

编辑：解释器中没有任何东西可以强制标记仅为一个字节。要求这样做更多是为了与内置命令保持一致（这是一个字节，因为这会使解释器更短）。如果传递字符串列表而不是字符串，则可以编写更具可读性的 Kaputt代码，如下所示：

script = """
inc D
    (
        (
            0 0
        /
            1 0
        )
    /
        1
    )
d
""".replace("(","I").replace("/","0 i I").replace(")","1 i I i").split()

这定义了一个函数，该函数inc递增堆栈顶部的两位数（顶部的 LSB）。

测试：

for n in xrange(4):
    S=[]
    i(script + [str((n & 2)/2), str(n & 1), "inc"], S)
    assert S == [str(((n+1) & 2)/2), str((n+1) & 1)]

让我们将多字节函数名称称为语言扩展:-)

score 9 · Accepted Answer

9

#include "/dev/tty"

于 2009-08-12T03:41:38.907 回答

score 9 · Accepted Answer

解释我刚刚创建的语言的 Python 程序：

from random import randint
z = [randint(0,1), None]  # example: input data is 0 or 1
x = '_b_ed_a_i____d_ai'   # this program will set p = data[1] = not data[0]

# input x[i], data z[k]
# jumper j
# p is +1 or -1 each step
# commands:
#    a   set data = 0 or 1
#    b   j += 0 or +-9 depending on data
#    d   move data left or right
#    e   perform jump left or right
#    j   set jumper to 0
#    i   end program
# output: p or some data[x], both possible

g = globals()
p = i = 1
a = b = d = j = e = k = 0
while i:
    h = x[i]
    g[h] = p = -p
    i += 1 + j*e
    if a: z[k] = i % 2
    if z[k]: j += 9*b
    k += d
    g[h] = 0        
#    print(a, b, d, e, i, j, k, h, z)

print('Input:', z, 'Output:', p > 0)

优化版：

g=globals()
p=i=1
a=b=d=j=e=k=0
while i:
 h=x[i]
 g[h]=p=-p
 i+=1+j*e
 if a:z[k]=i%2
 if z[k]:j+=9*b
 k+=d
 g[h]=0

114 字节

更新：我想补充一点，我的程序的重点不是创建任何实用的语言，甚至不是以某种方式拥有“最佳”（==“最短”）解释器，而是展示有趣的编程技巧。例如，我依靠通过直接访问全局变量globals()，因此我什至从不测试j命令，节省了宝贵的字节:)

score 7 · Accepted Answer

使用A86组装下面的代码以获得 150 字节的 BF 解释器！

    add dh,10h
    push dx
    add dh,10h
    push dx
    mov bl,80h
    lea dx,[bx+2]
    add bl,byte ptr [bx]
    mov byte ptr [bx+1],al
    mov ah,3dh
    int 21h
    pop ds,es
    jc l14
    mov bx,ax
    mov ah,3fh  
    mov cx,di
    xor dx,dx
    int 21h
    jc l14
    mov bx,ax
    xor ax,ax
    rep stosw
    xor di,di
    xor si,si
    inc ch
l1:
    cmp si,bx
    jae l14
    lodsb
    mov bp,8
    push l1
l2: 
    dec bp
    js ret
    cmp al,[bp+l15]
    jne l2
l3:
    mov cl,[bp+8+l15]
    jmp cx
l4:
    inc di  
    ret
l5:
    dec di
    ret
l6:
    es inc byte ptr [di]
    ret
l7:
    es dec byte ptr [di]
    ret
l8:
    mov ah,2
    es mov dl,[di]
    int 21h
    ret
l9:
    mov ah,8
    int 21h
    es mov [di],al
    ret
l10:
    es cmp byte ptr [di],dh
    jne ret
l11:    
    cmp si,bx
    jae l14
    lodsb
    cmp al,']'
    jne l11
    ret
l12:
    es cmp byte ptr [di],dh
    je ret
l13:    
    dec si
    jz l14
    cmp byte ptr [si-1],'['
    jne l13
l14:
    ret
l15:
    db '>'
    db '<'
    db '+'
    db '-'
    db '.'
    db ','
    db '['
    db ']'
    db offset l4-100h
    db offset l5-100h
    db offset l6-100h
    db offset l7-100h
    db offset l8-100h
    db offset l9-100h
    db offset l10-100h
    db offset l12-100h

在命令行上指定一个文件名，不需要双引号，作为 BF 源。

score 6 · Accepted Answer

6

不是我的，但看看这里的 210位二进制 lambda 演算自我解释器。

于 2009-08-09T18:29:24.793 回答

score 4 · Accepted Answer

用Perl编写，140 个字符长，包括 shell 调用和标志：

perl -ne'push@a,split;if(eof){$i=0;while($i>=0){($a,$b,$c)=@a[$i..$i+2];$b>-1?$a[$b]-=$a[$a]:print chr$a[$a];$i=$b>-1&&$a[$b]<=0?$c:$i+3;}}'

可读版本：

#!/usr/bin/perl -n
push @prog, split /\s+/, $_;
if(eof) {
  $i = 0;
  while($i >= 0) {
    ($a, $b, $c) = @prog[$i .. $i + 2];
    if($b > -1) {
      $prog[$b] -= $prog[$a];
    } else {
      print chr $prog[$a];
    }
    if($b > -1 and $prog[$b] <= 0) {
      $i = $c;
    } else {
      $i + 3;
    }
  }
}

以上是使用该指令的单指令集计算机的伪机器代码解释器subleq。基本上，源代码应该只是一堆用空格分隔的数字。一个简单的测试程序来验证我的结果（应该打印“Hi”和一个 Unix 换行符）：

0 0 6 72 105 10 3 -1 9 4 -1 12 5 -1 15 0 0 -1

测试输入的可读版本（同样有效）：

0    0     6
72   105   10
3   -1     9
4   -1     12
5   -1     15
0    0    -1

score 4 · Accepted Answer

我自己的条目，在Ruby中实现 OISC。它有 85 个字节长，我相信可以通过一些巧妙的技巧再挤出一些。程序必须在代码中包含数据（以空格分隔的数字行）。目前我无法提供用 OISC 编写的工作程序，但我稍后会提供。

p,m=0,gets.chomp.split.map(:to_i)
while p>0
p=(m[m[b]]-=m[m[a]])>0?p+3:c
end
$><<m[0]

代码非常简单。m是包含程序和数据的“内存”。m第一行使用提供的代码和p- 内存指针进行初始化。主循环是subleq操作，用三元运算符编写。最后一行是输出内存中包含的数字的聪明方法。

score 2 · Accepted Answer

在我之前的代码高尔夫条目的基础上，这里有一个（稍微概括为 IO）OISC仿真器

Fortran77

模糊且没有装载脚手架（655 个字符）：

  subroutine o(n,m)
  integer m(n)              
  l=1;                      
  do while (l.ne.0)
     i=m(l)
     j=m(l+1)
     k=m(l+2)
     mi=mg(n,m,i)
     mj=mg(n,m,j)
     if(j.eq.(n+2)) then
        write(6,*)mj-mi
     else
        m(l+1)=mj-mi
     endif
     if (m(l+1).lt.0) then
        l=mg(n,m,k)
     else
        l=l+3
     endif
  end do
  return
  end
  function mg(n,m,i)
  integer m(n)
  if (i.eq.n+2) then
     read(5,*)mg
  elseif (i.eq.n+1) then
     mg=0
  else
     mg=m(i)
  endif
  return
  end

带有注释、加载脚手架等（2435 个字符）：

      program c
      parameter (n=1024)        ! The size to use for memeory
      integer m(n)              ! represent the memory
c     Load a program into memory
      i=1
 1    read(5,*,end=2)l
c     write (6,*) "Load ",l," into location ",i
      m(i)=l
      i=i+1
      goto 1
c     Run the computer
 2    call o(n,m)
      stop
      end

      subroutine o(n,m)
c     Simulate a simple computer that supports only a single
c     instruction. Memory is a fixed size array of integers.
c
c     The supported instruction is subtract-branch-negative which we
c     give the assembly syntax
c
c     sbn i j k
c
c     and acts by subtracting the value in memeory location i from that
c     in location j and storing the result in j. If the result is
c     negative, the PC is set to k. Because there is only one opcode, it
c     is not represented, so a program consists simply of a series of
c     triplet (i,j,k), and the PC is generally incremented by 3. The
c     program counter starts at 1
c
c     Povisions for IO and halting are by way of special memory
c     location:
c
c     * PC=0 means halt
c     * writes (opcode argument j) to memory location n+1 send
c       output, reads from n+1 evaluate as 0
c     * reads (opcode argument i, j, k) from memory location n+2 fetch
c       input, writes to n+2 are forbidden
c     * All others reads and writes outside of memeory are forbidden

c     n                         ! the size of the computers memory 
      integer m(n)              ! an array representing the computers memory
      l=1;                      ! program counter
      do while (l.ne.0)
         i=m(l)
         j=m(l+1)
         k=m(l+2)
c        write (6,*) "Executing from PC=",l,": (",i,", ",j,", ", k,")"
c        handle the write special case for output
         mi=mg(n,m,i)
         mj=mg(n,m,j)
         if(j.eq.(n+1)) then
            write(6,*)mj-mi
         else
c           write (6,*) "Setting location ",j," to ",mj-mi
            m(l+1)=mj-mi
         endif
         if (m(l+1).lt.0) then
            l=mg(n,m,k)
         else
            l=l+3
         endif
      end do
      return
      end

c     Handle the various read special cases
      function mg(n,m,i)
      integer m(n)
      if (i.eq.n+2) then
         read(5,*)mg
      elseif (i.eq.n+1) then
         mg=0
      else
         mg=m(i)
      endif
c     write (6,*) "Read ",mg," from location ",i
      return
      end

示例程序：

导致输出：

$ cat trivial.oisc | ./a.out 
           1
          -1

这是预期的。

它确实是非常简单的代码。这里的重点不是我可以编写多紧密的代码，而是图灵完整性所需的语言有多简单。

score 1 · Accepted Answer

106 字节的解决方案已发布到codegolf.com比赛。它用 perl 编写并解释Brainfuck。在这一点上没有办法审查它，afaik。

这不是我的解决方案，但我相信它比当前条目短。可能的解决方案应该更短，因为不需要使用 Brainfuck 作为图灵完备的语言。

score 0 · Accepted Answer

CoffeeScript中的 URM解释器， 143 个字节（167 个新行）。

此版本的 URM 包含无限数量的寄存器，包括零、后继和跳转运算符。众所周知，这是图灵完备的。

URM 程序写在数组c （命令）中，输入在数组r（寄存器）中。计算后输出为 atr[0]并显示该值。

解释器，带有示例程序和计算 32+13（实际上输出 45）的输入：

# Addition program, similar to http://planetmath.org/examplesofunlimitedregistermachines
c = [['j', 1, 2, 4], ['s', 0], ['s', 2], ['j', 1, 1, 0]]

# Input: 32, 13, thus the desired output is: 45
r = [32, 13]

k = 0
while k < c.length
    t = c[k]
    k += 1
    if t[0] == 'z'
        r[t[1]] = 0
    if t[0] == 's'
        if !r[t[1]]?
            r[t[1]] = 0
        r[t[1]] += 1
    if t[0] == 'j' && r[t[1]] == r[t[2]]
        k = t[3]
alert r[0]

缩小版：

k=0
while k<c.length
 t=c[k]
 k+=1
 if t[0]=='z'
  r[t[1]]=0
 if t[0]=='s'
  if !r[t[1]]?
   r[t[1]]=0
  r[t[1]]+=1
 if t[0]=='j'&&r[t[1]]==r[t[2]]
  k=t[3]
alert r[0]

我真正喜欢这段代码的是它非常简单易懂。

score -1 · Accepted Answer

自定义语言：SLA
关键字：
i - 解释 SLB q - 退出

自定义语言：SLB
关键字：
cp("text") - 将文本解释为 C 程序

示例：
cp("#include \nint main() { puts(\"Hi!\n\");return 0}")

口译员（以 SLA 编写）： i q

3个字！

computer-science - 创建最短的图灵完备解释器

11 回答 11

Fortran77

Related

Reference