virtual - 细粒度沙盒

Question

场景：在 Java 或 Python 等字节码虚拟机中运行的程序想要评估（通过即时编译为字节码然后运行）其代码是从外部自动生成或提供的函数。

棘手的一点是函数的代码是不可信的——它可能是由遗传编程等随机方法生成的，甚至是由对手提供的。因此，希望强制它表现为纯函数——它可以返回一个值，但它可能没有任何副作用，即它可能不会以任何方式改变程序的任何现有数据。

另一个棘手的问题是该函数可能有合法需要调用程序的某些现有函数。其中一些函数可能有副作用，但只要它们被可疑函数调用，就应该防止它们实际上产生任何持久的影响。

此外，最好不要对可疑函数的编码风格施加任何限制，例如，它可以自由地对它自己创建的任何数据结构执行破坏性更新，只需要它的整体效果是纯函数式的。

此外，解决方案最好具有相当低的开销，因为这可能需要进行数十亿次；例如，最好避免为每个这样的功能派生一个全新的虚拟机。

这不一定必须在 Java 或 Python 等现有虚拟机中才能实现；如果有必要围绕这个用例设计一个虚拟机，那就这样吧。

对于这个问题，是否有任何已知的解决方案（或非解决方案，即已知不起作用的东西）？

score 1 · Accepted Answer

Well, the general problem seems to be unsolveable: there isn't a terminating strategy for sorting inherently stateful computations from the possibly state-free. Unless the byte code is especially constructed to provide the kind strong type constraints required to overcome this, then you'll be lost. Curien has written much about what kind of things can and can't be inferred from black box observations.

But if you are willing to demand more from your function provider, the question is begging for proof-carrying code (PCC) as an answer. I'm guessing you are aware of the work of Necula, who was particularly concerned with ensuring that assembly code respected constraints on memory usage, such as not tampering with out-of-scope state; you might not be aware of the work done on automatic inference of proofs in fairly common cases: it may be the case that PCC is an easier option than you think.

score 1 · Accepted Answer

我和许多其他人以前为基因编程目的构建了语言。如果构建一种新语言是一种选择，那么这样的解决方案已经存在。由于有自动生成函数的技术，因此提供函数库应该是微不足道的。该系统实际上将构成一个沙盒环境。这些函数的任何副作用都将限于程序可访问的空间。

score 1 · Accepted Answer

我想沙盒是你唯一的选择。试图分析程序并确定它是否安全相当于一个停止问题。CLR 具有内置的安全性，允许这样的限制，我想 Java 也有类似的限制。我认为 Python 不会。

virtual - 细粒度沙盒

3 回答 3

Related

Reference