c - 无堆栈 VM 实现会出现哪些 C 集成问题？

Question

我所说的无堆栈VM是指在堆上维护自己的堆栈而不是使用系统“C-stack”的实现。这有很多优点，如延续和可序列化状态，但在涉及 C 绑定时也有一些缺点，尤其是 C-VM-C 类型的回调（或 VM-C-VM）。

问题是这些缺点到底是什么？任何人都可以举一个真正问题的好例子吗？

score 5 · Accepted Answer

听起来您已经熟悉了一些缺点和优点。

其他一些：a）即使底层实现没有任何支持，也可以支持适当的尾调用优化 b）更容易构造诸如语言级别“堆栈跟踪”之类的东西 c）更容易添加适当的延续，因为你指出

我最近用 C# 编写了一个简单的“Scheme”解释器，它最初使用 .NET 堆栈。然后我重写了它以使用显式堆栈 - 所以以下内容可能会对您有所帮助：

第一个版本使用隐式 .NET 运行时堆栈...

最初，它只是一个类层次结构，不同的形式（Lambda、Let 等）是以下接口的实现：

// A "form" is an expression that can be evaluted with
// respect to an environment
// e.g.
// "(* x 3)"
// "x"
// "3"
public interface IForm
{
    object Evaluate(IEnvironment environment);
}

IEnvironment 看起来如您所料：

/// <summary>
/// Fundamental interface for resolving "symbols" subject to scoping.
/// </summary>
public interface IEnvironment
{
    object Lookup(string name);
    IEnvironment Extend(string name, object value);
}

为了向我的 Scheme 解释器添加“内置”，我最初有以下界面：

/// <summary>
/// A function is either a builtin function (i.e. implemented directly in CSharp)
/// or something that's been created by the Lambda form.
/// </summary>
public interface IFunction
{
    object Invoke(object[] args);
}

那是它使用隐式 .NET 运行时堆栈的时候。代码肯定更少，但不可能添加诸如适当的尾递归之类的东西，最重要的是，我的解释器在运行时错误的情况下能够提供“语言级别”堆栈跟踪是很尴尬的。

所以我将它重写为有一个显式的（堆分配的）堆栈。

我的“IFunction”接口必须更改为以下内容，这样我才能实现“map”和“apply”之类的东西，它们会回调到 Scheme 解释器中：

/// <summary>
/// A function that wishes to use the thread state to
/// evaluate its arguments. The function should either:
/// a) Push tasks on to threadState.Pending which, when evaluated, will
///   result in the result being placed on to threadState.Results
/// b) Push its result directly on to threadState.Results
/// </summary>
public interface IStackFunction
{
    void Evaluate(IThreadState threadState, object[] args);
}

IForm 更改为：

public interface IForm
{
    void Evaluate(IEnvironment environment, IThreadState s);
}

其中IThreadState如下：

/// <summary>
/// The state of the interpreter.
/// The implementation of a task which takes some arguments,
/// call them "x" and "y", and which returns an argument "z",
/// should follow the following protocol:
/// a) Call "PopResult" to get x and y
/// b) Either
///   i) push "z" directly onto IThreadState using PushResult OR
///   ii) push a "task" on to the stack which will result in "z" being
///       pushed on to the result stack.
/// 
/// Note that ii) is "recursive" in its definition - that is, a task
/// that is pushed on to the task stack may in turn push other tasks
/// on the task stack which, when evaluated, 
/// ... ultimately will end up pushing the result via PushResult.
/// </summary>
public interface IThreadState
{
    void PushTask(ITask task);
    object PopResult();
    void PushResult(object result);
}

ITask 是：

public interface ITask
{
    void Execute(IThreadState s);
}

我的主要“事件”循环是：

ThreadState threadState = new ThreadState();
threadState.PushTask(null);
threadState.PushTask(new EvaluateForm(f, environment));
ITask next = null;

while ((next = threadState.PopTask()) != null)
    next.Execute(threadState);

return threadState.PopResult(); // Get what EvaluateForm evaluated to

EvaluateForm 只是一个在特定环境中调用 IForm.Evaluate 的任务。

就个人而言，我发现这个新版本从实现的角度来看更“好”——很容易获得堆栈跟踪，很容易让它实现完整的延续（虽然......我还没有这样做——需要使我的“堆栈”持久链表而不是使用 C# 堆栈，并且 ITask “返回”新的 ThreadState 而不是对其进行变异，以便我可以有一个“调用继续”任务）......等等等等。

基本上，您对底层语言实现的依赖较少。

关于我能找到的唯一缺点是性能......但就我而言，它只是一个解释器，所以我并不关心性能。

我还将向您指出这篇非常好的文章，该文章介绍了将递归代码重写为带有堆栈的迭代代码的好处，作者是 KAI C++ 编译器的作者之一：考虑递归

score 1 · Accepted Answer

在与 Steve Dekorte（Io 编程语言的作者）和 Konstantin Olenin 进行电子邮件交谈后，我发现了一个问题和一个（部分）解决方案。想象一下从 VM 到 C 函数的调用，它回调 VM 方法。在 VM 执行回调期间，VM 状态的一部分位于 VM 之外：在 C 堆栈和寄存器中。如果您此时保存 VM 状态，则可以保证下次加载 VM 时无法正确恢复状态。

解决方案是将 VM 建模为消息接收参与者：VM 可以向本机代码发送异步通知，而本机代码可以向 VM 发送异步通知。也就是说，在单线程环境中，当 VM 获得控制权时，不会在其外部存储任何其他状态（与 VM 运行时无关的数据除外）。

这并不意味着您可以在任何情况下正确恢复 VM 状态，但至少，您可以在其之上构建自己的可靠系统。

c - 无堆栈 VM 实现会出现哪些 C 集成问题？

2 回答 2

Related

Reference