compiler-construction - Bootstrapping a compiler: why?

Question

I understand how a language can bootstrap itself, but I haven't been able to find much reference on why you should consider bootstrapping.

The intuitive answer is that the language you're writing offers utilities that are not found in the "base" language of the compiler, and the language's features are relatively well-suited for a compiler.

For instance, it would make sense to bootstrap a C++ compiler -- it could potentially be much easier to maintain the compiler when OOP is properly used, as opposed to using plain C.

On the other hand, MATLAB certainly makes matrix math a lot easier than plain C, but I can't see any apparent benefits from writing a MATLAB compiler/interpreter in MATLAB -- it seems like it would become less maintainable. A similar view could be applied to the R programming language. Or a pretty extreme example would be bootstrapping Whitespace, which is written in Haskell -- definitely a massive superset of Whitespace.

Is the only reason for bootstrapping to take advantage of the new language's features? I know there's also the "because we can" reason, but that's not what I'm looking for :)

score 36 · Accepted Answer

There's a principle called "eating your own dogfood". By using a tool, you demonstrate the usefulness of the tool.

It is often asked, "if the compiler for language X isn't written in language X, why should I risk using it?"

This of course only applies to languages suitable for the domain of compiler writing.

score 17 · Accepted Answer

There are two main advantages to bootstrapped language implementations. First, as you suggest, you can take advantage of the language's high-level features in the implementation. A less obvious but no less important advantage is that it lets you customize and extend the language without dropping into a lower layer written in C (or Java, or whatever sits below the new language runtime).

Metaprogramming may not be useful for most day-to-day tasks, but there are times when it can save you a lot of duplicated or boilerplate code. Being able to hook into the compiler and core runtime for a language at a high level can make advanced metaprogramming tasks much easier.

score 11 · Accepted Answer

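Ken Thompson's "Reflections on Trusting Trust" explains one of the best reasons for bootstrapping. Essentially, at every compiler version in the bootstrapping chain, your compiler learns something new that you will never have to teach it again.

In the case he mentions, the first compiler you write (C1) has to be told explicitly how to handle backslash escapes. The second compiler (C2), however, is compiled with C1, so its source can rely on backslash escapes being handled natively.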
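To make the C1/C2 step concrete, here is a minimal sketch in Python rather than Thompson's C. The function names `stage1_escape`, `stage2_escape`, and `decode` are hypothetical, chosen only for illustration, and the two "stages" are ordinary functions standing in for two generations of a compiler's escape-handling code. The point it tries to show: the first generation has to spell out the character codes, while the later generation can just write the escape itself, because whatever builds it already knows what the escape means.

```python
# Toy illustration only: two generations of escape-handling code.
# All names below are hypothetical and not taken from any real compiler.

def stage1_escape(ch: str) -> str:
    # First generation (C1): nothing underneath knows what backslash-n means,
    # so the character codes are spelled out explicitly.
    table = {"n": chr(10), "t": chr(9), "\\": chr(92)}
    return table[ch]


def stage2_escape(ch: str) -> str:
    # Later generation (C2): the source can use the escapes directly, because
    # the tool that builds this code already maps them to the right codes.
    table = {"n": "\n", "t": "\t", "\\": "\\"}
    return table[ch]


def decode(literal: str, escape) -> str:
    # Expand backslash escapes in `literal` using the given escape handler.
    out = []
    chars = iter(literal)
    for ch in chars:
        if ch == "\\":
            out.append(escape(next(chars)))  # translate the escaped character
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    src = r"a\tb\n"
    # Both generations agree, but only stage 1 mentions the numeric codes.
    assert decode(src, stage1_escape) == decode(src, stage2_escape)
    print([ord(c) for c in decode(src, stage2_escape)])  # [97, 9, 98, 10]
```

Once the numeric codes have been baked into one generation, they no longer need to appear anywhere in the source of the next one, which is the "never have to teach it again" effect described above.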

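The cornerstone of his talk is that you could teach a compiler to add a backdoor to programs, that future compilers built with the compromised compiler would inherit this ability, and that it would never appear in the source code!

Essentially, your program can learn new features at each compile cycle, and they do not have to be reimplemented or recompiled in later cycles, because your compiler already knows all about them.

Take a minute to realize the ramifications.

[edit]: This is a pretty terrible way to build a compiler, but the cool factor is through the roof. I wonder if it could be managed with the right framework?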

score 10 · Accepted Answer

It can be considered the bar that separates "toy" languages from "real" languages. If the language isn't rich enough to implement itself, it's still a toy. But this is probably an attitude from a largely bygone era, given the number of popular languages today that are implemented in C.

score 8 · Accepted Answer

One advantage would be that developers working on the compiler would only need to know the language being compiled. Otherwise developers would need to know the language being compiled as well as the language the compiler is written in.

score 3 · Accepted Answer

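Compilers tackle a wide variety of non-trivial problems, including string manipulation, handling large data structures, and interfacing with the operating system. If your language is intended to handle those things, then writing the compiler in your language demonstrates those capabilities. It also creates a compounding effect: as your language gains features, you have more features to use in the compiler. If you implement any unique feature that makes compiler writing easier, you can use that new tool to implement even more features.

However, if your language is not intended to handle the same problems as compilation, then bootstrapping will only tempt you to clutter the language with features that are related to compilation but not to your target problem. Self-compilation with Matlab or SQL would be ridiculous; Matlab has no reason to include strong string manipulation functions, and SQL has no reason to support code generation. The resulting language would be unnecessary and cluttered.

It is also worth noting that interpreted languages are a slightly different problem and should be treated accordingly.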

score 3 · Accepted Answer

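Low-level languages are often bootstrapped because you need a low-level compiler in order to put code on a new system. Get a C compiler on there and you suddenly have a great deal of code available. Having a bootstrapped compiler makes this easier: you only need your own code to be present in order to compile and improve your own code.

There are other ways to achieve this, such as building a cross compiler. On most systems you never need to compile a static language on the device itself in everyday use (in fact, systems like Windows do not ship with a compiler).

Another reason compilers are often bootstrapped is so that they do not have to worry about bugs in some other compiler. Making sure your compiler can compile itself limits the combinations of bugs that could appear when it is built with a different compiler.

I think bootstrapping a high-level language is mostly about showing off one's programming chops.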

score 2 · Accepted Answer

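You don't bootstrap a compiler for a DSL. You don't write a SQL query compiler in SQL. MATLAB may look like a general-purpose language, but it really isn't: it is a language designed specifically for numerical computation.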

score 2 · Accepted Answer

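Bootstrapping has another advantage: if your language is good, you save time by writing the compiler in <insert language here> rather than in C. For example, the C# compiler is written in C++, but they are now rewriting it in C#. Among other things, this lets them use the threading framework in the CLR instead of rolling their own in C++. On the marketing side, it also follows the lead of the Mono guys, who are in a better position because they can say their C# compiler is actually written in C#.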

score 2 · Accepted Answer

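As a concrete example, in version 1.5 (released August 2015), Go switched to being a fully bootstrapped language [1] [2]. They listed the following reasons:

Go is easier to write (correctly) than C.
Go is easier to debug than C (even without a debugger).
Go is the only language you need to know; this encourages contributions.
Go has better modularity, tooling, testing, profiling, ...
Go makes parallel execution trivial.

Of these, the only one that applies to every language is that you only need to know one language to contribute to the compiler. The other arguments boil down to "our new language is better than the old one." That is presumably true, otherwise why would you be writing a new language?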

score 1 · Accepted Answer

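There are a few reasons you might want to do this (in theory):

Your compiler generates more optimized code than the other compilers on the bootstrap platform.
Your compiler generates more correct code than the other compilers on the bootstrap platform.
You are an egotistical jerk who is convinced one of the above is true, even when it isn't.
There is no compiler available on your platform (this was the original rationale for GCC, since many platforms had no C compiler at the time).
You want to prove that your compiler can handle it (this is, after all, actually a pretty good test of a compiler).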

score 0 · Accepted Answer

Bootstrapping is expected of a general-purpose programming language.

Answered 2022-02-07T05:27:58.953
