c - 八进制/十六进制符号从何而来？

Question

这么久了，我从来没想过要问这个问题；我知道这来自 c++，但它背后的原因是什么：

像往常一样指定十进制数
用前导 0 指定八进制数
用前导 0x 指定十六进制数

为什么是0？为什么是 0x？base-32 有自然的进展吗？

score 22 · Accepted Answer

C, the ancestor of C++ and Java, was originally developed by Dennis Richie on PDP-8s in the early 70s. Those machines had a 12-bit address space, so pointers (addresses) were 12 bits long and most conveniently represented in code by four 3-bit octal digits (first addressable word would be 0000octal, last addressable word 7777octal).

Octal does not map well to 8 bit bytes because each octal digit represents three bits, so there will always be excess bits representable in the octal notation. An all-TRUE-bits byte (1111 1111) is 377 in octal, but FF in hex.

Hex is easier for most people to convert to and from binary in their heads, since binary numbers are usually expressed in blocks of eight (because that's the size of a byte) and eight is exactly two Hex digits, but Hex notation would have been clunky and misleading in Dennis' time (implying the ability to address 16 bits). Programmers need to think in binary when working with hardware (for which each bit typically represents a physical wire) and when working with bit-wise logic (for which each bit has a programmer-defined meaning).

I imagine Dennis added the 0 prefix as the simplest possible variation on everyday decimal numbers, and easiest for those early parsers to distinguish.

I believe Hex notation 0x__ was added to C slightly later. The compiler parse tree to distinguish 1-9 (first digit of a decimal constant), 0 (first [insignificant] digit of an octal constant), and 0x (indicating a hex constant to follow in subsequent digits) from each other is considerably more complicated than just using a leading 0 as the indicator to switch from parsing subsequent digits as octal rather than decimal.

Why did Dennis design this way? Contemporary programmers don't appreciate that those early computers were often controlled by toggling instructions to the CPU by physically flipping switches on the CPUs front panel, or with a punch card or paper tape; all environments where saving a few steps or instructions represented savings of significant manual labor. Also, memory was limited and expensive, so saving even a few instructions had a high value.

In summary: 0 for octal because it was efficiently parseable and octal was user-friendly on PDP-8s (at least for address manipulation)

0x for hex probably because it was a natural and backward-compatible extension on the octal prefix standard and still relatively efficient to parse.

score 9 · Accepted Answer

八进制的零前缀和十六进制的 0x 来自 Unix 的早期。

八进制存在的原因可以追溯到硬件具有 6 位字节的时候，这使得八进制成为自然的选择。每个八进制数字代表 3 位，因此 6 位字节是两个八进制数字。十六进制也是如此，从 8 位字节开始，其中一个十六进制数字是 4 位，因此一个字节是两个十六进制数字。对 8 位字节使用八进制需要 3 个八进制数字，其中第一个数字只能有值 0、1、2 和 3（第一个数字实际上是“四进制”，而不是八进制）。除非有人开发了一个字节长度为 10 位的系统，否则没有理由使用 base32，因此一个 10 位字节可以表示为两个 5 位“nybbles”。

score 5 · Accepted Answer

“新”数字必须以数字开头，才能使用现有语法。

已建立的实践具有以字母（或一些其他符号，可能是下划线或美元符号）开头的变量名称和其他标识符。所以“a”、“abc”和“a04”都是名字。数字以数字开头。所以“3”和“3e5”是数字。

当你在编程语言中添加新东西时，你试图让它们适应现有的语法、语法和语义，并试图让现有的代码继续工作。因此，您不希望更改语法以使“x34”成为十六进制数或“o34”成为八进制数。

那么，如何将八进制数字放入此语法中？有人意识到，除了“0”之外，不需要以“0”开头的数字。没有人需要为 123 写“0123”。所以我们使用前导零来表示八进制数字。

十六进制数字呢？您可以使用后缀，因此“34x”表示 34 ₁₆。然而，解析器在知道如何解释数字之前必须一直读到数字的末尾（除非它遇到“a”到“f”数字之一，这当然表示十六进制）。解析器“更容易”知道数字是早期的十六进制。但是你仍然必须从一个数字开始，并且已经使用了零技巧，所以我们需要其他东西。选择了“x”，现在我们有十六进制的“0x”。

（以上内容基于我对解析的理解和一些关于语言开发的一般历史，而不是基于对编译器开发人员或语言委员会做出的具体决定的了解。）

score 4 · Accepted Answer

我不知道 ...

0 代表 0ctal

0x 是，嗯，我们已经用 0 来表示八进制，并且有一个十六进制的 x，所以在那里也有

至于自然进展，最好看看最新的可以附加下标的编程语言，例如

123_27（将_解释为下标）

等等

?

标记

score 2 · Accepted Answer

base-32 有自然的进展吗？

这就是为什么 Ada 使用 16# 形式来引入十六进制常量、8# 表示八进制、2# 表示二进制等的部分原因。

不过，我不会太担心自己在基地中需要“未来增长”的空间。这不像 RAM 或寻址空间，每一代都需要一个数量级。

事实上，研究表明八进制和十六进制几乎是二进制兼容的人类可读表示的最佳选择。如果你低于八进制，它开始需要大量的数字来表示更大的数字。如果你高于十六进制，数学表会变得非常大。十六进制实际上已经有点太多了，但是八进制的问题是它不能均匀地放入一个字节中。

score 1 · Accepted Answer

Base32有一个标准编码。它与Base64非常相似。但是阅读起来不是很方便。使用十六进制是因为 2 个十六进制数字可用于表示 1 个 8 位字节。八进制主要用于使用12 位字节的旧系统。与将原始寄存器显示为二进制相比，它可以更紧凑地表示数据。

还应注意，某些语言使用 o### 表示八进制，使用 x## 或 h## 表示十六进制，以及许多其他变体。

score -1 · Accepted Answer

我认为它0x实际上是为 UNIX/Linux 世界而来的，并被 C/C++ 和其他语言所采用。但我不知道确切的原因或真正的起源。

c - 八进制/十六进制符号从何而来？

7 回答 7

Related

Reference