javascript - How to handle negative or unsigned chars when porting from C/C++ to JavaScript?

Question

I am trying to port an old C++ lexer (source) to JavaScript, and I am struggling a bit because of my limited understanding of C/C++.

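I have a parameter c which, as far as I can tell so far, can be either a position index into the chunk of the input file I am parsing (*yy_cp) or the actual character (including nul) stored at that address. I need to use c as an index into a lookup table. The lexer does this: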

/* Promotes a possibly negative, possibly signed char to an
 * unsigned integer for use as an array index.  If the signed char
 * is negative, we want to instead treat it as an 8-bit unsigned
 * char, hence the double cast.
 */
#define YY_SC_TO_UI(c) ((unsigned int) (unsigned char) c)

and calls it like this:

register YY_CHAR yy_c = yy_ec[YY_SC_TO_UI(*yy_cp)];

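This stores in yy_c a value from the lookup table yy_ec, which contains 256 entries (extended ASCII, I assume). The position to look up is produced by YY_SC_TO_UI, and this is where I get lost when porting it to JavaScript. YY_SC_TO_UI must return a number between 0 and 255, so do I just take what I have and do: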

 "[c]".charCodeAt(0)

Or is there anything else I need to watch out for when handling a "possibly negative, possibly signed char" in JS?

Thanks.

score 1 · Accepted Answer

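Depending on the compiler, char can be signed or unsigned. Presumably the author wanted the code to work the same way in both cases, and made sure the value is always zero-extended rather than sign-extended when it is converted from char to unsigned int. It is a safe way to guarantee that the value is in 0..255 rather than -128..127.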
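To illustrate what the double cast does, here is a minimal JavaScript sketch. The helper name scToUi is hypothetical (it is not part of the lexer); it assumes the value arrives as a number that may be in either the signed or the unsigned 8-bit range.

```javascript
// Hypothetical helper mirroring YY_SC_TO_UI: reinterpret a value as an
// unsigned 8-bit integer, whether it arrives as -128..127 or 0..255.
function scToUi(c) {
  return c & 0xff;
}

// A signed char holding -1 has the bit pattern 0xFF, which is 255 when
// read as unsigned. Non-negative values pass through unchanged.
const signedBytes = new Int8Array([-128, -1, 0, 65, 127]);
const asUnsigned = Array.from(signedBytes, scToUi); // [128, 255, 0, 65, 127]

// A Uint8Array view over the same buffer performs the same reinterpretation.
const view = new Uint8Array(signedBytes.buffer);
```

In JavaScript the issue only shows up if the bytes come from a signed source such as an Int8Array; values read from a Uint8Array or from charCodeAt are never negative.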

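According to MDN, the return value of charCodeAt has a larger range:

The charCodeAt() method returns an integer between 0 and 65535...

How you want to handle possible out-of-range values depends on your input, but one alternative might be a simple bitmask: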

"€".charCodeAt(0) & 0xff;

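Note that the mask folds any code unit above 255 back into the table, so "€" (0x20AC) lands on index 0xAC. As a sketch of the two options, assuming a hypothetical 256-entry table yyEc standing in for yy_ec and hypothetical helper names, one could either mask each code unit or convert the input to UTF-8 bytes first:

```javascript
// Hypothetical 256-entry equivalence-class table standing in for yy_ec.
const yyEc = new Uint8Array(256);
yyEc[0xac] = 7; // placeholder value, for illustration only

// Option 1: mask the UTF-16 code unit. Anything above 255 wraps around.
function classByMask(text, pos) {
  return yyEc[text.charCodeAt(pos) & 0xff];
}

// Option 2: convert the input to UTF-8 bytes up front, so every index is
// already in 0..255, much like the char buffer the C lexer reads.
function toBytes(text) {
  return new TextEncoder().encode(text);
}

const masked = "€".charCodeAt(0) & 0xff; // 0x20AC & 0xff === 0xAC === 172
const bytes = toBytes("€");              // Uint8Array [0xE2, 0x82, 0xAC]
```

Which option fits depends on the input: masking is enough for ASCII or Latin-1 text, while the byte conversion keeps multi-byte characters distinct instead of aliasing them onto unrelated table entries.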