javascript - Strange behavior of capture groups with String#match()

Question

Problem: I have a string, for instance "to see to be to read", and I want to capture the 3 verbs without the "to" prefix, in this case: be, see and read.

Regex: /to (\w+)/g
Result: ['be', 'see', 'read']

Just out of curiosity, I wrote another regex using a positive lookahead, and the result is the same.

Regex: /(?=to \w+)\w+ (\w+)/g
Result: ['be', 'see', 'read']

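OK. Here is the strange part: when I run these regexes in the browser console (Chrome or Firefox), the result is different. Both attempts below give me the same result: all three matches include the "to" prefix.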

> 'to be to see to read'.match(/to (\w+)/g)
  ["to be", "to see", "to read"]

> 'to be to see to read'.match(/(?=to \w+)\w+ (\w+)/g)
  ["to be", "to see", "to read"]

Am I missing something here, or did I stumble on a bug?

Disclaimer: this is not homework, I'm just validating part of a bigger problem. I'm not a regex expert, but I know a thing or two.

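Edit: I think I was fooled by Regex101. The code sample it gave me used the String#match() method, but with the g flag this function returns only the full matches and does not expose the capture groups in the result. Looping over RegExp#exec() matches is the way to go!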
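
For what it's worth, the difference can be reproduced with a minimal sketch (the variable names are made up for illustration): with the g flag, String#match() returns only the full matches, while without it the capture group is available, but only for the first match.

```javascript
// Illustrative names only; the input is the string from the console session above.
const text = 'to be to see to read';

// With the g flag: an array of full matches, capture groups are dropped.
const withGlobal = text.match(/to (\w+)/g);
console.log(withGlobal); // ["to be", "to see", "to read"]

// Without the g flag: only the first match, but index 1 holds the capture group.
const withoutGlobal = text.match(/to (\w+)/);
console.log(withoutGlobal[0]); // "to be"
console.log(withoutGlobal[1]); // "be"
```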

score 1 · Accepted Answer

The correct way to collect capture groups in JavaScript is to call the RegExp#exec method in a while loop:

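var re = /to (\w+)/g,
    matches = [],
    input = "to see to be to read",
    match; // declared here so the loop does not create an implicit global
// with the g flag, each exec() call resumes at re.lastIndex and returns null once no match is left
while ((match = re.exec(input)) !== null)
   matches.push(match[1]); // match[1] is the first capture group

console.log(matches);
//=> ["see", "be", "read"]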
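
As an alternative sketch, assuming an ES2020-capable runtime (that is an assumption, not something the loop above depends on), String#matchAll() iterates over every match together with its capture groups, so no explicit loop is needed. The name `verbs` is illustrative.

```javascript
const input = "to see to be to read";
// matchAll() requires the g flag; each item is a match array like the one exec() returns
const verbs = Array.from(input.matchAll(/to (\w+)/g), (m) => m[1]);

console.log(verbs);
//=> ["see", "be", "read"]
```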
