c++ - How to write a regular expression to match ((11 3) (96 15) (2 3) )

Question

I am trying to write a regular expression that matches:

((11 3) (96 15) (2 3) )

So far, I have:

([^(|^)| |[A-Za-z])+

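But it only captures 11 and not the rest. Also, the real string is much longer; I have only shown a small part of it, so it repeats in the same format with different numbers. So far, this is at least part of what I have for the program: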

regex expression("([^(|^)| |[A-Za-z])+");
string line2 = "((11 3) (96 15) (2 3) )";
if(regex_match(line2, expression))
    cout << "yes";
else
    cout << "no";

score 6 · Accepted Answer

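Your example string contains digits, but your regular expression uses letters. Is that intentional? I think I would use a regular expression like this: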

\((\([0-9]+ [0-9]+\) )+\)

If we break it down, here is my thought process:

\(     // start with a literal "("
(      // open a group
\(     // another literal "("
[0-9]+ // one or more digits
       // a space (hard to show here)
[0-9]+ // one or more digits
\)     // a literal ")" to match the opening "("
       // a space (hard to show here)
)      // close the group
+      // group must repeat one or more times
\)     // final closing ")"

Edit: OK, since you say the second number is sometimes not a number, we can easily adjust the regular expression to look like this:

\((\([0-9]+ [A-Za-z0-9]+\) )+\)

If you need to avoid mixing letters and digits, you can do this:

\((\([0-9]+ ([A-Za-z]+|[0-9]+)\) )+\)
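
As a minimal sketch of how the first, digits-only expression could be used from C++, assuming std::regex with its default ECMAScript grammar (the function name is_pair_list is hypothetical and not part of the question's program):

```cpp
#include <regex>
#include <string>

// Returns true when the whole line has the form "((n n) (n n) ... )":
// an outer pair of parentheses around one or more "(digits digits) " items.
// The name is_pair_list is hypothetical, chosen for this sketch.
bool is_pair_list(const std::string& line) {
    // Raw string literal, so the backslashes reach the regex engine unchanged.
    static const std::regex pattern(R"(\((\([0-9]+ [0-9]+\) )+\))");
    // regex_match succeeds only if the pattern covers the entire string.
    return std::regex_match(line, pattern);
}
```

Calling is_pair_list("((11 3) (96 15) (2 3) )") should return true. Note that this pattern expects exactly one space after every inner ")", so the same string without the space before the final ")" is rejected.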

score 2 · Accepted Answer

Let's build your expression from scratch.

Keeping in mind that your final goal is to match ((11 3) (96 15) (2 3) ), we will start by matching a much simpler pattern and advance one step at a time:

\d        matches "1"
\d+       matches "11", or "3", or "96"
\d+ *\d+  matches "11 3" or "96 15"
\(\d+ *\d+\)           matches "(11 3)" or "(96 15)"
(\(\d+ *\d+\) *)*      matches "(11 3)(96 15) (2 3)"
\((\(\d+ *\d+\) *)*\)  matches "((11 3) (96 15) (2 3) )"

Note: I have not tested this answer. I relied on the Boost.Regex documentation to develop it.
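
The step-by-step construction can be mirrored in code by assembling the pattern from named pieces. This is only a sketch using std::regex rather than Boost.Regex; build_pattern, matches_pair_list and the local variable names are hypothetical:

```cpp
#include <regex>
#include <string>

// Builds the final pattern from the same pieces as the steps above.
// All names in this sketch are hypothetical.
std::string build_pattern() {
    const std::string number = R"(\d+)";                  // "11"
    const std::string pair   = number + " *" + number;    // "11 3"
    const std::string item   = R"(\()" + pair + R"(\))";  // "(11 3)"
    const std::string items  = "(" + item + " *)*";       // "(11 3) (96 15) "
    return R"(\()" + items + R"(\))";                     // "((11 3) (96 15) )"
}

// True when the entire line matches the assembled pattern.
bool matches_pair_list(const std::string& line) {
    static const std::regex pattern(build_pattern());
    return std::regex_match(line, pattern);
}
```

Because every space is written as " *" and the group is repeated with "*", this pattern is more permissive than the sample input suggests: it also accepts "((11 3)(96 15))" and even "()". If that is not wanted, " +" and "+" would be the stricter choices.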

score 0 · Accepted Answer

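I recently ran into the same kind of problem when trying to match something like 1-4,5,9,20-25. Admittedly, the resulting regular expression is not simple: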

/\G([0-9]++)(?:-([0-9]++))?+(?:,(?=[-0-9,]+$))?+/

This expression lets me collect all the matches in the string incrementally.

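We can apply the same approach to your problem, but validating and matching your given input in a single pass is very hard. (I don't know how to do it. If someone else does, I'd like to see it!) You can, however, validate the input separately: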

/\(\s*(\s*(\(\s*\d+\s+\d+\s*)\)\s*)+\s*\)/

See Evan's answer for how it works. \d is equivalent to [0-9], and \s is roughly equivalent to [\r\n\t ].

Here is the incremental match that extracts the numbers:

/\G\(?\s*(?:\(\s*(\d+)\s+(\d+)\s*\))(?:(?=\s*\(\s*\d+\s+\d+\s*\))|\s*\))/

It breaks down like this:

/\G     # matches incrementally. \G marks the beginning of the string or the beginning of the next match.
 \(?\s* # matches first open paren; safely ignores it and following whitespace if this is not the first match.
 (?:    # begins a new grouping - it does not save matches.
   \(\s* # first subgroup open paren and optional whitespace.
   (\d+) # matches first number in the pair and stores it in a match variable.
   \s+   # separating whitespace
   (\d+) # matches second number in the pair and stores it in a match variable.
   \s*\) # ending whitespace and close paren
 )      # ends subgroup
 (?:    # new subgroup
   (?=  # positive lookahead - this is optional and checks that subsequent matches will work.
     \s*\(\s*\d+\s+\d+\s*\)  # look familiar?
   )    # end positive lookahead
   |    # if positive lookahead fails, try another match
   \s*\)  # optional whitespace, then the final close paren
 )/     # ... and end subgroup.

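I have not tested this, but I believe it will work. Each time you apply the expression to the given string, it extracts the next pair of numbers until it sees the final closing parenthesis, at which point it has consumed the whole string, or it stops if there is an input error. You may need to adapt it for Boost.Regex. This is a Perl-compatible regular expression.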
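
With std::regex and its default ECMAScript grammar there is no \G anchor and there are no possessive quantifiers, so the expression above cannot be used there unchanged. A rough substitute, sketched here under that assumption, is to iterate over all pair matches with std::sregex_iterator; the function name extract_pairs is hypothetical:

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Collects every "(a b)" pair found in the line, in order of appearance.
// The name extract_pairs is hypothetical, chosen for this sketch.
std::vector<std::pair<int, int>> extract_pairs(const std::string& line) {
    // Same inner-pair pattern as in the breakdown, with two capture groups.
    static const std::regex pair_pattern(R"(\(\s*(\d+)\s+(\d+)\s*\))");
    std::vector<std::pair<int, int>> pairs;
    for (std::sregex_iterator it(line.begin(), line.end(), pair_pattern), end;
         it != end; ++it) {
        // Group 1 is the first number, group 2 the second.
        // std::stoi throws std::out_of_range if a number does not fit in int.
        pairs.emplace_back(std::stoi((*it)[1].str()), std::stoi((*it)[2].str()));
    }
    return pairs;
}
```

Unlike the \G version, this iteration skips over anything between matches instead of stopping at the first malformed part, so the separate validation step is still needed if malformed input must be rejected.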
