c++ - Regex for matching subexpressions

Question

When I use the regular expression

std::regex midiNoteNameRegex("([cdefgab])([b#]{0,1})([0-9])|([0-9]{3})|([A-Z0-9]{2})");

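there are three top-level subexpressions joined by "|", any one of which can match. Is there a way to tell which one matched, other than testing them one after another?

It would be easy if I could use named subexpressions, but C++ has no named subexpressions.

How can I solve this?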

score 2 · Accepted Answer

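Given the groups in your regex, this is just a flat lookup on the match object.
In C++ that is a flag check (the matched member of each sub-match), with no noticeable overhead.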

    ( [cdefgab] )                 # (1)
    ( [b#]{0,1} )                 # (2)
    ( [0-9] )                     # (3)
 |  ( [0-9]{3} )                  # (4)
 |  ( [A-Z0-9]{2} )               # (5)

And a possible usage:

// Assumes #include <regex>, #include <string>, using namespace std,
// and that str is a std::wstring holding the input.
wregex MyRx( L"([cdefgab])([b#]{0,1})([0-9])|([0-9]{3})|([A-Z0-9]{2})" );

wstring::const_iterator start = str.begin();
wstring::const_iterator end   = str.end();
wsmatch m;

while ( regex_search( start, end, m, MyRx ) )
{
    if ( m[1].matched )
    {
        // First alternation (groups 1-3)
    }
    else if ( m[4].matched )
    {
        // Second alternation
    }
    else if ( m[5].matched )
    {
        // Third alternation
    }
    start = m[0].second;   // continue searching after this match
}
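
For anyone who wants something that compiles on its own, here is a minimal sketch of the same idea using narrow strings and regex_match instead of a search loop. The helper name matchedAlternative is hypothetical (it is not part of the answer above), and it assumes the whole input should match one alternative.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the answer): returns 1, 2 or 3 for the
// top-level alternative that matched the whole of `text`, or 0 for no match.
int matchedAlternative(const std::string& text)
{
    // Groups 1-3 belong to the first alternative, group 4 to the second,
    // group 5 to the third.
    static const std::regex rx(
        "([cdefgab])([b#]{0,1})([0-9])|([0-9]{3})|([A-Z0-9]{2})");

    std::smatch m;
    if (!std::regex_match(text, m, rx))
        return 0;

    // Only the groups of the alternative that took part have matched == true.
    if (m[1].matched) return 1;   // e.g. "c#4"
    if (m[4].matched) return 2;   // e.g. "060"
    if (m[5].matched) return 3;   // e.g. "7F"
    return 0;
}
```

Checking the first group of each alternative is enough, because a group that is not optional must take part whenever its alternative matches.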

score 0 · Accepted Answer

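I don't have a definitive answer, but I believe the answer is most likely no.

Named capture groups are not a required feature: http://www.cplusplus.com/reference/regex/ECMAScript/

Implementing named capture groups is probably not trivial, and it could reduce the performance of the regex engine.

I found another post on this question that agrees with me: C++ regex: Which group matched?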
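
Since std::regex has no named groups, one way to get similar readability is to give the group indices names yourself. The following is only a sketch under that assumption: the enum Group, its enumerator names, and the helper capture are all hypothetical and chosen for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical names for the capture-group indices of the question's pattern.
enum Group { NoteLetter = 1, Accidental = 2, Digit = 3, ThreeDigits = 4, TwoChars = 5 };

// Returns the text captured by `group`, or an empty string when that group
// did not take part in the match (or when nothing matched at all).
std::string capture(const std::string& text, Group group)
{
    static const std::regex rx(
        "([cdefgab])([b#]{0,1})([0-9])|([0-9]{3})|([A-Z0-9]{2})");

    std::smatch m;
    if (!std::regex_match(text, m, rx) || !m[group].matched)
        return "";
    return m[group].str();
}
```

The names have to be kept in sync with the pattern by hand, which is the price for not having real named groups.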
