I'm trying to construct a regular expression that treats an escaped double quote (\") as a single character.

The following code compiles fine, but terminates on trying to initialise rgx, throwing the error Abort trap: 6 using libc++.

std::regex rgx("[[.\\\\\".]]");
std::smatch results;
std::string test_str("\\\"");
std::regex_search(test_str, results, rgx);

If I remove the [[. .]], it runs fine, results[0] returning \" as intended, but as said, I'd like for this sequence to be usable as a character class.

Edit: OK, I realise now that my previous understanding of collating sequences was incorrect, and the reason it doesn't work is that \\\\\" is not defined as a collating element. So my new question: is it possible to define custom collating elements?
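
For anyone hitting the same abort: it is most likely just an uncaught exception, since the std::regex constructor throws std::regex_error when it cannot compile a pattern. Below is a minimal sketch that catches it and reports the reason. The helper name regex_compile_error is hypothetical, and the exact error code is an assumption about the standard library in use (I would expect error_collate for an unknown name inside [. .]).

```cpp
#include <regex>
#include <string>

// Hypothetical helper: tries to compile `pattern` as a std::regex.
// Returns an empty string on success, otherwise a short description
// of why compilation failed.
std::string regex_compile_error(const std::string& pattern) {
    try {
        const std::regex rgx(pattern);
        (void)rgx;  // only the construction matters here
        return "";
    } catch (const std::regex_error& e) {
        // Assumption: an unknown collating element name such as
        // [[.\\".]] is reported with the error_collate code.
        if (e.code() == std::regex_constants::error_collate) {
            return "error_collate";
        }
        return std::string("regex_error: ") + e.what();
    }
}
```

Calling regex_compile_error("[[.\\\\\".]]") should return a non-empty description instead of terminating the program, while the same pattern without the [[. .]] wrapper should compile cleanly.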

1 Answer

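So I figured out where I was going wrong, and thought I'd leave this here in case anyone stumbles across it.

You can specify a passive (non-capturing) group with (?:sequence), which lets quantifiers be applied to the whole sequence just as they would be to a character class. Perhaps not exactly what I originally asked for, but at least in my case it serves the same purpose.

To match a string that starts and ends with a double quote (including those characters in the result), while allowing escaped quotes inside the string, I used the expression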

\"(?:[^\"\\\\]+|(?:\\\\\\\\)+|\\\\\")*\"

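This says to grab as many characters as possible, provided they are neither quotes nor backslashes; if that does not match, try first to match an even number of backslashes (so that the backslash itself can be escaped), or second, an escaped quote. This non-capturing group is matched as many times as possible, stopping only when it reaches an unescaped \".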
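
As a rough sketch of how this can be used with std::regex_search (the helper name first_quoted is hypothetical, not part of any library): the character class here is written with a single leading ^, since a ^ anywhere else inside the brackets is treated as a literal caret and would stop strings containing ^ from matching.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the first double-quoted substring of `text`,
// quotes included, or an empty string if there is none.
std::string first_quoted(const std::string& text) {
    // After C++ string-literal unescaping, the engine sees:
    //   "(?:[^"\\]+|(?:\\\\)+|\\")*"
    // i.e. a quote, then any mix of ordinary characters, pairs of
    // backslashes, or backslash-quote, then the closing quote.
    static const std::regex rgx("\"(?:[^\"\\\\]+|(?:\\\\\\\\)+|\\\\\")*\"");

    std::smatch results;
    if (std::regex_search(text, results, rgx)) {
        return results[0].str();
    }
    return std::string();
}
```

For example, given the input say "a \"b\" c" end, the helper should return "a \"b\" c" with the escaped quotes kept inside the match.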

I can't comment on its efficiency, but it does work.

Answered 2013-01-06T20:07:02.970