regex - Regular expression to match a string with escaped characters

Question

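I want to write a regular expression that matches the string-literal specification below. I have spent the past 10 hours frantically writing all sorts of regular expressions, none of which seem to work. In the end I boiled it down to this: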

([^"]|(\\[.\n]))*\"

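Basically, the requirements are:

It must match a string literal, so I match everything up to the final ", possibly with a \" in between, which must not end the string.
Anything can be escaped with a '\', including \n.
Only an unescaped '"' character can end the match, nothing else.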

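Some sample strings that I need to match correctly:

\a\b\"\n" => should match the characters '\', 'a', '\', 'b', '\', '"', '\', 'n', '"'
\"this is still inside the string" => should match the whole text, including the final '"'
'm about to escape to a newline \'\n'" => there is a \n character in this string, but the match should still cover everything from the starting 'm' to the ending '"'.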

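Could someone please help me formulate such a regular expression? As far as I can tell, the regex I provided should do the job, but it fails for no apparent reason.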

score 2 · Accepted Answer

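Your regex is almost right. You just need to be aware that inside a character class a dot . is only a literal ., not "any character except newline". The first alternative must also exclude the backslash, so that a backslash can only be consumed together with the character it escapes. So: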

([^"\\]|\\(.|\n))*\"

Or:

([^"\\]|\\[\s\S])*\"
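As a rough check of the point above, here is a minimal Python sketch that runs the question's regex and the corrected one against the sample strings. The names ORIGINAL, FIXED and samples are made up for this illustration, and the third sample (a backslash followed by a real newline) is an assumed test case, not one taken from the question.

```python
import re

# The regex from the question: [^"] also consumes a lone backslash, and
# [.\n] inside a class means "a literal dot or a newline".
ORIGINAL = re.compile(r'([^"]|(\\[.\n]))*\"')

# The corrected regex from this answer: a backslash is only ever consumed
# together with the character that follows it.
FIXED = re.compile(r'([^"\\]|\\(.|\n))*\"')

samples = [
    r'\a\b\"\n"',
    r'\"this is still inside the string"',
    'one \\\ntwo"',  # assumed sample: backslash followed by a real newline
]

for s in samples:
    m = FIXED.match(s)
    # True when the corrected regex consumes the whole sample
    print(m is not None and m.group(0) == s)

# The original regex stops at the escaped quote in the first sample.
print(repr(ORIGINAL.match(samples[0]).group(0)))
```

With the original regex the first sample is cut off at the escaped quote, which is the failure described in the question.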

score 1 · Accepted Answer


I think this would be more efficient:

[^"\\]*(\\.[^"\\]*)*\"
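A small Python sketch comparing this unrolled form with the alternation form, under one assumption worth flagging: in Python, . does not match a newline by default, so re.DOTALL is added here to cover an escaped newline. The names UNROLLED, UNROLLED_NO_DOTALL, ALTERNATION and samples are illustrative only.

```python
import re

# Unrolled pattern from this answer. re.DOTALL is an addition made for this
# sketch so that "." also matches a newline that follows a backslash.
UNROLLED = re.compile(r'[^"\\]*(\\.[^"\\]*)*\"', re.DOTALL)
UNROLLED_NO_DOTALL = re.compile(r'[^"\\]*(\\.[^"\\]*)*\"')

# Alternation-based pattern, for comparison.
ALTERNATION = re.compile(r'([^"\\]|\\[\s\S])*\"')

samples = [
    r'\a\b\"\n"',
    r'\"this is still inside the string"',
    'one \\\ntwo"',  # assumed sample: backslash followed by a real newline
]

for s in samples:
    a = UNROLLED.match(s)
    b = ALTERNATION.match(s)
    # Both patterns should consume the whole sample
    print(a.group(0) == b.group(0) == s)

# Without DOTALL, the escaped newline prevents a match.
print(UNROLLED_NO_DOTALL.match(samples[2]))
```

Whether the unrolled form is actually faster depends on the regex engine and the input, so it is worth measuring on real data before relying on it.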

Answered 2012-05-15T19:30:00.260

score 0 · Accepted Answer

I assume your strings also start with a " (shouldn't your examples start with one?).

A lookaround construct seems like the most natural thing to use here:

".*?"(?<!\\")

Given the input

"test" test2 "test \a test"  "test \"test" "test\""

this will match:

"test"
"test \a test"
"test \"test"
"test\""

The regex breaks down as follows:

Match the character “"” literally «"»
Match any single character that is not a line break character «.*?»
   Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “"” literally «"»
Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind) «(?<!\\")»
   Match the character “\” literally «\\»
   Match the character “"” literally «"»
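
The matches listed in this answer can be reproduced with a short Python sketch. The names LOOKBEHIND, QUOTED, text and tricky are made up for this illustration. The last part is an assumed edge case that does not appear in the thread: a string that ends in an escaped backslash, where the lookbehind rejects the real closing quote.

```python
import re

# Pattern from this answer: a lazy match between quotes, rejected when the
# closing quote is immediately preceded by a backslash.
LOOKBEHIND = re.compile(r'".*?"(?<!\\")')

# Alternation-based pattern with an opening quote added, for comparison.
QUOTED = re.compile(r'"([^"\\]|\\[\s\S])*"')

text = r'"test" test2 "test \a test"  "test \"test" "test\""'
matches = LOOKBEHIND.findall(text)
for m in matches:
    print(m)

# Assumed edge case: the characters are  " a \ \ "  (an escaped backslash,
# then the real closing quote).
tricky = r'"a\\"'
print(LOOKBEHIND.findall(tricky))      # the lookbehind rejects this string
print(QUOTED.search(tricky).group(0))  # the alternation form accepts it
```

If escaped backslashes can occur in the input, the alternation-based pattern is probably the safer choice.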
