OK, I have a multiline string that I'm trying to clean up.
Each line may or may not be part of a larger block of quoted text. Example:
This line is not quoted.
This part of the line is not quoted “but this is.”
This one is not quoted either.
“This entire line is quoted”
Not quoted.
“This line is quoted
and so is this one
and so is this one.”
This is not quoted “but this is
and so is this.”
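I need a regex replacement that unwraps the hard-wrapped quoted lines, i.e. replaces each "\r\n" with a space, but only between an opening and a closing curly quote.
Here's how it should look after the replacement: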
This line is not quoted.
This part of the line is not quoted “but this is.”
This one is not quoted either.
“This entire line is quoted”
Not quoted.
“This line is quoted and so is this one and so is this one.”
This is not quoted “but this is and so is this.”
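(Notice that each of the last two lines was spread over multiple lines in the input text.)
Constraints
- Ideally this should be a single regex replace call
- Uses the .NET RegEx library
- The quotes are always opening/closing curly quotes, never the plain double quote ("), which should make this a bit easier.
Important constraint
This isn't straight .NET code. I'm populating a table of "searchfor/replacewith" strings that are then passed to RegEx.Replace. I can't add custom code such as match evaluators, loops over captured groups, etc.
The current attempt so far is something along the lines of: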
r.Replace("(?<=“)\r\n(?=”)", " ")
Obviously, I'm not even close yet.
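One direction that might fit the single-replace constraint, sketched here in Python purely for illustration: treat a "\r\n" as quoted when a closing ” appears ahead of it before any opening “. The pattern uses only a lookahead and a negated character class, both of which the .NET regex engine also supports. This assumes quotes are always balanced and never nested; the names `UNWRAP_PATTERN` and `unwrap_quoted` are hypothetical and exist only for this sketch.

```python
import re

# Candidate search string: a CRLF that is followed by a closing curly quote
# with no opening curly quote in between, i.e. a CRLF inside a quoted run.
# Assumption: quotes are balanced and never nested.
UNWRAP_PATTERN = r"\r\n(?=[^“]*”)"
REPLACEMENT = " "


def unwrap_quoted(text: str) -> str:
    """Replace CRLFs that fall between an opening and a closing curly quote."""
    return re.sub(UNWRAP_PATTERN, REPLACEMENT, text)


# The sample input from the question, joined with CRLF line endings.
sample = "\r\n".join([
    "This line is not quoted.",
    "This part of the line is not quoted “but this is.”",
    "This one is not quoted either.",
    "“This entire line is quoted”",
    "Not quoted.",
    "“This line is quoted",
    "and so is this one",
    "and so is this one.”",
    "This is not quoted “but this is",
    "and so is this.”",
])

# The desired output from the question.
expected = "\r\n".join([
    "This line is not quoted.",
    "This part of the line is not quoted “but this is.”",
    "This one is not quoted either.",
    "“This entire line is quoted”",
    "Not quoted.",
    "“This line is quoted and so is this one and so is this one.”",
    "This is not quoted “but this is and so is this.”",
])

print(unwrap_quoted(sample) == expected)
```

In a searchfor/replacewith table, the search string would be the pattern above and the replacement a single space. An unbalanced quote would break the assumption and could cause unrelated line breaks to be joined.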
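The same logic could be applied to color-coding block comments in program code: anything inside a block comment is treated differently from anything outside it. (Code is a bit trickier, because the opening/closing block-comment delimiters can also legitimately appear inside string literals, which I don't have to deal with here.)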
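To illustrate the block-comment analogy, here is a rough Python sketch that applies the same "closing delimiter ahead before any opening delimiter" test to `/* ... */` comments. It assumes comments are balanced and not nested, and it deliberately ignores delimiters inside string literals. The names `COMMENT_NEWLINE` and `join_comment_lines` are made up for this example.

```python
import re

# A newline counts as "inside a block comment" when a closing */ appears
# ahead of it before any opening /*. The tempered token (?:(?!/\*)[\s\S])*?
# walks forward one character at a time and refuses to step onto a "/*".
# Assumption: comments are balanced, not nested, and no delimiter appears
# inside a string literal.
COMMENT_NEWLINE = re.compile(r"\r?\n(?=(?:(?!/\*)[\s\S])*?\*/)")


def join_comment_lines(source: str) -> str:
    """Collapse line breaks that fall inside /* ... */ comments."""
    return COMMENT_NEWLINE.sub(" ", source)


code = "int a; /* first\nsecond */\nint b;\n/* one line */\nint c;"
print(join_comment_lines(code))
```

Only the line break between "first" and "second" is replaced; line breaks outside the comments are left alone.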