Please consider the following input string:

X=Y
Z=U
Q=P

Lorem Ipsum is simply dummy text of the printing and typesetting industry.
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s

I'm wondering if it's possible to capture the following with a regex one-liner:

left: X
right: Y
left: Z
right: U
left: Q
right: P

text: Lorem Ipsum is simply dummy text of the printing and typesetting
industry. Lorem Ipsum has been the industry's standard dummy text
ever since the 1500s

The idea is that there are a number of lines in a specific format, followed by an empty line ("\r\n") and then some free text. I want to capture each of the key-value pairs (in this example) and the text separately.

Capturing the structured data is easy enough (and just an example here):

(?:^(?<left>\S+)=(?<right>\S+)\n)

But I cannot figure out how to specify something like:

"Keep capturing this pattern until the first empty line, then take everything after it and capture it as 'text'."

It's easy enough to solve this problem in code, but I'm really interested in learning whether it's possible with nothing but a regex one-liner.

1 Answer

Yes. In .NET (and only there) you can repeat a capturing group and retrieve the capture from each repetition:

^               # anchor pattern to the beginning of the string
(?:             # non-capturing group for a single x=y line
  (?<left>\S+)  # match and capture left-hand side
  =
  (?<right>\S+) # match and capture right-hand side
  \n
)+              # repeat
\n              # match the empty line that ends the x=y block
(?<text>.*)     # match the remainder of the string
$               # anchor pattern to the end of the string (not really necessary)

Make sure to use RegexOptions.IgnorePatternWhitespace and RegexOptions.Singleline.

If your Match object is called m, you can now retrieve:

m.Groups["left"].Captures  // for a list of all left-hand sides
m.Groups["right"].Captures // for a list of all right-hand sides
m.Groups["text"].Value     // for the remainder of the string
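
Putting the pieces together, here is a minimal sketch of how the pattern and the capture collections might be used in a complete program. The class name KeyValueBlockParser and the Parse method are hypothetical names chosen for illustration. The sketch also swaps \n for \r?\n so that Windows line endings are accepted, which is an assumption on top of the pattern above rather than part of it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class KeyValueBlockParser
{
    // Same structure as the pattern above. Assumption: \r?\n replaces \n
    // so that both "\n" and "\r\n" line endings match.
    private const string Pattern = @"
        ^
        (?:
          (?<left>\S+)   # left-hand side
          =
          (?<right>\S+)  # right-hand side
          \r?\n
        )+               # one capture per repetition
        \r?\n            # the empty line
        (?<text>.*)      # everything that follows
        $";

    public static (List<KeyValuePair<string, string>> Pairs, string Text) Parse(string input)
    {
        Match m = Regex.Match(input, Pattern,
            RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);
        if (!m.Success)
            throw new FormatException("Input does not match the expected layout.");

        // Each repetition of the group adds one entry to both collections,
        // so the i-th left belongs to the i-th right.
        CaptureCollection lefts = m.Groups["left"].Captures;
        CaptureCollection rights = m.Groups["right"].Captures;

        var pairs = new List<KeyValuePair<string, string>>();
        for (int i = 0; i < lefts.Count; i++)
            pairs.Add(new KeyValuePair<string, string>(lefts[i].Value, rights[i].Value));

        return (pairs, m.Groups["text"].Value);
    }

    public static void Main()
    {
        string input = "X=Y\nZ=U\nQ=P\n\nLorem Ipsum is simply dummy text.\nSecond line.";
        var (pairs, text) = Parse(input);
        foreach (var pair in pairs)
            Console.WriteLine($"left: {pair.Key}, right: {pair.Value}");
        Console.WriteLine($"text: {text}");
    }
}
```

Pairing the captures by index relies on every repetition of the group recording exactly one left and one right value, which this pattern guarantees because both named groups are mandatory inside the repeated group.
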
answered 2013-08-15T09:39:01.883