c# - Using Multiline and groups in a regular expression

Question

Hi guys, just a quick question about using Multiline in a regular expression:

The regex:

 string content = Regex.Match(onix.Substring(startIndex,endIndex - startIndex), @">(.+)<", RegexOptions.Multiline).Groups[1].Value;

This is the text string I am reading:

    <Title>
         <TitleType>01</TitleType>
         <TitleText textcase="02">18th Century Embroidery Techniques</TitleText>
    </Title>

This is what I get:

What I want is everything between

 <Title> and </Title>.

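This works fine when everything is on one line, but because the content starts on another line, the match seems to skip it or leave it out of the pattern.

Any help is much appreciated.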

score 4 · Accepted Answer

You also have to use the Singleline option along with Multiline:

string content = Regex.Match(onix.Substring(startIndex,endIndex - startIndex), @">(.+)<", RegexOptions.Multiline | RegexOptions.Singleline).Groups[1].Value;

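But do yourself a favor and stop parsing XML with regular expressions! Use an XML parser instead!

You can parse the XML text with the XmlDocument class and use XPath selectors to get the elements you are interested in: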

XmlDocument doc = new XmlDocument();
doc.LoadXml(...);                              // load your XML text here

XmlNode root = doc.SelectSingleNode("Title");  // this selects the <Title>..</Title> element
                                               // modify the selector depending on your outer XML 
Console.WriteLine(root.InnerXml);              // displays the contents of the selected node
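
To make the snippet above concrete, here is a minimal, self-contained sketch that feeds the sample XML from the question into XmlDocument. The class name TitleReader, the method ReadTitleText, and the XPath "/Title/TitleText" are illustrative assumptions; the XPath only works if <Title> is the root of the text you load, so adjust it to your real outer XML.

```csharp
using System;
using System.Xml;

public static class TitleReader
{
    // Hypothetical helper: returns the text of <TitleText> inside a root <Title>,
    // or null when no such element exists.
    public static string ReadTitleText(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // Assumes <Title> is the document root; change the XPath otherwise.
        XmlNode node = doc.SelectSingleNode("/Title/TitleText");
        return node?.InnerText;
    }

    public static void Main()
    {
        // Sample fragment copied from the question.
        string xml = @"<Title>
     <TitleType>01</TitleType>
     <TitleText textcase=""02"">18th Century Embroidery Techniques</TitleText>
</Title>";

        Console.WriteLine(ReadTitleText(xml));
    }
}
```

Unlike the regex, this does not care whether the elements sit on one line or several.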

score 2 · Accepted Answer

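RegexOptions.Multiline only changes the meaning of ^ and $ so that they match at the start/end of each line instead of the start/end of the whole string.

You want to use RegexOptions.Singleline instead, which makes . match newline characters (as well as everything else).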
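
The difference between the two options can be seen with a small sketch. The class OptionsDemo and the helper Capture are made up for illustration, and the input is a shortened version of the question's text with explicit \n line breaks.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionsDemo
{
    // Hypothetical helper: applies the question's pattern and returns group 1.
    public static string Capture(string input, RegexOptions options)
    {
        return Regex.Match(input, @">(.+)<", options).Groups[1].Value;
    }

    public static void Main()
    {
        string input = "<Title>\n  <TitleType>01</TitleType>\n</Title>";

        // Multiline: '.' still stops at '\n', so the match cannot start after
        // <Title> and instead lands inside the second line.
        Console.WriteLine("[" + Capture(input, RegexOptions.Multiline) + "]");

        // Singleline: '.' also matches '\n', so the greedy group runs from the
        // first '>' to the last '<' and spans all lines.
        Console.WriteLine("[" + Capture(input, RegexOptions.Singleline) + "]");
    }
}
```

Since the pattern contains neither ^ nor $, Multiline has no effect on it; Singleline alone is enough here.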

score 0 · Accepted Answer

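It looks like you are trying to parse what is probably XML. If so, using an XML parser is the preferred way of working, rather than parsing it with a regular expression. Disregard this if it does not apply.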
