c# - Extracting a string with a regular expression in C#

Question

I have an HTML string that I am parsing, shown below. I need to get the value of @Footer.

string strHTML = "<html><html>\r\n\r\n<head>\r\n<meta http-equiv=Content-Type " +
                 "content=\"text/html; charset=windows-1252\">\r\n" +
                 "<meta name=Generator content=\"Microsoft Word 14></head></head><body> " +
                 "<p>@Footer=CONFIDENTIAL<p></body></html>";

I have tried the code below. How do I get the value?

Regex m = new Regex("@Footer", RegexOptions.Compiled);
foreach (Match VariableMatch in m.Matches(strHTML.ToString()))
{
     Console.WriteLine(VariableMatch);
}

score 2 · Accepted Answer

You need to capture the text after the =. This works as long as the value cannot contain any < characters:

Regex m = new Regex("@Footer=([^<]+)", RegexOptions.Compiled);
foreach (Match VariableMatch in m.Matches(strHTML.ToString()))
{
    Console.WriteLine(VariableMatch.Groups[1].Value);
}
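
For a single value, the same capture-group idea can be wrapped in a small helper that checks Match.Success, so a missing @Footer gives null instead of an empty string. This is only a minimal sketch: the class and method names (FooterRegex, Extract) are made up for illustration and are not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FooterRegex
{
    // Hypothetical helper: returns the text after "@Footer=" up to the next '<',
    // or null when the marker is not present.
    public static string Extract(string html)
    {
        // Group 1 captures one or more characters that are not '<'.
        Match m = Regex.Match(html, "@Footer=([^<]+)");
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Calling FooterRegex.Extract(strHTML) on the string from the question should give CONFIDENTIAL.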

score 2 · Accepted Answer

You can do this with a regular expression, but it isn't necessary. A simple approach is:

var match = strHTML.Split(new string[] { "@Footer=" }, StringSplitOptions.None).Last();
match = match.Substring(0, match.IndexOf("<"));

This assumes your HTML string contains only one @Footer.
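
Note that the Substring call above throws if no < follows the value, because IndexOf returns -1. A slightly more defensive variant of the same no-regex idea could use IndexOf for both ends. This is a sketch under stated assumptions: FooterText and Extract are illustrative names, and it returns the first @Footer it finds rather than the last.

```csharp
using System;

public static class FooterText
{
    // Hypothetical helper: plain string search, no regex involved.
    public static string Extract(string html)
    {
        const string marker = "@Footer=";
        int start = html.IndexOf(marker, StringComparison.Ordinal);
        if (start < 0) return null;          // marker not present
        start += marker.Length;              // skip past "@Footer="
        int end = html.IndexOf('<', start);
        if (end < 0) end = html.Length;      // no tag follows, take the rest
        return html.Substring(start, end - start);
    }
}
```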

score 1 · Accepted Answer

Your regex matches only the literal string "@Footer", so the matched value will be just "@Footer".

Your regex should look like this:

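// Verbatim string (@"...") so that \w is passed through to the regex engine.
Regex regex = new Regex(@"@Footer=[\w]+");
Match match = regex.Match(strHTML);
string value = match.Value.Split('=')[1];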

score 1 · Accepted Answer

Use a capture group.

string value = Regex.Match(strHTML, @"@Footer=(?<VAL>[^<\n\r]+)").Groups["VAL"].Value;
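
If the HTML might contain several @Footer entries, Regex.Matches with the same named group can collect them all. This is a rough sketch, not something from the answer: FooterValues and ExtractAll are hypothetical names chosen for the example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FooterValues
{
    // Hypothetical helper: collects every @Footer value in document order.
    public static List<string> ExtractAll(string html)
    {
        var values = new List<string>();
        // The named group VAL stops at '<' or at a line break.
        foreach (Match m in Regex.Matches(html, @"@Footer=(?<VAL>[^<\r\n]+)"))
        {
            values.Add(m.Groups["VAL"].Value);
        }
        return values;
    }
}
```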

answered 2013-08-26T17:02:51.250

score 0 · Accepted Answer

If that is your entire string, we can solve it with string methods, without touching regex:

var result = strHTML.Split(new string[] { "@Footer=", "<p>" }, StringSplitOptions.RemoveEmptyEntries)[1];
