I came up with this solution. It isn't a one-liner, but here it goes: assuming you have the HTML text in a variable named foo, you can do the following:
// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
string replacement2 = "</span>";
string pattern = @"(?<=<span style="")[^""]+"; // Will match all the style strings
string pattern1 = @"(?<=<span style=)(.|\s)+""(?=>[^<>].+</span>)"; // Will match from the first " to the last " before HELLO
string pattern2 = @"</span>(\s*</span>)+"; // Will match a run of two or more </span> tags

// Collect every style value
Regex rgx = new Regex(pattern);
MatchCollection matches = rgx.Matches(foo);
List<string> styles = new List<string>();
foreach (Match match in matches)
    styles.Add(match.Value);
string replacement1 = "\"" + string.Join(";", styles) + "\""; // Builds the new styles string, wrapped in quotes

rgx = new Regex(pattern1);
string result = rgx.Replace(foo, replacement1); // Replace the multiple span style tags with a single one

rgx = new Regex(pattern2);
result = rgx.Replace(result, replacement2); // Replace the multiple closing span tags with a single one (applied to result, not foo)
After the first replacement, you should get:
<p>
<strong>
<span style="font-family:arial,sans-serif;color:black;font-size:medium">HELLO</span>
</span>
</span>
</strong>
</p>
After the second replacement:
<p>
<strong>
<span style="font-family:arial,sans-serif;color:black;font-size:medium">HELLO</span>
</strong>
</p>
I couldn't test it (it may contain a few typos), but it should work!
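If you want to try it, the steps can be wrapped in a small console program. This is only a sketch under a few assumptions: the sample markup assigned to foo is my guess at what the input looks like (three nested spans around HELLO), and the names SpanMerger and MergeNestedSpans are made up for the example. It also only handles a single group of nested spans, since the second pattern greedily spans from the first style attribute to the last one.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SpanMerger
{
    // Hypothetical helper: merges nested <span style="..."> wrappers into one span.
    public static string MergeNestedSpans(string html)
    {
        // Step 1: collect every style value that directly follows <span style=".
        var styles = new List<string>();
        foreach (Match match in Regex.Matches(html, @"(?<=<span style="")[^""]+"))
            styles.Add(match.Value);
        string mergedStyle = "\"" + string.Join(";", styles) + "\"";

        // Step 2: replace everything from the first opening quote to the last
        // closing quote before the text with the merged style string.
        string result = Regex.Replace(
            html, @"(?<=<span style=)(.|\s)+""(?=>[^<>].+</span>)", mergedStyle);

        // Step 3: collapse a run of closing tags into a single </span>.
        return Regex.Replace(result, @"</span>(\s*</span>)+", "</span>");
    }

    public static void Main()
    {
        // Assumed sample input; the real content of foo may differ.
        string foo = string.Join("\n", new[]
        {
            "<p>",
            "<strong>",
            "<span style=\"font-family:arial,sans-serif\">",
            "<span style=\"color:black\">",
            "<span style=\"font-size:medium\">HELLO</span>",
            "</span>",
            "</span>",
            "</strong>",
            "</p>"
        });

        Console.WriteLine(MergeNestedSpans(foo));
    }
}
```

With that sample input, the printed markup should be the same as the result shown after the second replacement. Keep in mind that regular expressions are fragile for HTML in general, so for anything less regular than this an HTML parser is probably the safer choice.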