I'm trying to grab everything between the div tags of posts on our forum so I can process it in a program. The fetched page looks like this:

<div id="post_message_1234567">

        <a href="http://blahblah.com" target="_blank"><img src="http://blahblah.com/iuhiuhuh.gif" border="0" alt="" /></a> <br />
<br />
jofjhoeifjoiwefjoweifj<br />
 blahblahblahpokpoekpfowef<br />
<br />
khfiudhfisduhfiusdfh<br />
<br />
<a href="http://blah.com/img.php?image=trepazoid.jpg" target="_blank"><img src="http://blah.com/loc367/euhfwieufhwifuhiwefuh.jpg" border="0" alt="" /></a><br />
<br />
one<br />
 two*three<br />
 87879879 nuts<br />
 11 bananas<br />
<br />
<a href="hjoiwjhfoweif.dat" target="_blank">Monkeys</a>
        </div>

I tried this regular expression, but it doesn't match:

string find = "\\b<div id=\"post_message_\\d+\">\\n*.*</div>\\b";

Can you help me get everything between <div id="post_message_1234567"> and </div>?

1 Answer

How about this:

@"<div id=""post_message_\d+"">(?<Content>(\r|\n|.)*)</div>"

Example:

string searchString = @"<div id=""post_message_1234567"">

        <a href=""http://blahblah.com"" target=""_blank""><img src=""http://blahblah.com/iuhiuhuh.gif"" border=""0"" alt="""" /></a> <br />
<br />
jofjhoeifjoiwefjoweifj<br />
 blahblahblahpokpoekpfowef<br />
<br />
khfiudhfisduhfiusdfh<br />
<br />
<a href=""http://blah.com/img.php?image=trepazoid.jpg"" target=""_blank""><img src=""http://blah.com/loc367/euhfwieufhwifuhiwefuh.jpg"" border=""0"" alt="""" /></a><br />
<br />
one<br />
 two*three<br />
 87879879 nuts<br />
 11 bananas<br />
<br />
<a href=""hjoiwjhfoweif.dat"" target=""_blank"">Monkeys</a>
        </div>";
Regex regex = new Regex(@"<div id=""post_message_\d+"">(?<Content>(\r|\n|.)*)</div>");
Match match = regex.Match(searchString);
bool success = match.Success; // True
string content = match.Groups["Content"].Value;

content now contains everything between the opening <div id="post_message_..."> tag and the closing </div>.
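
One caveat: the group (\r|\n|.)* is greedy, so on a page with several posts it would run on to the last </div> in the input rather than the one that closes the first post. The sketch below is one possible variation, not part of the original answer: it uses RegexOptions.Singleline so that . also matches newlines, and a lazy .*? so that each match stops at the first </div>. The two-post page string, the PostExtractor class, and the extra Id group are made up for illustration. It assumes posts contain no nested div elements; if they do, the lazy match would cut the post short and an HTML parser would likely be a safer choice.

```csharp
using System;
using System.Text.RegularExpressions;

class PostExtractor
{
    static void Main()
    {
        // Hypothetical page fragment with two posts (illustration only).
        string page =
            "<div id=\"post_message_1\">\nfirst<br />\npost\n</div>\n" +
            "<div id=\"post_message_2\">\nsecond post\n</div>";

        // Singleline: '.' also matches '\n'.
        // '.*?' is lazy, so each match ends at the first "</div>" it meets.
        Regex regex = new Regex(
            @"<div id=""post_message_(?<Id>\d+)"">(?<Content>.*?)</div>",
            RegexOptions.Singleline);

        // Matches returns every non-overlapping match, one per post here.
        foreach (Match match in regex.Matches(page))
        {
            string id = match.Groups["Id"].Value;
            string content = match.Groups["Content"].Value.Trim();
            Console.WriteLine("Post {0}: {1}", id, content);
        }
    }
}
```

With this input the loop should run once per post, each time with only that post's content in the Content group.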

answered 2013-11-08T19:28:33.937