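I'm having a bit of a brain freeze here, so I'm hoping for some pointers. Essentially, I need to extract the contents of a specific div tag. Yes, I know regex is usually frowned upon for parsing HTML, but this is a simple web-scraping application and there are no nested divs.
I'm trying to match this: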
<div class="entry">
<span class="title">Some company</span>
<span class="description">
<strong>Address: </strong>Some address
<br /><strong>Telephone: </strong> 01908 12345
</span>
</div>
The simple VB code is as follows:
Dim myMatches As MatchCollection
Dim myRegex As New Regex("<div.*?class=""entry"".*?>.*</div>", RegexOptions.Singleline)
Dim wc As New WebClient
Dim html As String = wc.DownloadString("http://somewebaddress.com")
RichTextBox1.Text = html
myMatches = myRegex.Matches(html)
MsgBox(html)
'Loop over every match and show its first capture group
Dim successfulMatch As Match
For Each successfulMatch In myMatches
MsgBox(successfulMatch.Groups(1).ToString)
Next
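One possible direction, as a minimal sketch rather than a confirmed fix: the pattern above contains no capturing group, so `Groups(1)` has nothing to return, and the greedy `.*` under `RegexOptions.Singleline` runs on to the last `</div>` in the page. The sketch below wraps the inner content in a lazy group `(.*?)` and assumes, as stated, that entry divs are never nested. The module name `DivExtractSketch`, the second sample entry, and the inlined HTML string are made up for illustration so the code runs without a network call.

```vb
Imports System
Imports System.Text.RegularExpressions

Module DivExtractSketch
    Sub Main()
        ' Hypothetical sample markup, shortened from the question and inlined
        ' so the sketch does not depend on WebClient or a live URL.
        Dim html As String =
            "<div class=""entry"">" & vbLf &
            "<span class=""title"">Some company</span>" & vbLf &
            "</div>" & vbLf &
            "<div class=""entry""><span class=""title"">Other company</span></div>"

        ' (.*?) is a lazy capturing group: it stops at the first </div>
        ' after each opening tag and is exposed as Groups(1).
        ' [^>]* keeps the attribute match inside the opening tag.
        Dim entryRegex As New Regex(
            "<div[^>]*?class=""entry""[^>]*>(.*?)</div>",
            RegexOptions.Singleline Or RegexOptions.IgnoreCase)

        ' Each match corresponds to one entry div; Groups(1) holds its inner HTML.
        For Each m As Match In entryRegex.Matches(html)
            Console.WriteLine(m.Groups(1).Value.Trim())
        Next
    End Sub
End Module
```

This only holds while the divs stay un-nested; if the markup ever gains inner divs, the lazy group would stop at the first inner `</div>`, and an HTML parser would be the safer choice.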
Any help would be greatly appreciated.