I have the following HTML code, taken from a TV listings page:
<div class="channel_row">
<span class="channel">
<div class="logo"><img src ="/images/channel_logos/WGNAMER.png" /></div>
<p><strong>2</strong><br />
WGNAMER
</p>
</span>
<span class="time" style="width:0.0px;padding:0;height:42px;">
<div style="margin:10px">
<a class="thickbox" style="" href="/tv/info/?program_id=49909&height=260&width=612" title="WGN News at Nine">WGN News at Nine</a>
<p class="schedule_flags"><strong class="new_flag">New</strong>, <strong class="cc_flag">CC</strong>, <strong class="stereo_flag">Stereo</strong></p>
</div>
</span>
<span class="time" style="width:245.6px;padding:0;height:42px;">
<div style="margin:10px">
<a class="thickbox" style="" href="/tv/info/?program_id=49910&height=260&width=612" title="America's Funniest Home Videos">America's Funniest Home Videos</a>
<p class="schedule_flags"><strong class="cc_flag">CC</strong>, <strong class="stereo_flag">Stereo</strong></p>
</div>
</span>
</div>
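The channel_row block simply repeats over and over...
I have set up some VB code using HtmlAgilityPack, and I am hoping there is a quick and easy way to loop through all of these classes and get the logo image, the TV channel number, the station name, the HREF for the extended show description, and the show title.
So for the example above, the parsed output would look like: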
/images/channel_logos/WGNAMER.png
2
WGNAMER
/tv/info/?program_id=49909&height=260&width=612
WGN News at Nine
/tv/info/?program_id=49910&height=260&width=612
America's Funniest Home Videos
My VB code is:
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
Dim htmlString As String = "<div class=""channel_row"">" & _
"<span class=""channel"">" & _
"<div class=""logo""><img src =""/images/channel_logos/WELF.png"" /></div>" & _
"<p><strong>13</strong><br />" & _
"WELF" & _
"</p>" & _
"</span>" & _
"<span class=""time"" style=""width:245.6px;padding:0;height:42px;"">" & _
"<div style=""margin:10px"">" & _
"<a class=""thickbox"" style="""" href=""/tv/info/?program_id=35424&height=260&width=612"" title=""Praise the Lord"">Praise the Lord</a>" & _
"<p class=""schedule_flags""><strong class=""cc_flag"">CC</strong></p>" & _
"</div>" & _
"</span>" & _
"<span class=""time"" style=""width:122.8px;padding:0;height:42px;"">" & _
"<div style=""margin:10px"">" & _
"<a class=""thickbox"" style="""" href=""/tv/info/?program_id=35425&height=260&width=612"" title=""ACLJ This Week"">ACLJ This Week</a> " & _
"<p class=""schedule_flags""><strong class=""cc_flag"">CC</strong></p>" & _
"</div>" & _
"</span>" & _
"<span class=""time"" style=""width:122.8px;padding:0;height:42px;"">" & _
"<div style=""margin:10px"">" & _
"<a class=""thickbox"" style="""" href=""/tv/info/?program_id=35426&height=260&width=612"" title=""Full Flame"">Full Flame</a> " & _
"<p class=""schedule_flags""><strong class=""cc_flag"">CC</strong></p>" & _
"</div>" & _
"</span>" & _
"<span class=""time"" style=""width:0.0px;padding:0;height:42px;"">" & _
"<div style=""margin:10px"">" & _
"<a class=""thickbox"" style="""" href=""/tv/info/?program_id=35427&height=260&width=612"" title=""Secrets: Kim Clement"">Secrets: Kim Clement</a> " & _
"<p class=""schedule_flags""></p>" & _
"</div>" & _
"</span>" & _
"</div>"
Dim doc = New HtmlAgilityPack.HtmlDocument()
Dim htmlDocument As IHTMLDocument2 = New HTMLDocumentClass()
htmlDocument.write(htmlString)
htmlDocument.close()
doc.LoadHtml(String.Format(htmlString))
Dim res = doc.DocumentNode.SelectNodes("//div[@class='channel_row']")
For Each item In res
Dim firstDiv = item.SelectSingleNode(".//div[@class='channel']")
Dim content1 = firstDiv.ChildNodes(0).InnerText.Trim()
Dim content2 = firstDiv.ChildNodes(1).InnerText.Trim()
Dim content4 = item.SelectSingleNode(".//div[@class='myclass2']")
Next
End Sub
The error currently occurs on the line Dim content1 = firstDiv.ChildNodes(0).InnerText.Trim() and says:
Object reference not set to an instance of an object.
Any help would be great!
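A possible explanation for the null reference, offered as a sketch rather than a confirmed diagnosis: in the markup above, the element with class="channel" is a <span>, not a <div>, so SelectSingleNode(".//div[@class='channel']") has nothing to match and returns Nothing. The example below uses a trimmed copy of one channel_row to show both lookups. The module name ChannelRowSketch and the local variable names are made up for illustration, and reading the station name from the last child of the <p> assumes the markup keeps the shape shown in the question.

```vb
Imports System
Imports HtmlAgilityPack

Module ChannelRowSketch
    Sub Main()
        ' Trimmed-down copy of one channel_row from the question.
        Dim html As String = "<div class=""channel_row"">" & _
            "<span class=""channel"">" & _
            "<div class=""logo""><img src=""/images/channel_logos/WELF.png"" /></div>" & _
            "<p><strong>13</strong><br />WELF</p>" & _
            "</span>" & _
            "</div>"

        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html)

        Dim row As HtmlNode = doc.DocumentNode.SelectSingleNode("//div[@class='channel_row']")

        ' There is no <div class="channel"> in the markup, so this is Nothing.
        Dim asDiv As HtmlNode = row.SelectSingleNode(".//div[@class='channel']")
        Console.WriteLine(asDiv Is Nothing)

        ' The element that carries class="channel" is a <span>.
        Dim asSpan As HtmlNode = row.SelectSingleNode(".//span[@class='channel']")
        If asSpan IsNot Nothing Then
            Dim logo As HtmlNode = asSpan.SelectSingleNode(".//img")
            Dim number As HtmlNode = asSpan.SelectSingleNode(".//p/strong")
            Dim para As HtmlNode = asSpan.SelectSingleNode(".//p")

            ' Check each lookup before dereferencing it; a missing node is Nothing.
            If logo IsNot Nothing Then Console.WriteLine(logo.GetAttributeValue("src", ""))
            If number IsNot Nothing Then Console.WriteLine(number.InnerText.Trim())
            ' Assumption: the station name is the trailing text node inside <p>.
            If para IsNot Nothing AndAlso para.LastChild IsNot Nothing Then
                Console.WriteLine(para.LastChild.InnerText.Trim())
            End If
        End If
    End Sub
End Module
```

Selecting by element name and class, then checking for Nothing before touching ChildNodes, tends to fail more visibly than indexing into ChildNodes directly, since whitespace text nodes in the real page can shift those indexes.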
Update
Using the latest code suggestion:
Dim doc = New HtmlAgilityPack.HtmlDocument()
doc.LoadHtml(htmlString)
Dim all = new Dictionary(of String, Object)()
For Each channel In doc.DocumentNode.SelectNodes(".//div[@class='channel_row']")
Dim info = new Dictionary(of String, Object)()
With channel
info!Logo = .SelectSingleNode(".//img").Attributes("src").Value
info!Channel = .SelectSingleNode(".//span[@class='channel']").ChildNodes(1).ChildNodes(0).InnerText
info!Station = .SelectSingleNode(".//span[@class='channel']").ChildNodes(1).ChildNodes(2).InnerText
info!Shows = From tag In .SelectNodes(".//a[@class='thickbox']")
Select New With {.Show = tag.Attributes("title").Value, .Link = tag.Attributes("href").Value}
End With
all.Add(info!Station, info)
Next
all.Dump()
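There are 3 errors:
1) On the line Select New With {.Show = Tag.Attributes("title").Value, .Link = Tag.Attributes("href").Value}
the error is: 'Select Case' must end with a matching 'End Select'.
2) On the line all.Add(info!Station, info)
the error is: Statements and labels are not valid between 'Select Case' and first 'Case'.
3) On the line all.Dump()
the error is: 'Dump' is not a member of 'System.Collections.Generic.Dictionary(Of String, Object)'.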
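A hedged reading of these three errors, assuming a compiler older than Visual Basic 10 (Visual Studio 2010), which does not support implicit line continuation: the line that begins with Select is parsed as the start of a Select Case statement, which would account for errors 1 and 2, and Dump() is an extension method supplied by LINQPad, so it is not available in a regular project. The sketch below ends the From line with an explicit underscore and prints with Console.WriteLine instead. The module name ShowListSketch and the variable names are illustrative only.

```vb
Option Infer On ' needed so the anonymous-type query result is strongly typed

Imports System
Imports System.Linq
Imports HtmlAgilityPack

Module ShowListSketch
    Sub Main()
        ' Trimmed-down copy of one channel_row with a single show.
        Dim html As String = "<div class=""channel_row"">" & _
            "<span class=""time"">" & _
            "<a class=""thickbox"" href=""/tv/info/?program_id=35424&height=260&width=612"" title=""Praise the Lord"">Praise the Lord</a>" & _
            "</span>" & _
            "</div>"

        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html)

        For Each channel As HtmlNode In doc.DocumentNode.SelectNodes("//div[@class='channel_row']")
            ' The trailing underscore keeps From ... Select as one statement,
            ' so Select is not mistaken for the start of a Select Case block.
            Dim shows = From tag As HtmlNode In channel.SelectNodes(".//a[@class='thickbox']") _
                        Select New With {.Show = tag.Attributes("title").Value, .Link = tag.Attributes("href").Value}

            ' Console.WriteLine stands in for LINQPad's Dump().
            For Each entry In shows
                Console.WriteLine(entry.Link)
                Console.WriteLine(entry.Show)
            Next
        Next
    End Sub
End Module
```

Note that SelectNodes returns Nothing when no node matches, so code run against the real page may need a check before each For Each or query.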