
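I am trying to take a list of names and zip codes from Excel, enter them one name and zip code at a time into the search fields on www.yellowpages.com, and then return the resulting street addresses to Excel in the same order as the original names and zip codes. No error message is returned; it simply stops without finishing. I am not sure where it stops, but it does open Internet Explorer, enter the search terms, and click search, because I can see that happen when .Visible = True. My best guess is that the problem is in the part between the double quotes ("").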

Here is my code (adapted from DontFretBrett and Dinesh Kumar Takyar):

Sub Address_Scrape()
    Dim eRow As Long
    Dim ele As Object
    Dim wb As Workbook
    Dim srch As Worksheet
    Dim trgt As Worksheet
    Set wb = ThisWorkbook
    Set srch = wb.Sheets("Master with addresses")
    Set trgt = wb.Sheets("Sheet1")
    Dim url As String
    Dim zc As String
    Dim Name As String

    Name = srch.Range("B2")
    zc = srch.Range("F2")
    url = "URL;http://www.yellowpages.com/"
    url = url & "/" & zc & "/" & Name
    RowCount = 1
    trgt.Range("A" & RowCount) = "Name"
    trgt.Range("B" & RowCount) = "Address"
    trgt.Range("C" & RowCount) = "City"
    trgt.Range("D" & RowCount) = "State"
    trgt.Range("E" & RowCount) = "Zip"
    eRow = Sheet1.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
    Set objIE = CreateObject("InternetExplorer.Application")
    With objIE
        .navigate "http://www.yellowpages.com/"
        .Visible = True
        Do While .Busy Or _
            .readyState <> 4
            DoEvents
        Loop
        Set who = .document.getElementsByName("search_terms")
        who.Item(0).Value = Name
        Set where = .document.getElementsByName("geo_location_terms")
        where.Item(0).Value = zc
        .document.forms(0).submit
        Do While .Busy Or _
            .readyState <> 4
            DoEvents
        Loop
        "Results = .document.getElementsByTagName("p")(0).innerText"
        For Each ele In .document.all
            Select Case ele.tagName
                Case Results
                    RowCount = RowCount + 1
                Case "Name"
                    trgt.Range("A" & RowCount) = ele.getElementByclass("business-name").innerText
                Case "Address"
                    trgt.Range("B" & RowCount) = ele.getElementByclass("street-address").innerText
                Case "City"
                    trgt.Range("C" & RowCount) = Trim(ele.getElementByclass("locality").innerText)
                Case "State"
                    trgt.Range("D" & RowCount) = ele.getElementByitemprop("addressRegion").innerText
                Case "Zip"
                    trgt.Range("E" & RowCount) = ele.getElementByitemprop("postalCode").innerText
            End Select
        Next ele
        Set objIE = Nothing
    End With
End Sub

1 Answer


You basically want to scrape data from a Yellow Pages search.

Some time ago I made an Excel add-in for exactly this kind of scraping, without resorting to VBA: http://blog.tkacprow.pl/excel-scrape-html-add/

Let's start from the beginning. The GET URL structure is:

http://www.yellowpages.com/search?search_terms=[SEARCH_TERM]&geo_location_terms=[LOCATION]

where [SEARCH_TERM] and [LOCATION] are your GET parameters.
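
If you do stay in VBA, the same URL can be put together from the name and zip code cells. The following is only a sketch: UrlEncodeAscii and BuildSearchUrl are hypothetical helper names, not part of the add-in, and the encoder assumes plain ASCII input.

```vba
' Hypothetical helper: percent-encodes every character except the
' RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~").
' Assumes ASCII input only.
Function UrlEncodeAscii(ByVal s As String) As String
    Dim i As Long
    Dim ch As String
    Dim code As Long
    Dim result As String

    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        code = Asc(ch)
        Select Case code
            Case 48 To 57, 65 To 90, 97 To 122, 45, 46, 95, 126
                result = result & ch
            Case Else
                ' Two-digit hex, e.g. a space becomes %20
                result = result & "%" & Right$("0" & Hex$(code), 2)
        End Select
    Next i

    UrlEncodeAscii = result
End Function

' Hypothetical helper: fills the two GET parameters shown above.
Function BuildSearchUrl(ByVal searchTerm As String, ByVal location As String) As String
    BuildSearchUrl = "http://www.yellowpages.com/search?search_terms=" & _
                     UrlEncodeAscii(searchTerm) & _
                     "&geo_location_terms=" & UrlEncodeAscii(location)
End Function
```

For example, BuildSearchUrl("John Smith", "90210") returns http://www.yellowpages.com/search?search_terms=John%20Smith&geo_location_terms=90210, which you could pass to .navigate directly instead of filling in and submitting the form.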

Now, assuming you use the functions from the add-in, to get the text of the element with the class name "business-name", use the following function:

=GetElementByRegex("http://www.yellowpages.com/search?search_terms=[SEARCH_TERM]&geo_location_terms=[LOCATION]"; "class=""business-name""[^<>]*?>((?:.|\n)*?)<[^<>]*?/")

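No VBA, just a regular expression. Simply replace the GET parameters with your own. Of course, the regex may differ for other elements, but it is still simpler than writing the VBA from scratch.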
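
To see what that pattern captures, here is a rough sketch that runs it through the VBScript.RegExp object against a made-up HTML fragment. The fragment is hypothetical, and I am assuming the add-in's regex engine behaves the same way as VBScript.RegExp for this pattern.

```vba
Sub DemoBusinessNameRegex()
    ' Hypothetical HTML fragment, invented for illustration only;
    ' the real page markup may differ.
    Dim html As String
    html = "<a class=""business-name"" href=""/acme"">Acme Plumbing</a>"

    ' Late-bound regex object, so no extra reference is needed
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "class=""business-name""[^<>]*?>((?:.|\n)*?)<[^<>]*?/"
    re.Global = True

    ' The first capture group holds the text between the opening tag
    ' and the next closing tag
    Dim m As Object
    For Each m In re.Execute(html)
        Debug.Print m.SubMatches(0)   ' prints: Acme Plumbing
    Next m
End Sub
```

Note that if the element contains nested tags, the capture group will include them up to the first closing tag, so the pattern may need adjusting for such markup.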

Hope this helps.

answered 2014-10-10T19:19:38.033