
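My goal is to scrape the source code of a web page.

The site appears to use several frames, which I believe is why my code does not work.

I tried to adapt code I found online that is supposed to handle the frame problem.

The code below raises an "Object required" error at this line:

Set profileFrame = .document.getElementById("profileFrame")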

Public Sub IE_Automation()

 'Needs references to Microsoft Internet Controls and Microsoft HTML Object Library

Dim baseURL As String
Dim IE As InternetExplorer
Dim HTMLdoc As HTMLDocument
Dim profileFrame As HTMLIFrame
Dim slotsDiv As HTMLDivElement

'example URL with multiple frames
baseURL = "https://www.xing.com/search/members?section=members&keywords=IT&filters%5Bcontact_level%5D=non_contact"

Set IE = New InternetExplorer
With IE
    .Visible = True

     'Navigate to the main page

    .navigate baseURL & "/publictrophy/index.htm?onlinename=ace_anubis"
    While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend

     'Get the profileFrame iframe and navigate to it

    Set profileFrame = .document.getElementById("profileFrame")

    .navigate baseURL & profileFrame.src
    While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend

    Set HTMLdoc = .document
End With

 'Display all the text in the profileFrame iframe

MsgBox HTMLdoc.body.innerText

'Display just the text in the slots_container div

Set slotsDiv = HTMLdoc.getElementById("slots_container")
MsgBox slotsDiv.innerText

End Sub

2 Answers


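General:

I think that during your research you may have come across this question and misunderstood how it relates, or does not relate, to your situation.

I don't think iFrames are relevant to your query. If you are after the list of names, their details, and the URLs of their pages, you can use the code below.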


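CSS selectors

To target the elements of interest I use the following two CSS selectors. These use the styling information on the page to locate elements:

.SearchResults-link
.SearchResults-item

The "." means class, which is like calling .getElementsByClassName. The first selector gets the links; the second gets the description information shown on the first page.

Regarding the first CSS selector: the actual link required is built dynamically, but we can rely on the fact that every profile URL shares a common base string, "https://www.xing.com/profile/", followed by the profile name. So, in the function GetURL, we parse the outerHTML returned by the CSS selector to extract the profile name and concatenate it onto the BASESTRING constant to get the actual profile link.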
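To make the two ideas above concrete, here is a minimal sketch that runs without opening a browser. It assumes a reference to Microsoft HTML Object Library is set. The markup, the profile name Jane_Doe, and the procedure names CompareClassLookups and ShowProfileNameParsing are hypothetical stand-ins for illustration; the real page markup may differ.

```vba
' Needs a reference to Microsoft HTML Object Library (early binding).
Option Explicit

' Shows that a "." selector and getElementsByClassName target the same class.
Public Sub CompareClassLookups()
    Dim doc As New HTMLDocument
    Dim bySelector As Object, byClass As Object

    ' Hypothetical markup standing in for the search results page
    doc.body.innerHTML = "<div class=""SearchResults-item"">First</div>" & _
                         "<div class=""SearchResults-item"">Second</div>"

    Set bySelector = doc.querySelectorAll(".SearchResults-item")
    Set byClass = doc.getElementsByClassName("SearchResults-item")

    Debug.Print "querySelectorAll matches: " & bySelector.Length
    Debug.Print "getElementsByClassName matches: " & byClass.Length
End Sub

' Walks through the same split-and-strip steps that GetURL applies.
Public Sub ShowProfileNameParsing()
    Const BASESTRING As String = "https://www.xing.com/profile/"
    ' Hypothetical outerHTML of one link; assumed shape, not copied from the site
    Const SAMPLE_HTML As String = _
        "<a class=""SearchResults-link"" href=""/profile/Jane_Doe"">Jane Doe</a>"

    Dim parts() As String, profileName As String

    ' Splitting on "/" leaves the profile name at the start of the
    ' second-to-last piece: Jane_Doe">Jane Doe<
    parts = Split(SAMPLE_HTML, "/")

    ' Keep the text before the first ">" and remove the closing double quote
    profileName = Replace$(Split(parts(UBound(parts) - 1), ">")(0), Chr$(34), vbNullString)

    Debug.Print BASESTRING & profileName
End Sub
```

With the sample string above, ShowProfileNameParsing prints https://www.xing.com/profile/Jane_Doe. Note that the parsing depends on the profile name being the last path segment of the href; a link with an extra trailing segment would yield that segment instead, so it is worth checking the result against the live markup.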


Code:

Option Explicit
Public Sub GetInfo()
    Dim IE As New InternetExplorer
    With IE
        .Visible = True
        .navigate "https://www.xing.com/publicsearch/query?search%5Bq%5D=IT"

        While .Busy Or .readyState < 4: DoEvents: Wend

        Dim a As Object, exitTime As Date, linksNodeList As Object, profileNodeList As Object

'        exitTime = Now + TimeSerial(0, 0, 5) '<== uncomment this section if timing problems
'
'        Do
'            DoEvents
'            On Error Resume Next
'            Set linksNodeList = .document.querySelectorAll(".SearchResults-link")
'            On Error GoTo 0
'            If Now > exitTime Then Exit Do
'        Loop While linksNodeList Is Nothing

        Set linksNodeList = .document.querySelectorAll(".SearchResults-link") '<== comment this out if uncommented section above
        Set profileNodeList = .document.querySelectorAll(".SearchResults-item")

        Dim i As Long
        For i = 0 To profileNodeList.Length - 1
            Debug.Print "Profile link: " & GetURL(linksNodeList.item(i).outerHTML)
            Debug.Print "Basic info: " & profileNodeList.item(i).innerText
        Next i
    End With
End Sub

Public Function GetURL(ByVal htmlSection As String) As String
    Const BASESTRING As String = "https://www.xing.com/profile/"
    Dim arr() As String
    arr = Split(htmlSection, "/")
    GetURL = BASESTRING & Replace$(Split((arr(UBound(arr) - 1)), ">")(0), Chr$(34), vbNullString)
End Function

Sample of the information returned:

[Image: sample output]

answered 2018-06-10T07:50:07.410

Hmm, I'm not sure what you are doing here, but can you try the code below?

Option Explicit

Sub Sample()
    Dim ie As Object
    Dim links As Variant, lnk As Variant
    Dim rowcount As Long

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = True
    ie.navigate "https://www.xing.com/search/members?section=members&keywords=IT&filters%5Bcontact_level%5D=non_contact"

    'Wait for the page to finish loading (4 = READYSTATE_COMPLETE)
    Do While ie.Busy Or ie.readyState <> 4
       DoEvents
    Loop

    Set links = ie.document.getElementsByTagName("a")

    rowcount = 1

    With Sheets("Sheet1")
        For Each lnk In links
        'Debug.Print lnk.innerText
            'If lnk.classname Like "*Real Statistics Examples Part 1*" Then
                .Range("A" & rowcount) = lnk.innerText
                rowcount = rowcount + 1
                'Exit For
            'End If
        Next
    End With
End Sub
answered 2017-06-16T19:30:01.127