OK, here is the target web page: http://dnd.arkalseif.info/items/index.html_page=27
This is my current code:
Sub GetItemsList()
    ' This macro uses manually entered links to scrape the content of the target page.
    ' It does not (yet) capture hyperlinks, it only grabs text.
    Dim ie As Object
    Dim retStr As String
    Dim sht As Worksheet
    Dim LastRow As Long
    Dim rCell As Range
    Dim rRng As Range
    Dim Count As Long
    Dim Status As String
    Dim BadCount As Long
    Set sht = ThisWorkbook.Worksheets("List")
    BadCount = 0
    LastRow = sht.Cells(sht.Rows.Count, "A").End(xlUp).Row
    Set ie = CreateObject("internetexplorer.application")
    Set rRng = sht.Range("b1:b" & LastRow)
    Status = "Starting at row "
    For Each rCell In rRng.Cells
        Count = rCell.Row
        Application.StatusBar = BadCount & " dead links so far. " & Status & Count & " of " & LastRow & "."
        Wait 1
        If rCell = "" Then
            With ie
                .Navigate rCell.Offset(0, -1).Value
                .Visible = False
            End With
            Do While ie.Busy
                DoEvents
            Loop
            Wait 1
            On Error GoTo ErrHandler
            ' rCell.Value = ie.Document.getElementById("content").innerText
            rCell.Value = ie.Document.getElementsByClassName("common").innerText
            rCell.WrapText = False
            Status = "This row successfully scraped. Moving on to row "
            Application.StatusBar = BadCount & " dead links so far. " & Status & Count + 1 & " of " & LastRow & "."
            Status = "Previous row succeeded. Now at row "
98          Wait 1
        End If
    Next rCell
    If BadCount > 0 Then
        Application.StatusBar = "Macro finished running with " & BadCount & " errors."
    Else
        Application.StatusBar = "Finished."
    End If
    Exit Sub
ErrHandler:
    rCell.Value = ""
    Status = "Previous row failed. Moving on to row "
    BadCount = BadCount + 1
    Application.StatusBar = "This row is a dead link. " & BadCount & " dead links so far. Moving on to row " & Count + 1 & " of " & LastRow & "."
    Resume 98
End Sub
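(Try to ignore all my StatusBar updates. This code was originally written for a looooong list of hyperlinks, and at the time I needed to know when something went wrong.)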
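Now, the commented-out line works, in that it grabs the entire text of the div with id content. But what I want is the hyperlinks in the first column of the table that is nested inside that div (that's what the next line is for). That line just fails: Excel writes nothing, treats it as an error, and moves on to the next link.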
I think I need to tell Excel to look for the table class inside the div id, but I don't know how to do that and I haven't been able to figure it out.
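To make the goal concrete, here is an untested sketch of what I am after. It assumes the table sits inside the div with id content, carries the class common (the class name used in my failing line), and keeps its links in the first cell of each row. The Sub name and the variable names are made up for illustration. One thing I suspect about my failing line is that getElementsByClassName returns a collection rather than a single element, so it has no innerText of its own; the sketch avoids that call and filters tables by className instead.

```vba
Sub ListFirstColumnLinks()
    ' Untested sketch. Assumptions: <div id="content"> contains a
    ' <table class="common">, and each link is in the first cell of a row.
    Dim ie As Object
    Dim contentDiv As Object
    Dim tbl As Object
    Dim tblRow As Object
    Dim links As Object

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False
    ie.Navigate "http://dnd.arkalseif.info/items/index.html_page=27"
    Do While ie.Busy Or ie.ReadyState <> 4   ' 4 = READYSTATE_COMPLETE
        DoEvents
    Loop

    ' Start from the div, then look only at tables inside it.
    Set contentDiv = ie.Document.getElementById("content")
    For Each tbl In contentDiv.getElementsByTagName("table")
        If tbl.className = "common" Then
            For Each tblRow In tbl.Rows
                If tblRow.Cells.Length > 0 Then
                    ' Anchors in the first cell of this row only.
                    Set links = tblRow.Cells(0).getElementsByTagName("a")
                    If links.Length > 0 Then
                        ' Print link text and target to the Immediate window.
                        Debug.Print links(0).innerText, links(0).href
                    End If
                End If
            Next tblRow
        End If
    Next tbl

    ie.Quit
End Sub
```

If something like this works, the Debug.Print line could be swapped for writing the text and href into cells next to the source link.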
Thanks, everyone.
