
I was wondering if anyone could tell me how to extract "http://www.nbc.com/xyz" and "I love this show" from the following string in Excel VBA.

Thanks

<a href="http://www.nbc.com/xyz" >I love this show</a><IMG border=0 width=1 height=1 src="http://ad.linksynergy.com/fs-bin/show?id=Loe5O5QVFig&bids=261463.100016851&type=3&subid=0" >

2 Answers

Sub Tester()
    '### Requires a reference to "Microsoft HTML Object Library" (Tools > References in the VBA editor) ###
    Dim odoc As New MSHTML.HTMLDocument
    Dim el As Object
    Dim txt As String

    txt = "<a href=""http://www.nbc.com/xyz"" >I love this show</a>" & _
         "<IMG border=0 width=1 height=1 " & _
         "src=""http://ad.linksynergy.com/fs-bin/show?" & _
         "id=Loe5O5QVFig&bids=261463.100016851&type=3&subid=0"" >"

    odoc.body.innerHTML = txt

    Set el = odoc.getElementsByTagName("a")(0)
    Debug.Print el.innerText
    Debug.Print el.href

End Sub
answered 2012-10-04T19:54:31.613

One way is to use a regular expression. Another is to use Split to break the string apart on various delimiters, for example:

Option Explicit

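Sub splitMethod()
Dim htmlText As String

    'Assumes Sheet1.Range("A1").Value holds example string
    htmlText = Sheet1.Range("A1").Value
    'Text between the first pair of double quotes is the href value
    Debug.Print Split(htmlText, """")(1)
    'Text after the first ">" and before "</a" is the link text
    Debug.Print Split(Split(htmlText, ">")(1), "</a")(0)

End Sub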

Sub RegexMethod()
Dim htmlText As String
Dim oRegex As Object
Dim regexArr As Object
Dim rItem As Object

    'Assumes Sheet1.Range("A1").Value holds example string
    htmlText = Sheet1.Range("A1").Value

    'Late binding, so no project reference is needed
    Set oRegex = CreateObject("vbscript.regexp")
    With oRegex
        .Global = True
        'Match each value together with its leading and trailing delimiters
        .Pattern = "(href=""|>)(.+?)(""|</a>)"
        Set regexArr = .Execute(htmlText)

        'No lookbehind so replace unwanted chars
        .Pattern = "(href=""|>|""|</a>)"
        For Each rItem In regexArr
            Debug.Print .Replace(rItem.Value, vbNullString)
        Next rItem
    End With
End Sub

'Output:
'http://www.nbc.com/xyz
'I love this show

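The first pattern matches href=" or > at the start of each match and " or </a> at the end, with (.+?) lazily matching one or more characters (other than the \n newline) in between. Because VBScript regular expressions have no lookbehind, the second pattern then strips those delimiters from each match.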
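As a possible variation on the regex approach, the delimiters can be left out of the result by reading the capture groups through SubMatches instead of running a second replace pass. This is only a sketch: the Sub name SubMatchSketch and the pattern are illustrative and not part of the answer above, and it still assumes Sheet1.Range("A1") holds the example string.

```vba
'Hypothetical helper (not from the original answer): prints the href value
'and the link text of the first <a> element, using capture groups.
Sub SubMatchSketch()
Dim htmlText As String
Dim oRegex As Object
Dim matches As Object

    'Assumes Sheet1.Range("A1").Value holds example string
    htmlText = Sheet1.Range("A1").Value

    Set oRegex = CreateObject("vbscript.regexp")
    With oRegex
        .Global = False      'only the first anchor is needed here
        .IgnoreCase = True   'tolerate <A HREF=...> as well
        'Group 1: characters between the quotes after href=
        'Group 2: characters between the opening tag and </a>
        .Pattern = "<a\s[^>]*href=""([^""]*)""[^>]*>(.*?)</a>"
        Set matches = .Execute(htmlText)
    End With

    If matches.Count > 0 Then
        Debug.Print matches(0).SubMatches(0)
        Debug.Print matches(0).SubMatches(1)
    End If
End Sub
```

For the example string this should print the same two lines as the output above. Like any regex over HTML, it assumes a double-quoted href and no nested tags inside the link, so the MSHTML approach in the other answer is likely the more robust choice for arbitrary markup.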

answered 2012-10-04T19:36:44.993