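I am trying to write a macro that removes HTML tags from Excel data. I just want to search for the pattern <*> and replace every match with an empty string. I also need to remove special characters such as '“' and script fragments such as if(typeof(dstb)!= "undefined"){ dstb();}.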

The code I have written so far requires me to hard-code the file name in the macro, which I do not want.

Code:

    Sub UnescapeCharacters()
        Dim sheetname As String
        Dim sheet As Worksheet
        Dim cell As Range
        Dim Row As Long, Column As Long

        sheetname = "2011 Publications" 'file name goes here
        Set sheet = Me.Worksheets(sheetname)

        For Row = 1 To sheet.UsedRange.Rows.Count
            For Column = 1 To sheet.UsedRange.Columns.Count
                Set cell = sheet.Cells(Row, Column)

                ReplaceCharacter cell, "&quot;", """"
                ReplaceCharacter cell, "&#44;", ""
                ReplaceCharacter cell, "&nbsp;", ""
                ReplaceCharacter cell, "&bull;", ""

                ReplaceCharacter cell, "</ul>", ""
                ReplaceCharacter cell, "<ul>", ""

                ReplaceCharacter cell, "<b>", ""
                ReplaceCharacter cell, "</b>", ""

                ReplaceCharacter cell, "<i>", ""
                ReplaceCharacter cell, "</i>", ""

                ReplaceCharacter cell, "</li>", ""
                ReplaceCharacter cell, "<li>", ""

                ReplaceCharacter cell, "</br>", ""
                ReplaceCharacter cell, "<br />", ""

                ReplaceCharacter cell, "</p>", ""
                ReplaceCharacter cell, "<p>", ""
            Next Column
        Next Row
    End Sub

    Sub ReplaceCharacter(ByRef cell As Range, ByVal find As String, ByVal replacement As String)
        cell.Value = Replace(cell.Text, find, replacement, 1, -1)
    End Sub
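
For reference, the <*> search-and-replace described above could be sketched with a regular expression instead of one Replace call per tag. This is only a rough sketch under a few assumptions: the names StripTags and StripTagsOnActiveSheet are made up for illustration, the pattern <[^>]+> is used as a stand-in for the <*> wildcard, and the macro works on whichever worksheet is active when it runs, so no sheet or file name is hard-coded. It does not handle HTML entities such as &quot; or script text that appears outside of tags.

```vba
' Sketch only: StripTags and StripTagsOnActiveSheet are hypothetical names.
Option Explicit

' Returns source with everything between "<" and the next ">" removed.
Function StripTags(ByVal source As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   ' late-bound, no library reference needed
    re.Global = True                           ' replace every match, not just the first
    re.Pattern = "<[^>]+>"                     ' assumed regex equivalent of the <*> wildcard
    StripTags = re.Replace(source, "")
End Function

' Applies StripTags to every plain-text cell of the active worksheet.
Sub StripTagsOnActiveSheet()
    Dim cell As Range
    For Each cell In ActiveSheet.UsedRange.Cells
        ' Skip formulas, numbers, errors and blanks; only rewrite string values.
        If Not cell.HasFormula Then
            If VarType(cell.Value) = vbString Then
                cell.Value = StripTags(cell.Value)
            End If
        End If
    Next cell
End Sub
```

Because it relies on ActiveSheet, the sketch assumes a regular worksheet (not a chart sheet) is selected before the macro is started.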

Can anyone help?

1 Answer

I prefer to do it this way. If the HTML does not come from a live website, you can save the HTML to a file and then point IE.navigate at that file path.

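    Sub testing()
        Dim IE As Object
        Dim stringWithOutTags As String

        Set IE = CreateObject("InternetExplorer.Application")

        ' Hard-code the URL (or a local file path) here
        IE.navigate "http://stackoverflow.com/questions/13824872/writing-macro-in-excel-to-remove-html-code"

        ' Wait until the page has finished loading (readyState 4 = complete);
        ' DoEvents keeps Excel responsive while waiting
        Do While IE.Busy Or IE.readyState <> 4
            DoEvents
        Loop

        ' innerText is the rendered page text, without any tags
        stringWithOutTags = IE.document.DocumentElement.innerText

        IE.Quit
    End Sub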
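
The same idea (let an HTML parser produce innerText) can probably be applied to a string directly, without saving a file or opening a browser window. The following is a minimal sketch, assuming the late-bound "htmlfile" (MSHTML) object is available on the machine; the function name HtmlToText is made up for illustration. Since the text goes through an HTML parser, entities such as &quot; should come back decoded as well, but that has not been verified against the asker's data.

```vba
' Sketch only: HtmlToText is a hypothetical helper name.
Function HtmlToText(ByVal html As String) As String
    Dim doc As Object
    Set doc = CreateObject("htmlfile")   ' in-memory MSHTML document, no visible IE window
    doc.body.innerHTML = html            ' let the parser interpret the markup
    HtmlToText = doc.body.innerText      ' text content with the tags removed
End Function
```

It could then be called per cell, for example cell.Value = HtmlToText(cell.Value), for cells that hold HTML text.
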
answered 2012-12-12T07:48:01.963