I need to parse XML files that contain a lot of invalid characters. Here is the VB6/VBA code I use to parse a file and replace the invalid characters:
Dim xmldoc As MSXML2.DOMDocument
Dim xmlNode As MSXML2.IXMLDOMNode
Dim xmlNodeList As MSXML2.IXMLDOMNodeList
Dim XML As String
Dim fno As Integer
Dim ptr As Long

' get the XML file
fno = FreeFile
Open "input.xml" For Input As #fno
XML = Input(LOF(fno), fno)
Close #fno

TOP_OF_CODE:
Set xmldoc = New MSXML2.DOMDocument60
xmldoc.LoadXML XML
Set xmlNodeList = xmldoc.getElementsByTagName("*")
For Each xmlNode In xmlNodeList
    ' (a bunch of code to parse the XML)
Next xmlNode
If xmldoc.parseError.errorCode <> 0 And xmldoc.parseError.reason = "An invalid character was found in text content." & vbCrLf Then
    ' invalid character was found: overwrite it with "x" and parse again
    ptr = xmldoc.parseError.filepos
    XML = Left(XML, ptr - 1) & "x" & Mid(XML, ptr + 1)
    Set xmldoc = Nothing
    GoTo TOP_OF_CODE
End If
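Most of the time the code works exactly as intended: it removes the invalid characters one at a time, pass by pass, and then the file parses. Sometimes, though, it seems to get "stuck": the parser reports an invalid character at the same position on every pass, even though I have already replaced the character there with a valid one. I have tried substituting various characters for the invalid one, and I have also tried simply deleting the character at that position. I still get the invalid character error in the same place. Any clues?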
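
To narrow down the "stuck" case, it may help to look at what the string actually holds around the reported position. The following is only a minimal diagnostic sketch, not part of the code above: `DumpParseError` is a hypothetical helper name, and it makes no assumption about whether `filepos` is zero-based or one-based, or whether it counts bytes or characters, which is why it prints the neighbouring positions as well.

```vb
' Hypothetical diagnostic helper (not in the original code).
' Prints the parser's error details and the character codes in XML
' around the reported filepos, so they can be compared by eye.
Sub DumpParseError(ByVal perr As MSXML2.IXMLDOMParseError, ByVal XML As String)
    Dim i As Long

    If perr.errorCode = 0 Then Exit Sub   ' nothing to report

    Debug.Print "reason:  "; perr.reason
    Debug.Print "line:    "; perr.Line; "  linepos: "; perr.linepos
    Debug.Print "filepos: "; perr.filepos
    Debug.Print "srcText: "; perr.srcText

    ' Show the code units just before, at, and just after filepos.
    For i = perr.filepos - 1 To perr.filepos + 1
        If i >= 1 And i <= Len(XML) Then
            Debug.Print "pos"; i; " code &H"; Hex$(AscW(Mid$(XML, i, 1)))
        End If
    Next i
End Sub
```

It could be called as `DumpParseError xmldoc.parseError, XML` just before the line that rewrites `XML`. If the codes printed on a stuck pass already show the replacement character, that would suggest the position being overwritten is not the one the parser is complaining about.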