I am reading an XML file with the default Woodstox EventReader, like this:
XMLInputFactory.newInstance().createXMLEventReader(new FileInputStream(fileName));
If the input file happens to contain a Unicode NULL character somewhere in its text content, the following exception is thrown (stack trace):
WstxUnexpectedCharException.<init>(String, Location, char) line: 17
ValidatingStreamReader(StreamScanner).constructNullCharException() line: 604
ValidatingStreamReader(StreamScanner).throwInvalidSpace(int, boolean) line: 633
ValidatingStreamReader(BasicStreamReader).readTextSecondary(int, boolean) line: 4624
ValidatingStreamReader(BasicStreamReader).finishToken(boolean) line: 3661
ValidatingStreamReader(BasicStreamReader).next() line: 1063
WstxEventReader(Stax2EventReaderImpl).nextEvent() line: 255
I would like to avoid validating text content. Setting IS_VALIDATING on the XMLInputFactory does not solve the problem.
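For reference, here is a minimal sketch that should reproduce the failure. It uses only the standard javax.xml.stream API, so it runs against whichever StAX implementation is on the classpath (Woodstox in my case). The class name NullCharRepro and the helper parseError are made up for this example, and IS_VALIDATING is set to false here only to show that it makes no difference.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical repro class; not part of Woodstox or of my real code.
class NullCharRepro {

    // Reads every event from the given XML text.
    // Returns null if parsing succeeds, otherwise the exception message.
    static String parseError(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Explicitly turn off DTD validation; the NULL character is still rejected.
        factory.setProperty(XMLInputFactory.IS_VALIDATING, Boolean.FALSE);
        try {
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                reader.nextEvent();
            }
            reader.close();
            return null;
        } catch (XMLStreamException e) {
            // WstxUnexpectedCharException is a subclass of XMLStreamException.
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println("clean input: " + parseError("<root>ab</root>"));
        System.out.println("with NULL:   " + parseError("<root>a\u0000b</root>"));
    }
}
```

The first call returns null (no error); the second returns a parser error message, whose exact wording depends on the StAX implementation in use.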
Looking at the source code, it appears that next() in BasicStreamReader consults the "mValidateText" variable to decide whether to validate.
From the source:
/**
* Flag that indicates that textual content (CDATA, CHARACTERS) is to
* be validated within current element's scope. Enabled if one of
* validators returns {@link XMLValidator#CONTENT_ALLOW_VALIDATABLE_TEXT},
* and will prevent lazy parsing of text.
*/
protected boolean mValidateText = false;
I can't figure out how to change or set this value on the InputFactory or the EventReader. Perhaps I need to instruct the InputFactory to use a TypedStreamReader instead of a ValidatingStreamReader?
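One fallback I am considering, in case this check cannot be switched off, is to strip the NULL characters before the parser ever sees them. U+0000 is outside the character range XML 1.0 allows, so the rejection may be a well-formedness check rather than validation. Below is a rough sketch of that idea, not something taken from Woodstox: NulStrippingReader and its strip helper are names I made up, and it silently drops data, which may not be acceptable in every case.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical wrapper: removes U+0000 from a character stream.
class NulStrippingReader extends FilterReader {

    NulStrippingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c;
        do {
            c = super.read();
        } while (c == 0); // skip NULs; -1 (end of stream) ends the loop
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n <= 0) {
                return n; // end of stream
            }
            // Compact the chunk in place, keeping only non-NUL characters.
            int kept = 0;
            for (int i = 0; i < n; i++) {
                char ch = buf[off + i];
                if (ch != 0) {
                    buf[off + kept++] = ch;
                }
            }
            if (kept > 0) {
                return kept;
            }
            // The whole chunk was NULs; read the next chunk.
        }
    }

    // Convenience helper for trying the reader out on a string.
    static String strip(String s) {
        StringBuilder out = new StringBuilder();
        try (Reader r = new NulStrippingReader(new StringReader(s))) {
            char[] buf = new char[4];
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

In my case the wrapped reader would be passed to createXMLEventReader in place of the FileInputStream, for example around an InputStreamReader. That means I would have to pick the file's encoding myself instead of letting the parser detect it.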