In my XML, I have a multi-line element:
<tag id="sometag" ...>
| first line
| second line
| third line
| fourth line
<tag ...>
....
<tag id="someothertag" ...>
| ANOTHER FIRST LINE
| ANOTHER SECOND LINE
| ANOTHER THIRD LINE
| ANOTHER FORTH LINE
<tag ...>
Then in Java I have the necessary startElement, endElement, and characters methods, but I am seeing some strange behavior in characters:
public void characters(char[] ch, int start, int length){
    Log.d(TAG, "characters( \"" + (new String(ch)).replaceAll("[\r\n]", "\\n") + "\", " + start + ", " + length + " )");
}
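Other than that, I do nothing with the characters. I am basically creating two instances of the parser. In one instance, I search for sometag. If I find what I am looking for, I throw an exception and return that element.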
D/MyProgram( 1565): STARTING document parsing...
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | first line", 0, 20 )
D/MyProgram( 1565): characters( "n | first line", 0, 1 )
D/MyProgram( 1565): characters( " | second line", 0, 23 )
D/MyProgram( 1565): characters( "n | second line", 0, 1 )
D/MyProgram( 1565): characters( " | third line", 0, 26 )
D/MyProgram( 1565): characters( "n | third line", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 22 )
D/MyProgram( 1565): characters( "n | fourth lineline", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 4 )
D/MyProgram( 1565): Successfully found "sometag"!
...and in another, brand-new instance I look for someothertag. I do the same thing as before.
D/MyProgram( 1565): STARTING document parsing...
D/MyProgram( 1565): characters( "n", 0, 1 )
D/MyProgram( 1565): characters( " ", 0, 4 )
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | first line", 0, 20 )
D/MyProgram( 1565): characters( "n | first line", 0, 1 )
D/MyProgram( 1565): characters( " | second line", 0, 23 )
D/MyProgram( 1565): characters( "n | second line", 0, 1 )
D/MyProgram( 1565): characters( " | third line", 0, 26 )
D/MyProgram( 1565): characters( "n | third line", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 22 )
D/MyProgram( 1565): characters( "n | fourth lineline", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 4 )
D/MyProgram( 1565): Successfully found "someothertag"!
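I know XML parsing is stream-based (it parses chunks rather than the whole string), but this is very strange behavior. Here are some very confusing things I noticed: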
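- On each call to characters(), the parser does not seem to start or finish where it left off (if it really did finish parsing): I even get a character ("n", which is a replaced newline) in front of the contents of the first char array.
- ch has extra characters that were not there originally: "line" appended to "fourth line".
- When I create a brand-new parser instance, these characters are "re-read". The second run should look like this: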
D/MyProgram( 1565): characters( "n", 0, 1 )
D/MyProgram( 1565): characters( " ", 0, 4 )
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | ANOTHER FIRST LINE", 0, 20 )
D/MyProgram( 1565): characters( "n | ANOTHER SECOND LINE", 0, 1 )
... and so on.
Any idea what I am doing wrong? Thanks in advance.
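For comparison, here is a minimal sketch of a handler that reads only the slice of ch described by start and length. The SAX documentation says an application must not read the array outside that range, so the rest of ch may simply be leftover buffer contents. The class name TagTextFinder, the FoundException sentinel, and the find helper are hypothetical names made up for this illustration, and the sketch uses the desktop javax.xml.parsers API instead of my actual Android setup.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of the element whose id matches.
class TagTextFinder extends DefaultHandler {
    // Hypothetical sentinel exception, used only to stop parsing early.
    static final class FoundException extends SAXException {
        FoundException() { super("element found"); }
    }

    private final String wantedId;
    private final StringBuilder text = new StringBuilder();
    private int depth = 0; // > 0 while inside the wanted element

    TagTextFinder(String wantedId) { this.wantedId = wantedId; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (depth > 0) {
            depth++; // nested child of the wanted element
        } else if (wantedId.equals(atts.getValue("id"))) {
            depth = 1;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only ch[start .. start + length) belongs to this event. The rest
        // of the array is the parser's buffer and may hold stale characters.
        if (depth > 0) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (depth > 0 && --depth == 0) {
            throw new FoundException(); // wanted element is complete
        }
    }

    // Returns the text of the element with the given id, or null if absent.
    static String find(String xml, String id) {
        TagTextFinder handler = new TagTextFinder(id);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (FoundException e) {
            return handler.text.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return null;
    }
}
```

The text is appended to a StringBuilder because a parser is free to deliver one run of text across several characters() calls, so a single call cannot be assumed to hold a whole line.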