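When I use a SAX parser, parsing fails if the node content contains a " character. How can I resolve this? Do I need to convert all " characters?
In other words, whenever I have quotes in a node: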
<node>characters in node containing "quotes"</node>
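when the Handler parses it, the node's text is split into multiple character arrays. Is this normal behavior? Why would quotes cause this problem?
Here is the code I am using: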
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
...
HttpGet httpget = new HttpGet(GATEWAY_URL + "/"+ question.getId());
httpget.setHeader("User-Agent", PayloadService.userAgent);
httpget.setHeader("Content-Type", "application/xml");
HttpResponse response = PayloadService.getHttpclient().execute(httpget);
HttpEntity entity = response.getEntity();
if (entity != null)
{
    SAXParserFactory spf = SAXParserFactory.newInstance();
    SAXParser sp = spf.newSAXParser();
    XMLReader xr = sp.getXMLReader();
    ConvoHandler convoHandler = new ConvoHandler();
    xr.setContentHandler(convoHandler);
    xr.parse(new InputSource(entity.getContent()));
    entity.consumeContent();
    messageList = convoHandler.getMessageList();
}
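
For context on the splitting described above: the SAX ContentHandler contract allows a parser to deliver one element's text through several characters() calls, so a handler that assumes a single call per node can end up with truncated text. Whether a quote character is what triggers the split on a given parser is an assumption here, not something confirmed. The sketch below shows the usual buffering approach. NodeTextHandler and BufferedTextDemo are hypothetical names for illustration and are not the ConvoHandler from the code above, whose source is not shown.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class BufferedTextDemo {

    // Hypothetical handler: collects every characters() chunk for an element
    // and only reads the combined text once endElement() fires.
    static class NodeTextHandler extends DefaultHandler {
        private final StringBuilder buffer = new StringBuilder();
        private String nodeText;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            // Start a fresh buffer for each element.
            buffer.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called more than once per element, so append, never assign.
            buffer.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // The element is complete here, so the buffer holds its full text.
            if ("node".equals(qName)) {
                nodeText = buffer.toString();
            }
        }
    }

    // Parses the given XML string and returns the text of the last <node> element.
    static String parseNodeText(String xml) throws Exception {
        SAXParser sp = SAXParserFactory.newInstance().newSAXParser();
        NodeTextHandler handler = new NodeTextHandler();
        sp.parse(new InputSource(new StringReader(xml)), handler);
        return handler.nodeText;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseNodeText("<node>characters in node containing \"quotes\"</node>"));
    }
}
```

If the handler already buffers this way and the text is still cut off, the cause is likely elsewhere, for example in how the response stream is read.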