First of all, you have to extract well-formed XML from the text blob you receive.
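This boils down to two operations: stripping the HTTP header lines that surround the payload, and repairing the malformed atom:link element.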
Both can be done by preprocessing the raw text with regular expressions before parsing. For your input, the following expressions work:
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class XMLTest {

    static String data = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" + "\n" + " Info: POST /Remindz_api/user/loginHTTP/1.1\n"
            + " Host: www.narola.co \n" + " Accept: www.narola.co.beepz.api+xml\n" + " HTTP 1.1 200 OK \n"
            + " Content-Type: www.narola.co.beepz.api+xml;\n" + " Allow : GET,POST\n" + "\n" + " <user id=\"43\">\n"
            + " <firstname>Dfdf</firstname>\n" + " <lasttname>p2</lasttname>\n" + " <email>p</email>\n"
            + " <telephone>2236</telephone>\n" + " <created_on>2013-01-04 04:38:05</created_on>\n"
            + " <atom:link <a href=\"http://www.narola.co/remindz/reminders/43\"></a> />\n" + " </user>";

    public static void main(final String[] args) {
        /*
         * This strips off "Param:Value"-style lines. Because [^<]* also
         * matches newlines, a single match can swallow several consecutive
         * header lines, stopping at the last newline before the next '<'.
         */
        String xmlData = data.replaceAll(" *[a-z\\-A-Z]* *:[^<]*\n", "");
        /*
         * This strips off any remaining "HTTP ..." status line
         */
        xmlData = xmlData.replaceAll(" *HTTP .*\n", "");
        /*
         * Correct atom:link format: turn the broken self-closing tag
         * into a proper element that wraps the <a> link
         */
        xmlData = xmlData.replaceAll("<atom:link (.*) />", "<atom:link>$1</atom:link>");
        try {
            DocumentBuilder newDocumentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = newDocumentBuilder.parse(new ByteArrayInputStream(xmlData.getBytes(StandardCharsets.UTF_8)));
            Element elem = doc.getDocumentElement();
            dump("", elem);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /* Recursively prints a node and all of its descendants, indenting one space per level */
    public static void dump(final String pad, final Node node) {
        System.out.println(pad + node.toString());
        // getChildNodes() never returns null; a leaf node yields an empty list
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            dump(pad + " ", children.item(i));
        }
    }
}
The resulting text is well-formed XML that can be fed to a DOM parser:
<?xml version="1.0" encoding="UTF-8"?>
<user id="43">
<firstname>Dfdf</firstname>
<lasttname>p2</lasttname>
<email>p</email>
<telephone>2236</telephone>
<created_on>2013-01-04 04:38:05</created_on>
<atom:link><a href="http://www.narola.co/remindz/reminders/43"></a></atom:link>
</user>
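
Once the text parses, the values can be read straight from the DOM. The following is only a minimal sketch of that step, not part of the code above: the class name `CleanedXmlDemo` and the helpers `parseRoot` and `childText` are made up for illustration, and the cleaned XML is inlined as a constant so the example runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

public class CleanedXmlDemo {

    // The cleaned-up XML shown above, inlined so the sketch is self-contained.
    static final String CLEANED =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<user id=\"43\">\n"
            + "<firstname>Dfdf</firstname>\n"
            + "<lasttname>p2</lasttname>\n"
            + "<email>p</email>\n"
            + "<telephone>2236</telephone>\n"
            + "<created_on>2013-01-04 04:38:05</created_on>\n"
            + "<atom:link><a href=\"http://www.narola.co/remindz/reminders/43\"></a></atom:link>\n"
            + "</user>";

    // Hypothetical helper: parses the text and returns the root element.
    // Parse failures are rethrown as an unchecked exception to keep callers short.
    static Element parseRoot(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Text is not parseable as XML", e);
        }
    }

    // Hypothetical helper: text of the first descendant element with the given tag name.
    static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    public static void main(String[] args) {
        // false matches the DocumentBuilderFactory default used in the answer's code
        Element user = parseRoot(CLEANED, false);
        System.out.println(user.getAttribute("id"));       // prints 43
        System.out.println(childText(user, "firstname"));  // prints Dfdf
        Element link = (Element) user.getElementsByTagName("a").item(0);
        System.out.println(link.getAttribute("href"));     // prints http://www.narola.co/remindz/reminders/43
    }
}
```

One caveat: the `atom` prefix is never declared in this payload. The default, non-namespace-aware parser does not mind, but if you call `setNamespaceAware(true)` the parse should fail on the unbound prefix unless you also add an `xmlns:atom` declaration during preprocessing.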