
I made a GET request and stored the response in a String named response:

HttpClient client = new DefaultHttpClient(); 
String getURL = "some_url_with_param_values";
HttpGet get = new HttpGet(getURL);
HttpResponse responseGet = client.execute(get);  
HttpEntity resEntityGet = responseGet.getEntity();   
String response = EntityUtils.toString(resEntityGet);

But I am only interested in the <div> elements that have the class name product-data, i.e. <div class="product-data">. So I did this:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
InputSource is;
builder = factory.newDocumentBuilder();
is = new InputSource(new StringReader(xml));
Document doc = builder.parse(is);
NodeList list = doc.getElementsByTagName("product-data"); //I even tried: (div class="product-data)
String test = list.item(0).getNodeValue(); //Just to test it

Unfortunately, it did not work. Any help would be greatly appreciated.


My response string is basically an HTML page:

<!DOCTYPE html .....
<html>
<head>
    //some script tags
</head>
<body>
    //some tags
    <div class="product-data">
        //some other tags
    </div>
    //some tags
    <div class="product-data">
        //some other tags
    </div>
    ....
</body>  
</html>                 

1 Answer


I think you should try using getElementsByClassName('product-data').

If that does not work, you can always take a look at Jsoup, a library that provides a simple way to extract HTML elements from a web page:

DefaultHttpClient client = new DefaultHttpClient();
HttpGet get = new HttpGet(url.toURI());
HttpResponse resp = client.execute(get);

String content = EntityUtils.toString(resp.getEntity());
Document doc = Jsoup.parse(content);
Elements ele = doc.select("div.classname");

This example performs an HTTP GET and then extracts all div elements that have the class "classname"; from there you can do whatever you like with them.
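As for why the attempt in the question finds nothing: getElementsByTagName matches element names, so passing the class value "product-data" never matches a <div>, and getNodeValue() returns null for element nodes anyway. If you would rather stay with the standard DOM parser, here is a rough sketch. It assumes the page is well-formed XML, which real-world HTML often is not (the main reason Jsoup tends to be the more robust choice). The names ProductDataExtractor and extractDivText are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original post.
class ProductDataExtractor {

    // Returns the trimmed text of every <div> whose class attribute contains className.
    // Assumption: the input is well-formed XML; malformed HTML will be rejected.
    static List<String> extractDivText(String xhtml, String className) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

            // Ask for the element name ("div"), not the class value.
            NodeList divs = doc.getElementsByTagName("div");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < divs.getLength(); i++) {
                Element div = (Element) divs.item(i);
                // The class attribute may hold several space-separated names.
                List<String> classes =
                        Arrays.asList(div.getAttribute("class").trim().split("\\s+"));
                if (classes.contains(className)) {
                    // getNodeValue() is null for elements; getTextContent() gives the nested text.
                    result.add(div.getTextContent().trim());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body>"
                + "<div class=\"header\">skip me</div>"
                + "<div class=\"product-data\">first</div>"
                + "<div class=\"product-data featured\">second</div>"
                + "</body></html>";
        System.out.println(extractDivText(page, "product-data"));
    }
}
```

If the parser throws on your actual response, that is a sign the page is not valid XML, and an HTML-tolerant parser such as Jsoup is the safer route.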

Answered 2012-10-18T06:58:16.190