Viewed 12498 times
3 answers
5
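This works for me. It checks the Content-Encoding header and wraps the stream in a GZIPInputStream only when the response is gzip-compressed: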
// Needs java.io.*, java.net.URL, java.net.URLConnection and java.util.zip.GZIPInputStream.
private static String getWebPageSource(String sURL) throws IOException {
    URL url = new URL(sURL);
    URLConnection urlCon = url.openConnection();
    InputStream stream = urlCon.getInputStream();
    // Decompress only when the server reports a gzip-encoded body.
    if ("gzip".equalsIgnoreCase(urlCon.getContentEncoding())) {
        stream = new GZIPInputStream(stream);
    }
    StringBuilder sb = new StringBuilder();
    // try-with-resources closes the reader even if reading fails.
    try (BufferedReader in = new BufferedReader(new InputStreamReader(stream))) {
        String inputLine;
        // readLine() strips line terminators, so the result has no line breaks.
        while ((inputLine = in.readLine()) != null) {
            sb.append(inputLine);
        }
    }
    return sb.toString();
}
Answered 2013-10-10T10:46:10.127
2
Try reading it this way:
private static String getUrlSource(String pageUrl) throws IOException {
    URL url = new URL(pageUrl);
    URLConnection urlConn = url.openConnection();
    // Decode the bytes with an explicit charset instead of the platform default.
    BufferedReader in = new BufferedReader(new InputStreamReader(
            urlConn.getInputStream(), "UTF-8"));
    String inputLine;
    StringBuilder a = new StringBuilder();
    while ((inputLine = in.readLine()) != null)
        a.append(inputLine);
    in.close();
    return a.toString();
}
Set the encoding to match the one the web page uses. Note this line:
    BufferedReader in = new BufferedReader(new InputStreamReader(
            urlConn.getInputStream(), "UTF-8"));
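If you would rather not hard-code "UTF-8", one option is to read the charset from the Content-Type response header and fall back to a default when it is missing. This is only a minimal sketch: the class name CharsetFromHeader and the helper charsetOf are made up for illustration, and it assumes the server actually sends a charset parameter (many pages declare it only in a meta tag, which this does not look at).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetFromHeader {
    // Hypothetical helper: extracts the charset parameter from a Content-Type
    // value such as "text/html; charset=ISO-8859-1", or returns the fallback.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive check for a "charset=" prefix (8 characters).
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(charsetOf("text/html", StandardCharsets.UTF_8));
    }
}
```

With a URLConnection you could then pass charsetOf(urlConn.getContentType(), StandardCharsets.UTF_8) as the second argument of the InputStreamReader constructor instead of the string literal.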
Answered 2013-10-10T10:42:14.903
0
First, decompress the content with GZIPInputStream. Then wrap the decompressed stream in an InputStreamReader and read it with a BufferedReader.
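Those two steps can be tried without any network access by compressing a string in memory and reading it back. This is a rough sketch using only java.util.zip and java.io; the class GzipBodyDemo and the helpers readBody and gzip are hypothetical names, and UTF-8 is assumed as the page encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipBodyDemo {
    // Hypothetical helper: wraps the raw stream in a GZIPInputStream when the
    // encoding is "gzip", then reads it line by line (UTF-8 is an assumption).
    static String readBody(InputStream raw, String contentEncoding) throws IOException {
        InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(raw)
                : raw;
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() drops the terminator, so add one back per line.
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Stand-in for a server: gzip-compresses text in memory.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("<html>\n<body>hello</body>\n</html>");
        System.out.print(readBody(new ByteArrayInputStream(compressed), "gzip"));
    }
}
```

Against a real response, the first argument would be the connection's or entity's input stream and the second the value of its Content-Encoding header.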
The example below uses Apache HttpClient 4.1.1.
Maven dependency:
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.1.1</version>
</dependency>
Sample code to read gzip content:
package com.gzip.simple;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.zip.GZIPInputStream;
import org.apache.http.Header;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;

public class GZIPFetcher {
    public static void main(String[] args) {
        try {
            DefaultHttpClient httpClient = new DefaultHttpClient();
            HttpGet getRequest = new HttpGet("http://excite.com/education");
            getRequest.addHeader("accept", "application/json");
            HttpResponse response = httpClient.execute(getRequest);
            if (response.getStatusLine().getStatusCode() != 200) {
                throw new RuntimeException("Failed : HTTP error code : "
                        + response.getStatusLine().getStatusCode());
            }
            InputStream instream = response.getEntity().getContent();
            // Check whether the content-encoding is gzip or not.
            Header contentEncoding = response
                    .getFirstHeader("Content-Encoding");
            if (contentEncoding != null
                    && contentEncoding.getValue().equalsIgnoreCase("gzip")) {
                instream = new GZIPInputStream(instream);
            }
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    instream));
            String content;
            System.out.println("Output from Server .... \n");
            while ((content = in.readLine()) != null)
                System.out.println(content);
            httpClient.getConnectionManager().shutdown();
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Answered 2013-10-10T10:42:25.097