String html = Request.Get("https://kokos.pl/")
        .execute().returnContent().asString();

System.out.println(html);

On line 12 of the printed HTML I get:

<title>Szybkie po??yczki got??wkowe, po??yczki spo??eczno??ciowe - Kokos.pl</title>

whereas it should be:

<title>Szybkie pożyczki gotówkowe, pożyczki społecznościowe - Kokos.pl</title>

1 Answer

[DEBUG] DefaultClientConnection - Sending request: GET / HTTP/1.1
[DEBUG] headers - >> GET / HTTP/1.1
[DEBUG] headers - >> Host: kokos.pl
[DEBUG] headers - >> Connection: Keep-Alive
[DEBUG] headers - >> User-Agent: Apache-HttpClient/4.2.5 (java 1.5)
[DEBUG] DefaultClientConnection - Receiving response: HTTP/1.1 200 OK
[DEBUG] headers - << HTTP/1.1 200 OK
[DEBUG] headers - << Server: nginx
[DEBUG] headers - << Date: Thu, 01 Aug 2013 12:04:12 GMT
[DEBUG] headers - << Content-Type: text/html
[DEBUG] headers - << Connection: keep-alive
...

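The response message the server returns for this URI does not explicitly specify the charset of the content (note the bare Content-Type: text/html header in the log above). In that case HttpClient is forced to fall back to the default charset for HTTP content, which is ISO-8859-1 and not UTF-8.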
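To illustrate the effect, here is a minimal sketch using only the JDK, with no HttpClient involved. It assumes the page body is actually sent as UTF-8 bytes; the class name `CharsetMismatchDemo` and the helper `decode` are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatchDemo {

    // Hypothetical helper: decode raw response bytes with the given charset.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // "pożyczki", with ż written as an escape so the example does not
        // depend on the encoding of the source file.
        String original = "po\u017Cyczki";

        // Assumption: the server sends the text as UTF-8 bytes.
        byte[] body = original.getBytes(StandardCharsets.UTF_8);

        String wrong = decode(body, StandardCharsets.ISO_8859_1);
        String right = decode(body, StandardCharsets.UTF_8);

        // ż takes two bytes in UTF-8, so the ISO-8859-1 decoding yields
        // two unrelated characters in its place.
        System.out.println(wrong.length() + " chars when decoded as ISO-8859-1");
        System.out.println(right.length() + " chars when decoded as UTF-8");
        System.out.println("round trip intact: " + right.equals(original));
    }
}
```

Each Polish letter turns into two wrong characters, which would explain why the question shows two `?` per letter once the console cannot render them.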

Unfortunately, the only way to override the default content charset used by the fluent API is to use a custom response handler:

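import java.io.IOException;
import org.apache.http.Consts;
import org.apache.http.HttpResponse;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.fluent.Request;
import org.apache.http.util.EntityUtils;

// Decode the body as UTF-8 whenever the response declares no charset.
ResponseHandler<String> myHandler = new ResponseHandler<String>() {
    @Override
    public String handleResponse(
            final HttpResponse response) throws IOException {
        return EntityUtils.toString(response.getEntity(), Consts.UTF_8);
    }
};

// Pass the handler to the fluent API instead of calling returnContent().
String html = Request.Get("https://kokos.pl/").execute().handleResponse(myHandler);

System.out.println(html);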
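
As far as I know, the charset passed to EntityUtils.toString is only a default: a charset declared in the Content-Type header still takes precedence. The following is a rough sketch of that decision in plain Java, not HttpClient's actual code; the class `CharsetFallback` and the method `resolveCharset` are hypothetical names used only for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetFallback {

    // Hypothetical helper: return the charset named in a Content-Type
    // header value, or the fallback when none is declared or usable.
    static Charset resolveCharset(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String param = part.trim();
            // Case-insensitive match on the "charset=" parameter name.
            if (param.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = param.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Malformed or unsupported charset name.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Header as seen in the log: no charset, so the fallback applies.
        System.out.println(resolveCharset("text/html", StandardCharsets.UTF_8));
        // A declared charset wins over the fallback.
        System.out.println(
                resolveCharset("text/html; charset=ISO-8859-1", StandardCharsets.UTF_8));
    }
}
```

With the header from the log (`text/html`), the fallback is used, which is the case the handler above is meant to cover.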
answered 2013-08-01T12:16:38.730