java - Apache HttpClient and a remote file at a URL with the https scheme

Question

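I am using version 4.2.5 of AutoRetryHttpClient from org.apache.httpcomponents to download a PDF file from a URL whose scheme is https. The code is written in NetBeans 7.3 and uses JDK 7.

Suppose the (fictional) PDF resource is located at https://www.thedomain.with/my_resource.pdf. I have the following code: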

SchemeRegistry registry = new SchemeRegistry();
try {
    final SSLSocketFactory sf = new SSLSocketFactory(new TrustStrategy() {
        @Override
        public boolean isTrusted(X509Certificate[] chain, String authType)
                throws CertificateException {
            return true;
        }
    });

    registry.register(new Scheme("https", 3920, sf));
} catch (NoSuchAlgorithmException | KeyManagementException | KeyStoreException | UnrecoverableKeyException ex) {
    Logger.getLogger(HttpConnection.class.getName()).log(Level.SEVERE, null, ex);
}
// Here I create the client.
HttpClient client = new AutoRetryHttpClient(
        new DefaultHttpClient(new PoolingClientConnectionManager(registry)),
        new DefaultServiceUnavailableRetryStrategy(
                5,     // max number of retries
                100)); // retry interval

HttpResponse httpResponse = null;
try {
    HttpGet httpget = new HttpGet("https://www.thedomain.with/my_resource.pdf");
    // I set headers and a Mozilla User-Agent
    httpResponse = client.execute(httpget);
} catch (IOException ex) {
}
... // other lines of code to get and save the file, not really important since the code is never reached

When I call client.execute, the following exception is thrown:

org.apache.http.conn.HttpHostConnectException: Connection to https://www.thedomain.with refused

What should I do to retrieve that PDF resource?

PS: I can download it through a browser, so there is a way to get the file.

score 0 · Accepted Answer

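There seem to be a couple of problems:

You register the Scheme with 3920 as its default port, which is a non-standard port for HTTPS. If the server really were running on that port, you would have to access it in the browser with this URL: https://www.thedomain.with:3920/my_resource.pdf. Since the URL you use in the browser does not include port 3920, the server must be running on the default port 443, so you should change new Scheme("https", 3920, sf) to new Scheme("https", 443, sf).
The CN in your server's certificate does not seem to match its hostname, which causes an SSLPeerUnverifiedException. To make it work, use the SSLSocketFactory(TrustStrategy, X509HostnameVerifier) constructor and pass a verifier that does not perform this check. Apache provides AllowAllHostnameVerifier for this purpose.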
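
To illustrate the first point, here is a small JDK-only sketch of how a port is resolved when the URL does not name one. The class and method names (DefaultPortCheck, effectivePort) are made up for this example. It uses java.net.URL rather than HttpClient: a browser falls back to 443 for https in the same way, whereas HttpClient 4.2 falls back to whatever default port was registered for the Scheme, which is 3920 in the question's code.

```java
import java.net.MalformedURLException;
import java.net.URL;

public class DefaultPortCheck {

    /** Returns the explicit port of the URL, or the protocol default if none is given. */
    static int effectivePort(String spec) throws MalformedURLException {
        URL url = new URL(spec);
        // getPort() returns -1 when the URL carries no explicit port.
        return url.getPort() != -1 ? url.getPort() : url.getDefaultPort();
    }

    public static void main(String[] args) throws MalformedURLException {
        // No port in the URL: the https default applies.
        System.out.println(effectivePort("https://www.thedomain.with/my_resource.pdf"));      // prints 443
        // Explicit port in the URL: that port is used.
        System.out.println(effectivePort("https://www.thedomain.with:3920/my_resource.pdf")); // prints 3920
    }
}
```

So a "Connection refused" on the port-less URL is consistent with the client trying 3920 while the server only listens on 443.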

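Note: you really should not use a no-op TrustStrategy and HostnameVerifier in production code, because doing so effectively turns off all security checks related to authenticating the remote server and leaves you vulnerable to impersonation attacks.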
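
With that caveat in mind, the two changes might be combined roughly as follows. This is only a sketch for test environments, assuming httpclient 4.2.x is on the classpath; the class name LenientSchemeRegistry and the method create() are invented for this example and are not part of the library.

```java
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;

import org.apache.http.conn.scheme.Scheme;
import org.apache.http.conn.scheme.SchemeRegistry;
import org.apache.http.conn.ssl.AllowAllHostnameVerifier;
import org.apache.http.conn.ssl.SSLSocketFactory;
import org.apache.http.conn.ssl.TrustStrategy;

public class LenientSchemeRegistry {

    /** Builds a registry whose https scheme uses port 443 and skips all certificate checks. */
    static SchemeRegistry create() throws GeneralSecurityException {
        // Accepts every certificate chain. Insecure: for testing only.
        TrustStrategy trustAll = new TrustStrategy() {
            @Override
            public boolean isTrusted(X509Certificate[] chain, String authType) {
                return true;
            }
        };

        // The second argument disables the hostname/CN check as well.
        SSLSocketFactory sf = new SSLSocketFactory(trustAll, new AllowAllHostnameVerifier());

        SchemeRegistry registry = new SchemeRegistry();
        // 443 is the standard HTTPS port, used when the URL names no port.
        registry.register(new Scheme("https", 443, sf));
        return registry;
    }
}
```

The returned registry can be passed to new PoolingClientConnectionManager(registry) exactly as in the question's code.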
