I wrote a Selenium/Java script to find broken images on a website. Out of 34 images, 2 return error code 505 (HTTP Version Not Supported).

Here is my code:

    HttpURLConnection huc = null;
    int respCode = 200;
    huc = (HttpURLConnection) (new URL(url).openConnection());
    huc.setRequestMethod("HEAD");

    huc.setConnectTimeout(2000);
    huc.connect();
    respCode = huc.getResponseCode();

    if (respCode >= 400) {
        System.out.println(url + " is a broken with error code:" + respCode);
    } else {
        System.out.println(url + " is a good");
    }

505 Errors:
https://s7d4.scene7.com/is/image/DuPontCorteva/IMG-Soja-CampoDeSojaConFocoYDesenfoque-0_71-1 Desktop(new)?$callToActionCard_tablet$ is a broken with error code:505

https://s7d4.scene7.com/is/image/DuPontCorteva/IMG-Cenital-VistaCenitalDeCampo-0_71-1 Desktop(new)?$callToActionCard_desktop$ is a broken with error code:505


A few successful responses:
https://s7d4.scene7.com/is/image/DuPontCorteva/IMG-Campo-PersonasCaminandoEnElCampoConAtardecer-0_71-1 Desktop(new)?$callToActionCard_desktop$ is a good
https://s7d4.scene7.com/is/image/DuPontCorteva/IMG-Campo-PersonasCaminandoEnElCampoConAtardecer-0_71-1 Desktop(new)?$callToActionCard_tablet$ is a good
https://s7d4.scene7.com/is/image/DuPontCorteva/IMG-Campo-PersonasCaminandoEnElCampoConAtardecer-0_71-1 Desktop(new)?$callToActionCard_mobile$ is a good
https://s7d4.scene7.com/is/image/DuPontCorteva/IMG-Soja-CampoDeSojaConFocoYDesenfoque-0_71-1 Desktop(new)?$callToActionCard_desktop$ is a good 
2 Answers

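Since you are using Selenium with Java, you should already have the OkHttp library "for free" as one of Selenium's transitive dependencies.

So you can change your image-checking logic as follows: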

    OkHttpClient client = new OkHttpClient().newBuilder().build();
    Request request = new Request.Builder().url(url).method("HEAD", null).build();
    Response response = client.newCall(request).execute();
    int respCode = response.code();

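The OkHttp client handles URL encoding automatically. Your current request fails because the URL contains special characters that are not allowed in a URL.

You may also want to fetch the browser cookies and add them to your request, since your endpoint may require cookie-based authentication.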
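A minimal sketch of the cookie idea, assuming the cookies have already been collected as name/value pairs (with Selenium they could be read from `driver.manage().getCookies()` via `getName()` and `getValue()`). The class name `CookieHeader` and the sample cookie names are hypothetical, used only for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: joins cookie name/value pairs into the
// "name=value; name2=value2" form used by the Cookie request header.
final class CookieHeader {
    static String build(Map<String, String> cookies) {
        return cookies.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("; "));
    }

    public static void main(String[] args) {
        // Sample values are made up; real ones would come from the browser session.
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put("sessionId", "abc123");
        cookies.put("locale", "es");
        System.out.println(build(cookies));
    }
}
```

The resulting string could then be attached with `header("Cookie", value)` on the OkHttp `Request.Builder`, or with `setRequestProperty("Cookie", value)` on an `HttpURLConnection`.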

answered 2019-05-20T16:56:53.520

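The problem is the URL: it contains a space. For a general solution you can use Java's URL encoding; see "Java URL encoding of query string parameters".

In your case, simply replace the space with %20, as follows: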
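One possible way to do this generically, as a sketch rather than the method from the linked answer: `java.net.URLEncoder` is meant for form data and turns a space into `+`, so this example swaps in the multi-argument `java.net.URI` constructor, which percent-encodes illegal characters in each component. The class name `UrlEscaper` is hypothetical. It assumes the input is not already encoded, because that constructor always quotes `%` (an existing `%20` would become `%2520`).

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper: percent-encodes characters that are illegal in a URL
// (such as spaces) while leaving the overall structure intact.
final class UrlEscaper {
    static String escape(String rawUrl) {
        try {
            // java.net.URL parses leniently, so it can split a URL that contains spaces.
            URL url = new URL(rawUrl);
            // The multi-argument URI constructor quotes illegal characters per component.
            URI uri = new URI(url.getProtocol(), url.getAuthority(),
                    url.getPath(), url.getQuery(), url.getRef());
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot escape URL: " + rawUrl, e);
        }
    }

    public static void main(String[] args) {
        String raw = "https://s7d4.scene7.com/is/image/DuPontCorteva/"
                + "IMG-Soja-CampoDeSojaConFocoYDesenfoque-0_71-1 Desktop(new)"
                + "?$callToActionCard_tablet$";
        // The space becomes %20; the parentheses and '$' are legal and stay as they are.
        System.out.println(escape(raw));
    }
}
```

The escaped string can then be passed to `new URL(...)` in the original checking code in place of the raw `src` value.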

    String url = "https://s7d4.scene7.com/is/image/DuPontCorteva/IMG-Soja-CampoDeSojaConFocoYDesenfoque-0_71-1%20Desktop(new)?$callToActionCard_tablet$";
    HttpURLConnection huc = null;
    int respCode = 200;
    huc = (HttpURLConnection) (new URL(url).openConnection());
    huc.setRequestMethod("HEAD");

    huc.setConnectTimeout(2000);
    huc.connect();
    respCode = huc.getResponseCode();

    if (respCode >= 400) {
        System.out.println(url + " is a broken with error code:" + respCode);
    } else {
        System.out.println(url + " is a good");
    }

Output:

https://s7d4.scene7.com/is/image/DuPontCorteva/IMG-Soja-CampoDeSojaConFocoYDesenfoque-0_71-1%20Desktop(new)?$callToActionCard_tablet$ is a good
answered 2019-05-20T16:42:55.727