I am trying to access this URL from my Java program, but I get this strange message instead of the page content I expect.

How can I avoid this?

<!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html>
 <head> 
  <title>303 See Other</title> 
 </head>
 <body> 
  <h1>See Other</h1> 
  <p>The answer to your request is located <a href="https://www.wikidata.org/wiki/Special:EntityData/P26">here</a>.</p>  
 </body>
</html>

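I can navigate to the page in a browser without any problem, though. Is there a function or library I can use to get the same behavior from my Java program?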

for (String url : list_of_relation_URLs) 
{
    //System.out.println( url );

    //go to relation url
    String URL_czech = url;

    System.out.println( url  );

    URL wikidata_page = new URL(URL_czech);
    HttpURLConnection wiki_connection = (HttpURLConnection)wikidata_page.openConnection();
    InputStream wikiInputStream = null;

    try 
    {
        // try to connect and use the input stream
        wiki_connection.connect();
        wikiInputStream = wiki_connection.getInputStream();
    } 
    catch(IOException error) 
    {
        // failed, try using the error stream
        wikiInputStream = wiki_connection.getErrorStream();
    }

    // parse the input stream using Jsoup
    Document docx = Jsoup.parse(wikiInputStream, null, wikidata_page.getProtocol()+"://"+wikidata_page.getHost()+"/");

    System.out.println( docx.toString() );  
}

I am trying to do the opposite of what happens here.

2 Answers

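When you receive a 303 status code, you simply need to make a second request to the URL that the 303 response gives you.

The new URL is stored in the Location header.

In your case you need to keep following the redirects until you get a status code that is not a redirect, because you are redirected twice.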

303: Location: https://www.wikidata.org/wiki/Special:EntityData/P26

303: Location: https://www.wikidata.org/wiki/Property:P26
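
The "keep following until the status is no longer a redirect" idea can be sketched roughly like this. It is only a sketch under a few assumptions: the class name `RedirectFollower`, the helper names and the hop limit `MAX_REDIRECTS` are made up for illustration, and it treats 301, 302, 303, 307 and 308 as redirects for a plain GET request.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

final class RedirectFollower {

    // Hypothetical cap on hops, so a redirect cycle cannot loop forever.
    static final int MAX_REDIRECTS = 5;

    /** True for the redirect status codes this sketch follows. */
    static boolean isRedirect(int status) {
        return status == HttpURLConnection.HTTP_MOVED_PERM   // 301
            || status == HttpURLConnection.HTTP_MOVED_TEMP   // 302
            || status == HttpURLConnection.HTTP_SEE_OTHER    // 303
            || status == 307
            || status == 308;
    }

    /** Resolves a Location value (absolute or relative) against the current URL. */
    static String resolve(String currentUrl, String location) {
        return URI.create(currentUrl).resolve(location).toString();
    }

    /** Follows redirects by hand and returns the first non-redirect connection. */
    static HttpURLConnection open(String startUrl) throws IOException {
        String url = startUrl;
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            // Turn off automatic following so every hop goes through this loop.
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            if (!isRedirect(status)) {
                return conn;
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect " + status + " without Location header from " + url);
            }
            url = resolve(url, location);
        }
        throw new IOException("Too many redirects starting at " + startUrl);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java RedirectFollower <url>");
            return;
        }
        HttpURLConnection conn = open(args[0]);
        System.out.println(conn.getResponseCode() + " " + conn.getURL());
    }
}
```

The connection returned by `open` can then be handed to whatever reads the body, for example `Jsoup.parse(conn.getInputStream(), null, conn.getURL().toString())`.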

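Yes... if you are using an HttpURLConnection, you can ask it to do this for you. Note that HttpURLConnection does not follow a redirect that switches protocol (for example from http to https), so in that case you still have to read the Location header and make the next request yourself.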

conn.setInstanceFollowRedirects(true);
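
If you are on Java 11 or later, another option is `java.net.http.HttpClient`, whose `Redirect.NORMAL` policy follows redirects except from HTTPS down to HTTP. The following is a minimal sketch, not a drop-in replacement for your loop; the class name `RedirectingFetch` and its method names are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class RedirectingFetch {

    /** Builds a client that follows redirects, except from https down to http. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    /** Fetches the body at the given URL as text, following redirects on the way. */
    static String fetch(String url) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                newClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java RedirectingFetch <url>");
            return;
        }
        System.out.println(fetch(args[0]));
    }
}
```

The returned string can be passed to `Jsoup.parse(String html, String baseUri)` in place of the input stream.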
answered 2015-04-28T13:52:49.140

Here is a complete example that follows the redirect manually:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RedirectExample {
    public static void main(String[] args) {
        try {
            String url = "http://www.twitter.com";

            URL obj = new URL(url);
            HttpURLConnection conn = (HttpURLConnection) obj.openConnection();
            conn.setReadTimeout(5000);
            conn.addRequestProperty("Accept-Language", "en-US,en;q=0.8");
            conn.addRequestProperty("User-Agent", "Mozilla");
            conn.addRequestProperty("Referer", "google.com");

            System.out.println("Request URL ... " + url);

            // normally, 3xx is redirect
            int status = conn.getResponseCode();
            boolean redirect = status == HttpURLConnection.HTTP_MOVED_TEMP
                    || status == HttpURLConnection.HTTP_MOVED_PERM
                    || status == HttpURLConnection.HTTP_SEE_OTHER;

            System.out.println("Response Code ... " + status);

            if (redirect) {
                // get redirect url from "Location" header field
                String newUrl = conn.getHeaderField("Location");

                // get the cookie if needed, for login
                String cookies = conn.getHeaderField("Set-Cookie");

                // open the new connection; this handles a single redirect only,
                // so repeat the status check if the target redirects again
                conn = (HttpURLConnection) new URL(newUrl).openConnection();
                if (cookies != null) {
                    conn.setRequestProperty("Cookie", cookies);
                }
                conn.addRequestProperty("Accept-Language", "en-US,en;q=0.8");
                conn.addRequestProperty("User-Agent", "Mozilla");
                conn.addRequestProperty("Referer", "google.com");

                System.out.println("Redirect to URL : " + newUrl);
            }

            // read the response body line by line; the reader is closed automatically
            StringBuilder html = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String inputLine;
                while ((inputLine = in.readLine()) != null) {
                    html.append(inputLine);
                }
            }

            System.out.println("URL Content... \n" + html.toString());
            System.out.println("Done");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
answered 2015-04-28T14:13:12.290