
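I want to use the HttpClient class to fetch the Google hit counts for several terms in a row, but Google's servers reject the repeated requests. Can you help? Here is my code; the parameter Concept is the term I want to search for.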

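// Imports required by this method (Apache Commons HttpClient 3.x).
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URLEncoder;
import org.apache.commons.httpclient.DefaultHttpMethodRetryHandler;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpException;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;
import org.apache.commons.httpclient.params.HttpMethodParams;

public static double extractGoogleCount(String Concept)
    {
    double temp = 0;
    HttpClient httpClient = new HttpClient();
    GetMethod getMethod = null;
    try
    {
        // URL-encode the term so spaces and special characters survive.
        String url = "http://www.google.com/search?hl=en&newwindow=1&q="
            + URLEncoder.encode(Concept, "UTF-8")
            + "&aq=f&aqi=&aql=&oq=&gs_rfai=";
        getMethod = new GetMethod(url);
        getMethod.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
            new DefaultHttpMethodRetryHandler());
        int statusCode = httpClient.executeMethod(getMethod);
        if (statusCode != HttpStatus.SC_OK)
        {
            // Do not try to parse an error page.
            System.err.println("Method failed: "
                + getMethod.getStatusLine() + " " + url);
            return temp;
        }
        // BufferedReader replaces the deprecated DataInputStream.readLine().
        BufferedReader reader = new BufferedReader(new InputStreamReader(
            getMethod.getResponseBodyAsStream(), "UTF-8"));
        String returnPage;
        while ((returnPage = reader.readLine()) != null)
        {
            int index = returnPage.indexOf("<div id=\"resultStats\">");
            if (index == -1)
            {
                continue;
            }
            // Clamp the end index so short lines do not throw.
            String sub = returnPage.substring(index,
                Math.min(index + 100, returnPage.length()));
            String[] result = sub.split(" ");
            if (sub.indexOf("About") >= 0)
            {
                // "...>About 1,230,000 results": the count is the third token.
                String number = result[2].replaceAll(",", "");
                temp = Double.parseDouble(number);
            } else
            {
                // "...>5 results": the count follows '>' in the second token.
                String number = result[1].substring(result[1]
                    .indexOf(">") + 1).replaceAll(",", "");
                System.out.println("number:" + number);
                temp = Double.parseDouble(number);
            }
            break;
        }
    } catch (HttpException e)
    {
        System.out.println("Please check your provided http address!");
        e.printStackTrace();
    } catch (IOException e)
    {
        e.printStackTrace();
    } catch (Exception e)
    {
        // Parsing failures (unexpected page layout) end up here.
        e.printStackTrace();
    } finally
    {
        httpClient.getState().clear();
        if (getMethod != null)
        {
            getMethod.releaseConnection();
        }
    }
    // Single exit point: every path now returns a value.
    return temp;
    }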

1 Answer


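Google only allows a limited number of requests per second from a single client. Try adding:

 Thread.sleep(200);

to your code between consecutive requests, and it should work. Note that Thread.sleep throws the checked InterruptedException, so the call has to be wrapped in a try/catch or the method has to declare it. You may also want to do the fetching on a separate thread, so that the program can keep doing other work if it needs to display this data in some way.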
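As a rough sketch of both suggestions together (a pause between requests, and the fetching moved to a worker thread), something like the following might work. The class name ThrottledFetcher, the method fetchAll, and the stand-in lookup function are hypothetical names used only for illustration; in real use you would pass a function that calls extractGoogleCount. The 200 ms delay is just the value suggested above, not a documented limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not part of HttpClient.
class ThrottledFetcher {

    // Calls fetch for each term in order, sleeping delayMillis between
    // consecutive calls (no sleep before the first one).
    static List<Double> fetchAll(List<String> terms,
                                 Function<String, Double> fetch,
                                 long delayMillis) throws InterruptedException {
        List<Double> counts = new ArrayList<>();
        for (int i = 0; i < terms.size(); i++) {
            if (i > 0) {
                Thread.sleep(delayMillis);
            }
            counts.add(fetch.apply(terms.get(i)));
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real lookup (assumption): returns the term length.
        Function<String, Double> fakeLookup = term -> (double) term.length();

        // A single worker thread keeps the requests sequential.
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<List<Double>> job =
                () -> fetchAll(Arrays.asList("java", "httpclient"), fakeLookup, 200);
            Future<List<Double>> pending = worker.submit(job);

            // The calling thread is free to do other work here.

            // get() blocks until the worker has finished.
            System.out.println(pending.get());
        } finally {
            worker.shutdown();
        }
    }
}
```

Using a single-threaded executor means the requests never overlap, so the delay between them is at least the configured pause. If requests are still rejected, a longer delay may be needed.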

answered 2013-04-02T08:39:44.023