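I want to get the estimated result count for certain Google search queries (over the whole web) from Java code.

I only need to run a few queries per day, so at first the Google Web Search API, although deprecated, seemed good enough (see e.g. How can you search Google Programmatically Java API). But as it turned out, the numbers this API returns are very different from those returned by www.google.com (see e.g. http://code.google.com/p/google-ajax-apis/issues/detail?id=32). So these numbers are useless to me.

I also tried the Google Custom Search engine, which has the same problem.

What do you think is the simplest solution for my task?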

2 Answers

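What you can do is perform an actual Google search programmatically. The simplest way is to request the URL https://www.google.com/search?q=QUERY_HERE and then scrape the result count from the returned page.

Here is a quick example of how to do that: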

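    import java.io.IOException;
    import java.net.URL;
    import java.net.URLConnection;
    import java.net.URLEncoder;
    import java.util.Scanner;

    private static long getResultsCount(final String query) throws IOException {
        final URL url = new URL("https://www.google.com/search?q=" + URLEncoder.encode(query, "UTF-8"));
        final URLConnection connection = url.openConnection();
        connection.setConnectTimeout(60000);
        connection.setReadTimeout(60000);
        // Send a browser-like User-Agent with the request.
        connection.addRequestProperty("User-Agent", "Mozilla/5.0");
        final String marker = "<div id=\"resultStats\">";
        // try-with-resources closes the Scanner on every exit path.
        try (Scanner reader = new Scanner(connection.getInputStream(), "UTF-8")) {
            while (reader.hasNextLine()) {
                final String line = reader.nextLine();
                final int start = line.indexOf(marker);
                if (start < 0) {
                    continue;
                }
                // Take the text between the marker and the next tag, then keep only the digits.
                final String rest = line.substring(start + marker.length());
                final int end = rest.indexOf('<');
                final String digits = (end >= 0 ? rest.substring(0, end) : rest).replaceAll("[^\\d]", "");
                if (!digits.isEmpty()) {
                    // long, because the count can exceed Integer.MAX_VALUE.
                    return Long.parseLong(digits);
                }
            }
        }
        return 0; // marker not found
    }

To use it, you can do the following:

    final long count = getResultsCount("horses");
    System.out.println("Estimated number of results for horses: " + count);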
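
The extraction step in the example above can be separated from the network call, so it can be checked against a saved copy of the page. The following is a minimal sketch, not a definitive implementation: the class name `ResultStatsParser` and the method `parseResultCount` are hypothetical names introduced for illustration, and it assumes the page still contains an element with `id="resultStats"`, which may not hold for Google's current markup.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the result count from already-downloaded HTML.
final class ResultStatsParser {
    // Assumption: the count is the first text node inside the element with id="resultStats".
    private static final Pattern RESULT_STATS =
            Pattern.compile("id=\"resultStats\"[^>]*>([^<]*)<");

    static OptionalLong parseResultCount(String html) {
        Matcher matcher = RESULT_STATS.matcher(html);
        if (!matcher.find()) {
            return OptionalLong.empty(); // marker not present
        }
        // Drop thousands separators and surrounding words, keep digits only.
        String digits = matcher.group(1).replaceAll("[^\\d]", "");
        if (digits.isEmpty()) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(digits));
        } catch (NumberFormatException e) {
            return OptionalLong.empty(); // too many digits for a long
        }
    }

    public static void main(String[] args) {
        String sample = "<div id=\"resultStats\">About 1,500,000,000 results<nobr> (0.45 seconds)</nobr></div>";
        System.out.println(parseResultCount(sample));          // prints OptionalLong[1500000000]
        System.out.println(parseResultCount("<html></html>")); // prints OptionalLong.empty
    }
}
```

Returning an `OptionalLong` instead of 0 lets the caller tell "no count found" apart from a genuine zero, and `long` avoids overflow for counts above `Integer.MAX_VALUE`.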
answered 2013-08-13T15:07:08.823

/** @author RAJESH Kharche */
// Open NetBeans
// Choose Java -> Project
// Name it GoogleSearchAPP

package googlesearchapp;

import java.io.*;
import java.net.*;
import java.util.*;
import java.util.logging.Level;
import java.util.logging.Logger;

public class GoogleSearchAPP {
    public static void main(String[] args) {
        try {
            final Scanner input = new Scanner(System.in);
            System.out.println("Enter Query to search: "); // get the query to search
            // nextLine() keeps multi-word queries intact; next() would stop at the first space
            final String query = input.nextLine();
            final long result = getResultsCount(query);

            System.out.println("Results:" + result);
        } catch (IOException ex) {
            Logger.getLogger(GoogleSearchAPP.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

    private static long getResultsCount(final String query) throws IOException {
        final URL url = new URL("https://www.google.com/search?q=" + URLEncoder.encode(query, "UTF-8"));
        final URLConnection connection = url.openConnection();

        connection.setConnectTimeout(60000);
        connection.setReadTimeout(60000);
        connection.addRequestProperty("User-Agent", "Google Chrome/36"); // put the browser name/version

        final String marker = "\"resultStats\">";

        // Scan the HTTP response line by line; try-with-resources closes the Scanner on every exit path
        try (Scanner reader = new Scanner(connection.getInputStream(), "UTF-8")) {
            while (reader.hasNextLine()) {
                final String line = reader.nextLine();

                // Look for the "resultStats" field, because the number we want follows it
                final int start = line.indexOf(marker);
                if (start < 0) {
                    continue;
                }

                // Take the text up to the next tag and strip everything except digits
                final String rest = line.substring(start + marker.length());
                final int end = rest.indexOf('<');
                final String digits = (end >= 0 ? rest.substring(0, end) : rest).replaceAll("[^\\d]", "");
                if (!digits.isEmpty()) {
                    // Parse as long, because the count can exceed Integer.MAX_VALUE
                    return Long.parseLong(digits);
                }
            }
        }
        return 0; // "resultStats" field not found
    }
}
answered 2014-11-29T08:34:44.403