
Using: Rails 3.1.1

I'm using the googleajax gem to run Google searches from a script that performs several thousand searches.

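After about 20 searches, I have to rescue the failure, wait, and retry, because it seems you can't run more than a certain number of searches in a row. After about a minute, the retry lets the script continue for roughly another 10 searches. The net effect is that every 10 searches take about a minute, which makes the script very slow.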

Google probably limits the number of searches one can perform (based on IP? based on the googleajax referrer?), but is there a way around this?

What can I do to run Google searches through the googleajax gem without having to pause and wait all the time? What are my options?

Code (unimportant parts removed):

            begin
              puts "Searching with #{gsquery}"
              results = GoogleAjax::Search.web(gsquery)[:results]
              if results.count > 0
                puts "#{results.count} results found for #{page.name}. Registering the connection!"
              end
            rescue Timeout::Error
              # Listed first so the generic rescue below does not shadow it.
              puts "Timeout Error, sleep 15 sec"
              sleep 15
              retry
            rescue StandardError
              # Any other failure (e.g. the search being refused): wait briefly and retry.
              puts "Try again in 3 sec"
              sleep 3
              retry
            end

2 Answers


Sorry, but I think you're out of luck. First, GoogleAjax uses the web search API, which has been deprecated for over a year and may disappear at any point, making the gem useless. Second, both the web search API and its replacement are limited to a maximum number of queries per day, beyond which the service simply stops responding; for the Custom Search API the limit is 100 queries a day. To get more than that you'll have to pay (the rate is $5 per 1000 searches). The rate limit is based on the number of queries associated with a single API key.
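Because the service stops responding once the limit is hit, an unbounded `retry` never finishes. As a rough sketch (not part of GoogleAjax), a retry helper can cap the number of attempts and lengthen the wait each time. The helper name `with_bounded_retry` and its parameters `max_attempts`, `base_delay` and `sleeper` are made up for illustration; the defaults are arbitrary.

```ruby
# Runs the given block, retrying on StandardError with exponentially
# growing waits, and re-raises once max_attempts is reached.
# All names and defaults here are illustrative assumptions.
def with_bounded_retry(max_attempts: 5, base_delay: 3, sleeper: Kernel.method(:sleep))
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError
    # Give up instead of looping forever when the quota is exhausted.
    raise if attempts >= max_attempts
    # Wait base_delay, then 2x, 4x, ... before the next attempt.
    sleeper.call(base_delay * 2**(attempts - 1))
    retry
  end
end

# Hypothetical usage with the call from the question:
#   results = with_bounded_retry { GoogleAjax::Search.web(gsquery)[:results] }
```

The `sleeper` argument only exists so the waiting can be swapped out (for example in a test); by default it calls `Kernel#sleep`.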

I'd suggest that you:

  1. Use the google-api-client gem instead of GoogleAjax (it uses the Custom Search API, which replaces the web search API).
  2. Get an API key for the Custom Search API using Google's API console.
  3. Consider enabling billing. Half a cent per search is not terrible: at $5 per 1000 searches, every 2,000 searches cost $10.
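To make the steps above concrete, here is a minimal sketch that calls the Custom Search JSON API directly over HTTPS with Ruby's standard library. Note that this swaps in plain `Net::HTTP` rather than the google-api-client gem, so it is not that gem's interface. The method names are hypothetical, `api_key` and `engine_id` are placeholders for your own credentials, and the cost helper simply applies the $5 per 1000 rate quoted above.

```ruby
require 'uri'
require 'net/http'
require 'json'

CUSTOM_SEARCH_ENDPOINT = 'https://www.googleapis.com/customsearch/v1'

# Builds the request URI. api_key and engine_id are placeholders for the
# API key and the custom search engine ID (the "cx" parameter).
def custom_search_uri(query, api_key:, engine_id:)
  uri = URI(CUSTOM_SEARCH_ENDPOINT)
  uri.query = URI.encode_www_form(key: api_key, cx: engine_id, q: query)
  uri
end

# Performs one search and returns the parsed "items" array
# (an empty array when the response contains no results).
def custom_search(query, api_key:, engine_id:)
  uri = custom_search_uri(query, api_key: api_key, engine_id: engine_id)
  response = Net::HTTP.get_response(uri)
  raise "Search failed: HTTP #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  JSON.parse(response.body).fetch('items', [])
end

# Cost in dollars at the rate quoted in this answer (an assumption that
# every search is billed; any free daily quota is ignored).
def estimated_cost(searches, dollars_per_thousand: 5.0)
  searches * dollars_per_thousand / 1000
end
```

For example, `estimated_cost(2000)` gives `10.0`, which matches the half-a-cent-per-search figure.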
answered 2012-01-06T14:51:48.030

I found this neat little gem very handy in my latest project: Ruby - Google Search API

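Here is a simple use case that searches for images. If the item's name is not an empty string, it runs a search with the item's name and renders the first 5 images. If the item's name is an empty string, it does nothing.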

    - if item.name != ""
      - Google::Search::Image.new(:query => item.name).first(5).each do |image|
        = image_tag(image.uri)
answered 2014-01-23T03:02:06.180