I have a simple script to check for bad URLs:

def self.check_prod_links
  require 'net/http'
  results = []
  Product.find_each(:conditions =>{:published => 1}) do |product|
    url = product.url 
    id = product.id
    uri = URI(url)
    begin
      response = Net::HTTP.get_response(uri)
    rescue
      begin
        http = Net::HTTP.new(uri.host, uri.port)
        http.use_ssl = true
        http.verify_mode = OpenSSL::SSL::VERIFY_NONE
        request = Net::HTTP::Get.new(uri.request_uri)
        response = http.request(request)
      rescue
        begin
          response = Net::HTTP.get_response("http://" + uri)  
        rescue => e
          p "Problem getting url: #{url} Error Message: #{e.message}"
        end
      end
    end
    p "Checking URL = #{url}. ID = #{id}. Response Code = #{response.code}" 
    unless response.code.to_i == 200
      product.update_attribute(:published, 0) 
      results << product
    end
  end
  return results
end

How can I handle malformed URLs such as hkbfksrhf.google.com so that the script does not crash with the following error:

getaddrinfo: nodename nor servname provided, or not known

I just want the task to run to the end and print any and all errors for responses that are not HTTP 200 or 301.

Thanks!

1 Answer

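Is open-uri an option? It raises an exception when it gets a 404 or 500 (or any other HTTP error), in addition to the SocketError raised for unreachable hosts, which lets you clean up the code a bit: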

def self.check_prod_links
  require 'open-uri'
  results = []

  Product.where(:published => 1).each do |product|
    url = product.url
    id = product.id
    failed = true

    begin
      # open-uri adds #open to HTTP(S) URIs; Kernel#open stopped accepting URIs in Ruby 3.0.
      URI(url).open
      failed = false
    rescue OpenURI::HTTPError => e
      # Raised for HTTP error statuses such as 404 or 500.
      error_message = e.message
      response_message = "Response Code = #{e.io.status[0]}"
    rescue SocketError => e
      # Raised when the host cannot be resolved (the getaddrinfo error).
      error_message = e.message
      response_message = "Host unreachable"
    rescue => e
      error_message = e.message
      response_message = "Unknown error"
    end

    if failed
      Rails.logger.error "Problem getting url: #{url} Error Message: #{error_message}"
      Rails.logger.error "Checking URL = #{url}. ID = #{id}. #{response_message}"

      # Unpublish the product and collect it for the report.
      product.update_attribute(:published, 0)
      results << product
    end
  end

  results
end
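
If you would rather stay with Net::HTTP, the same idea can be sketched as a small helper that never raises and treats 200 and 301 as healthy, as the question asks. This is only a sketch under assumptions: `check_url`, `OK_CODES` and the `fetcher` parameter are hypothetical names that appear in neither the question nor the answer, and the injectable `fetcher` exists only so the logic can be tried without network access.

```ruby
require 'net/http'
require 'uri'

# Status codes treated as healthy; 200 and 301 are taken from the question.
OK_CODES = %w[200 301].freeze

# Hypothetical helper: returns [ok, message] for a URL and never raises.
# `fetcher` defaults to a real HTTP GET but can be swapped out for testing.
def check_url(url, fetcher: ->(uri) { Net::HTTP.get_response(uri) })
  url = url.to_s
  # Prepend a scheme when it is missing, e.g. "hkbfksrhf.google.com".
  url = "http://#{url}" unless url.match?(%r{\A[a-z][a-z0-9+.-]*://}i)
  uri = URI.parse(url)
  response = fetcher.call(uri)
  [OK_CODES.include?(response.code), "Response Code = #{response.code}"]
rescue URI::InvalidURIError => e
  [false, "Invalid URL: #{e.message}"]
rescue SocketError => e
  # getaddrinfo failures (unknown host) end up here.
  [false, "Host unreachable: #{e.message}"]
rescue StandardError => e
  [false, "Unknown error: #{e.message}"]
end
```

Inside the loop you could then write `ok, message = check_url(product.url)` and unpublish the product only when `ok` is false, so a single bad host no longer aborts the whole run.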
Answered 2012-07-20T04:46:48.777