My application connects to a web page, reads the links on it, and then scrapes the page behind each link.

Main WEB_PAGE --> LINK (sub_web) --> scrape info (title)

scraper = Scraper.define do
  array :items

  process "div.mozaique>div", :items => Scraper.define {
    process "div.thumb>a", :link => "@href"
    result :link
  }
  result :items
end

scraper_sw = Scraper.define do # this is the subweb
  array :subitems

  process "div#main", :subitems => Scraper.define {
    process "div#main>h1>h2", :titleweb => :text
    result :titleweb
  }
  result :subitems
end

uri = URI.parse(URI.encode(web))

scraper.scrape(uri).each do |pag|
  link_subweb = uri + pag.link.to_str

  savedata_array = JPG.new(:link_web => link_subweb.to_s,
                           :source => "server-1")

  uri_sw = URI.parse(URI.encode(link_subweb.to_s))

  scraper_sw.scrape(uri_sw).each do |subpag|
    savedata_subweb_array = JPG.new(:title => subpag.titleweb)
  end
end

For some reason the title is read, but I get the following output (line 89 is scraper_sw.scrape(uri_sw).each do |subpag|):

e $stdout.sync=true;$stderr.sync=true;load($0=ARGV.shift) connect_serv1.rb
/Users/sss/web/connect_serv1.rb:93:in `block (2 levels) in <class:JPG>': undefined method `titleweb' for "Title 1 Square picture from NYC":String (NoMethodError)
    from /Users/sss/web/connect_serv1.rb:89:in `each'
    from /Users/sss/web/connect_serv1.rb:89:in `block in <class:JPG>'
    from /Users/sss/web/connect_serv1.rb:77:in `each'
    from /Users/sss/web/connect_serv1.rb:77:in `<class:JPG>'
    from /Users/sss/web/connect_serv1.rb:18:in `<top (required)>'
    from -e:1:in `load'
    from -e:1:in `<main>'

Process finished with exit code 1

I would really appreciate your help and your time.
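
For reference, the message says that the receiver of titleweb is the String "Title 1 Square picture from NYC", so each subpag yielded at line 89 appears to be a plain string rather than an object with a titleweb accessor. Why the scraper yields strings there is not confirmed by anything above. The sketch below only illustrates the symptom in plain Ruby, without the scraping library: extract_title and the Page struct are hypothetical names introduced for illustration, with Page standing in for a scraper result that does expose titleweb.

```ruby
# Hypothetical stand-in for a scraper result that exposes a titleweb accessor.
Page = Struct.new(:titleweb)

# Hypothetical helper: accepts either a struct-like result or a plain String,
# so the loop body works no matter which of the two the scraper yields.
def extract_title(subpag)
  if subpag.respond_to?(:titleweb)
    subpag.titleweb
  else
    subpag.to_s
  end
end

# Calling titleweb directly on a String raises the same kind of error as in the trace.
begin
  "Title 1 Square picture from NYC".titleweb
rescue NoMethodError => e
  error_seen = e.class
end

results = ["Title 1 Square picture from NYC", Page.new("Title 2")]
titles = results.map { |subpag| extract_title(subpag) }
puts titles.inspect
```

If subpag really is the title string, passing it straight through (:title => subpag) would be the equivalent change inside the inner loop. Printing subpag.class there is a quick way to check that assumption.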
