My application connects to a web page, reads the links on it, and scrapes the page behind each link.
Main WEB_PAGE --> LINK (sub_web) ---> scrape info (title)
scraper = Scraper.define do
  array :items
  process "div.mozaique>div", :items => Scraper.define {
    process "div.thumb>a", :link => "@href"
    result :link
  }
  result :items
end
scraper_sw = Scraper.define do # this is the subweb
  array :subitems
  process "div#main", :subitems => Scraper.define {
    process "div#main>h1>h2", :titleweb => :text
    result :titleweb
  }
  result :subitems
end
uri = URI.parse(URI.encode(web))
scraper.scrape(uri).each do |pag|
  link_subweb = uri + pag.link.to_str
  savedata_array = JPG.new(:link_web => link_subweb.to_s,
                           :source => "server-1")
  uri_sw = URI.parse(URI.encode(link_subweb.to_s))
  scraper_sw.scrape(uri_sw).each do |subpag|
    savedata_subweb_array = JPG.new(:title => subpag.titleweb)
  end
end
For some reason the title is read, but I get the output below (line 89 is scraper_sw.scrape(uri_sw).each do |subpag| ):
e $stdout.sync=true;$stderr.sync=true;load($0=ARGV.shift) connect_serv1.rb
/Users/sss/web/connect_serv1.rb:93:in `block (2 levels) in <class:JPG>': undefined method `titleweb' for "Title 1 Square picture from NYC":String (NoMethodError)
from /Users/sss/web/connect_serv1.rb:89:in `each'
from /Users/sss/web/connect_serv1.rb:89:in `block in <class:JPG>'
from /Users/sss/web/connect_serv1.rb:77:in `each'
from /Users/sss/web/connect_serv1.rb:77:in `<class:JPG>'
from /Users/sss/web/connect_serv1.rb:18:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'
Process finished with exit code 1
I would greatly appreciate your help. Thank you for your time.
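One observation from the trace: the receiver in the NoMethodError is the string "Title 1 Square picture from NYC", which suggests that subpag is already the title text rather than an object with a titleweb accessor. The sketch below is plain Ruby and does not use the scraping library at all. The subitems array, the Page struct, and the title_of helper are hypothetical stand-ins, used only to show one way the block could tolerate either shape.

```ruby
# Hypothetical stand-in for what the inner scraper appears to yield,
# judging only from the error message: plain Strings.
subitems = ["Title 1 Square picture from NYC"]

# Hypothetical struct-shaped result, for comparison.
Page = Struct.new(:titleweb)

# Hypothetical helper (not part of any scraping library):
# use the accessor when the item has one, otherwise treat
# the item itself as the title.
def title_of(item)
  item.respond_to?(:titleweb) ? item.titleweb : item.to_s
end

titles = subitems.map { |subpag| title_of(subpag) }
struct_title = title_of(Page.new("Another title"))

puts titles.inspect
puts struct_title
```

If subpag really is a String in the original loop, passing subpag itself (instead of subpag.titleweb) to JPG.new would be the simpler change; the respond_to? guard is only useful when both shapes can occur.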