Here is the reason:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
url = "http://www.maxpreps.com/high-schools/fitzgerald-hurricanes-(fitzgerald,ga)/football/schedule.htm"
doc = Nokogiri::HTML(URI.open(url)) # Kernel#open no longer opens URLs on Ruby 3.0+
games = doc.css('.dual-contest')
games.each do |game|
  puts gamedate = game.css(".event-date").xpath('@title').empty?
end
# >> true
# >> false
# >> false
# >> false
# >> false
# >> false
# >> false
# >> false
# >> false
# >> false
# >> false
Looked at another way: the first row has no .event-date cell, so the lookup returns nil:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
url = "http://www.maxpreps.com/high-schools/fitzgerald-hurricanes-(fitzgerald,ga)/football/schedule.htm"
doc = Nokogiri::HTML(URI.open(url)) # Kernel#open no longer opens URLs on Ruby 3.0+
games = doc.at_css('.dual-contest').at_css(".event-date").at_xpath('@title')
puts games
# ~> -:6:in `<main>': undefined method `at_xpath' for nil:NilClass (NoMethodError)
I would do it like this:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
require 'date' # for DateTime.parse
url = "http://www.maxpreps.com/high-schools/fitzgerald-hurricanes-(fitzgerald,ga)/football/schedule.htm"
doc = Nokogiri::HTML(URI.open(url)) # Kernel#open no longer opens URLs on Ruby 3.0+
doc.css('#schedule .event-date').each do |nd|
  dt = nd['title']
  p dt, DateTime.parse(dt)
end
# >> "2013-08-24T02:30:00"
# >> #<DateTime: 2013-08-24T02:30:00+00:00 ((2456529j,9000s,0n),+0s,2299161j)>
# >> "2013-09-07T02:30:00"
# >> #<DateTime: 2013-09-07T02:30:00+00:00 ((2456543j,9000s,0n),+0s,2299161j)>
# >> "2013-09-14T02:30:00"
# >> #<DateTime: 2013-09-14T02:30:00+00:00 ((2456550j,9000s,0n),+0s,2299161j)>
# >> "2013-09-21T02:30:00"
# >> #<DateTime: 2013-09-21T02:30:00+00:00 ((2456557j,9000s,0n),+0s,2299161j)>
# >> "2013-09-28T02:30:00"
# >> #<DateTime: 2013-09-28T02:30:00+00:00 ((2456564j,9000s,0n),+0s,2299161j)>
# >> "2013-10-05T02:30:00"
# >> #<DateTime: 2013-10-05T02:30:00+00:00 ((2456571j,9000s,0n),+0s,2299161j)>
# >> "2013-10-12T02:30:00"
# >> #<DateTime: 2013-10-12T02:30:00+00:00 ((2456578j,9000s,0n),+0s,2299161j)>
# >> "2013-10-19T02:30:00"
# >> #<DateTime: 2013-10-19T02:30:00+00:00 ((2456585j,9000s,0n),+0s,2299161j)>
# >> "2013-10-26T02:30:00"
# >> #<DateTime: 2013-10-26T02:30:00+00:00 ((2456592j,9000s,0n),+0s,2299161j)>
# >> "2013-11-02T02:30:00"
# >> #<DateTime: 2013-11-02T02:30:00+00:00 ((2456599j,9000s,0n),+0s,2299161j)>
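If some .event-date cell ever lacks a title attribute, DateTime.parse would be handed nil and raise. A small guard avoids that. The helper name parse_event_date is hypothetical, introduced here only for illustration, and the sample titles are made up to match the format shown above.

```ruby
require 'date'

# Hypothetical helper: returns a DateTime, or nil when the title is
# missing, blank, or not parseable as a date.
def parse_event_date(title)
  return nil if title.nil? || title.strip.empty?
  DateTime.parse(title)
rescue ArgumentError # Date::Error is a subclass of ArgumentError
  nil
end

# Sample values in the same format as the title attributes above.
titles = [nil, "", "2013-08-24T02:30:00"]
dates = titles.map { |t| parse_event_date(t) }

p dates
```

In the loop above this would be called as parse_event_date(nd['title']), keeping only the non-nil results.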