I am trying to organize a list of links and names based on a tag that sits outside the group containing those links and names. The markup is set up like so:

<h4>Volkswagen</h4>
<ul>
   <li><a href="http://beetle.cars.com">beetle</a></li>
</ul>

<h4>Chevy</h4>
<ul>
  <li><a href="http://volt.cars.com">Volt / Electric</a></li>
</ul>

What I need is the result in the following format. Eventually the name should be a link, but I can do that later once the items are organized properly.

Each car brand can have a varying number of models, and I need to group the models by brand:

Volkswagen
   Beetle Link  Beetle
   Jetta Link   Jetta

Chevy
   Volt Link  Volt / Electric
   S10 Link  S10

I can get the list of brands with no problem. I am just having a hard time associating each batch of models with its brand: the <h4> tags aren't nested, so I don't know how to associate each one with the <ul> list of cars that follows it.

1 Answer

I prefer to dive straight down to each car, then back out to pick up the car's brand:

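require 'nokogiri'

# Assumes doc is the parsed page, e.g. doc = Nokogiri::HTML(html)
cars = Hash.new { |h, k| h[k] = [] }

doc.xpath('//ul/li/a').each do |car|
  # Climb from the <a> to its <ul>, then take the nearest preceding <h4> sibling.
  brand = car.at('../../preceding-sibling::h4[1]').text
  cars[brand] << {link: car['href'], name: car.text}
end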

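Note that the hash is initialized with a block that makes the default value for a missing key a new array. This allows appending to a hash entry (via <<) as shown. The XPath ../../preceding-sibling::h4[1] says: go back up to the ul level and look back to the first preceding h4. That is the corresponding brand for the car.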
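To illustrate why the block form matters, here is a minimal sketch in plain Ruby with no HTML parsing involved. The variable name shared and the literal strings are only for illustration. It contrasts the block form with Hash.new([]), which hands back one shared array and never stores it under a key:

```ruby
# Block form: the block runs for each missing key and stores a fresh array there.
cars = Hash.new { |hash, key| hash[key] = [] }
cars["Volkswagen"] << "beetle"
cars["Chevy"] << "Volt / Electric"

# Pitfall for comparison: a single default object is shared by every missing key
# and is never assigned into the hash, so the hash itself stays empty.
shared = Hash.new([])
shared["Volkswagen"] << "beetle"
shared["Chevy"] << "Volt / Electric"

p cars          # each brand has its own array
p shared        # no keys were ever stored
p shared["x"]   # both strings ended up in the one shared default array
```

With the block form each brand gets its own list, which is what makes the single-loop approach work.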

Output:

{"Volkswagen"=>[
                {:link=>"http://beetle.cars.com", :name=>"beetle"}
                # others here
               ],
 "Chevy"=>[
           {:link=>"http://volt.cars.com", :name=>"Volt / Electric"}
           # others here
          ]
}
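As a rough sketch of how this hash could be turned into the brand-by-brand layout the question asks for: format_cars is a hypothetical helper name, the sample data is copied from the output above (only the entries shown there), and the exact spacing is an assumption.

```ruby
# Sample data copied from the output above; a real run would use the hash built by the loop.
cars = {
  "Volkswagen" => [{ link: "http://beetle.cars.com", name: "beetle" }],
  "Chevy"      => [{ link: "http://volt.cars.com", name: "Volt / Electric" }]
}

# Hypothetical helper: one block of text per brand, with its models indented beneath it.
def format_cars(cars)
  cars.map do |brand, models|
    lines = models.map { |model| "   #{model[:link]}  #{model[:name]}" }
    ([brand] + lines).join("\n")
  end.join("\n\n")
end

puts format_cars(cars)
```

Because Ruby hashes preserve insertion order, the brands print in the order they were first encountered in the document.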

I find this technique nice and simple, with only one loop. Not everyone likes this style, though.

Answered 2013-05-29T02:08:24.367