I have the following code, which works well:
rows = diary_HTML.xpath('//*[@id="main"]/div[2]/table/tbody/tr')
food_diary = rows.collect do |row|
  detail = {}
  [
    ["Food", 'td[1]/text()'],
    ["Calories", 'td[2]/text()'],
    ["Carbs", 'td[3]/text()'],
    ["Fat", 'td[4]/text()'],
    ["Protein", 'td[5]/text()'],
    ["Cholest", 'td[6]/text()'],
  ].each do |name, xpath|
    detail[name] = row.at_xpath(xpath).to_s.strip
  end
  detail
end
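However, the "Food" td sometimes contains not just plain text but a link, and I want the link's text as well.
I know I can use 'td[1]/a/text()' to get the link text, but how do I express an either/or, something like:
'td[1]/a/text()' or 'td[1]/text()'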
Edited: added an HTML snippet.
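I am trying to include the first cell of the <tr class="meal_header"> row (
<td class="first alt">Breakfast</td>
) together with all the other rows and their regular tds, while leaving out td[1] of the bottom row.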
<tr class="meal_header">
<td class="first alt">Breakfast</td>
<td class="alt">Calories</td>
<td class="alt">Carbs</td>
<td class="alt">Fat</td>
<td class="alt">Protein</td>
<td class="alt">Sodium</td>
<td class="alt">Sugar</td>
</tr>
<tr>
<td class="first alt">
<a onclick="showEditFood(3992385560);" href="#">Hovis (Uk - White Bread (40g) Toasted With Flora Light Marg, 2 slice</a> </td>
<td>262</td>
<td>36</td>
<td>9</td>
<td>7</td>
<td>0</td>
<td>3</td>
</tr>
<tr class="bottom">
<td class="first alt" style="z-index: 10">
<a href="/food/add_to_diary?meal=0" class="add_food">Add Food</a>
<div class="quick_tools">
<a href="#quick_tools_0" class="toggle_diary_options">Quick Tools</a>
<div id="quick_tools_0" class="quick_tools_options hidden">
<ul>
<li><a onclick="showLightbox(200, 250, '/food/quick_add?meal=0&date=2013-04-15'); return false;">Quick add calories</a></li>
<li><a href="/meal/new?meal=0">Remember meal</a></li>
<li><a href="/food/copy_meal?date=2013-04-15&from_date=2013-04-14&meal=0&username=nickwild1">Copy yesterday</a></li>
<li><a href="#recent_meals_0" class="toggle_diary_options">Copy from date</a></li>
<li><a href="#recent_meals_copy_to_0" class="toggle_diary_options">Copy to date</a></li>
</ul>
</div>
<div id="recent_meals_0" class="recent_meal_options hidden">
<ul id="recent_meal_options_0">
<li class="header">Copy from which date?</li>
<li><a href="/food/copy_meal?date=2013-04-15&from_date=2013-04-14&meal=0&username=nickwild1">Sunday, April 14</a></li>
<li><a href="/food/copy_meal?date=2013-04-15&from_date=2013-04-13&meal=0&username=nickwild1">Saturday, April 13</a></li>
</ul>
</div>
</div>
</td>
<td>285</td>
<td>39</td>
<td>9</td>
<td>10</td>
<td>0</td>
<td>3</td>
<td></td>