I'm new to Python and have searched Stack Overflow in vain for an answer I can understand. Thanks in advance for any help or advice.

I'm trying to scrape price and location information from a house-sale website, i.e. the information marked with the "field-content" class.

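The problem is that the page has many "field-content" elements, and the crude code I'm trying pulls out and prints a seemingly random selection of them.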

Thanks in advance for your help.

This is what I want to scrape:

<div class="view-content">
<div class="views-row views-row-1 views-row-odd views-row-first views-row-last">
        <div class="views-field views-field-field-summary">        
<div class="field-content">
Land for sale in Prestatyn, Flintshire. Three acres of land with outline planning permission for three large, 4 bedroomed detached houses.
</div> 
 </div>  
         <div class="views-field views-field-field-price">    
<span class="views-label views-label-field-price">PRICE: </span>   
 <span class="field-content">£297,500</span>  
</div>  

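This is my basic attempt at getting it to return the price. I haven't got very far yet; scraping more than the price and saving the results to a ScraperWiki table is still a long way off!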

#!/usr/bin/env python

from lxml import html
import requests

page = requests.get('http://www.plotfinder.net/plot/plot-jaslin')
tree = html.fromstring(page.content)

Type1 = tree.xpath('//span[@class="views-label views-label-field-price"]/text()')
price = tree.xpath('//span[@class="field-content"]/text()')

print('Type1:', Type1)
print('price:', price)
1 Answer

You can try this:

from lxml import html
import requests

page = requests.get('http://www.plotfinder.net/plot/plot-jaslin')
tree = html.fromstring(page.content)

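# Label span: its class attribute contains "field-price".
Type1 = tree.xpath('//span[contains(@class,"field-price")]/text()')
# Price: the first "field-content" span that follows the label span.
price = tree.xpath('//span[contains(@class,"field-price")]/following-sibling::span[contains(@class,"field-content")][1]/text()')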


print('Type1:', Type1)
print('price:', price)

Hope this gets you the result you want.
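
To see why the scoped XPath helps, here is a minimal offline sketch that runs against a trimmed copy of the markup from the question instead of the live page. The `SNIPPET` string and the variable names are assumptions made for illustration (the closing tags were added so the fragment is complete), and the real page may be structured differently by now.

```python
from lxml import html

# Trimmed copy of the markup quoted in the question. The closing tags are
# added here as an assumption so that the fragment is complete.
SNIPPET = u"""
<div class="view-content">
  <div class="views-row views-row-1 views-row-odd views-row-first views-row-last">
    <div class="views-field views-field-field-summary">
      <div class="field-content">Land for sale in Prestatyn, Flintshire.</div>
    </div>
    <div class="views-field views-field-field-price">
      <span class="views-label views-label-field-price">PRICE: </span>
      <span class="field-content">£297,500</span>
    </div>
  </div>
</div>
"""

tree = html.fromstring(SNIPPET)

# Broad query: every element whose class is exactly "field-content",
# whatever its tag or parent. On a full page this matches many fields.
all_field_content = tree.xpath('//*[@class="field-content"]')

# Scoped query: only the "field-content" span inside the price container.
price = tree.xpath(
    '//div[contains(@class, "views-field-field-price")]'
    '/span[@class="field-content"]/text()'
)

# Sibling-based query, as in the answer above: start at the label span
# and take the first "field-content" span that follows it.
sibling_price = tree.xpath(
    '//span[contains(@class, "field-price")]'
    '/following-sibling::span[contains(@class, "field-content")][1]/text()'
)

print(len(all_field_content))
print(price)
print(sibling_price)
```

Both scoped queries pick out the single price value, while the broad one also matches the summary block. Anchoring on the container `div` is an alternative to the sibling axis and may be easier to adapt to other fields such as the summary.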

answered 2015-12-04T16:22:14.897