python - xpath: how to write a conditional xpath?

Question

I am trying to extract price information from the following two pages:

http://jujumarts.com/mobiles-accessories-smartphones-wildfire-sdarkgrey-p-551.html
http://jujumarts.com/computers-accessories-transcend-500gb-portable-storejet-25d2-p-2616.html

xpath1 = //span[@class='productSpecialPrice']//text()
xpath2 = //div[@class='proDetPrice']//text()

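So far I have written Python code that returns the result of xpath1 if it matches and otherwise falls back to xpath2. I have a feeling this logic can be expressed in XPath alone. Can someone tell me how?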

score 4 · Accepted Answer

Use | to form a union:

xpath3 = "//span[@class='productSpecialPrice']//text()|//div[@class='proDetPrice']//text()"

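This isn't exactly what you asked for, but I think it can be worked into a viable solution.

From the XPath (version 1.0) specification:

The | operator computes the union of its operands, which must be node-sets.

For example,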

import lxml.html as LH

urls = [
    'http://jujumarts.com/mobiles-accessories-smartphones-wildfire-sdarkgrey-p-551.html',
    'http://jujumarts.com/computers-accessories-transcend-500gb-portable-storejet-25d2-p-2616.html'
    ]

xpaths = [
    "//span[@class='productSpecialPrice']//text()",
    "//div[@class='proDetPrice']//text()",
    "//span[@class='productSpecialPrice']//text()|//div[@class='proDetPrice']//text()"
    ]
for url in urls:
    doc = LH.parse(url)
    for xpath in xpaths:
        print(doc.xpath(xpath))
    print()  # blank line between pages (a bare `print` only does this in Python 2)

yields

['Rs.11,800.00']
['Rs.13,299.00', 'Rs.11,800.00']
['Rs.13,299.00', 'Rs.11,800.00']

[]
['Rs.7,000.00']
['Rs.7,000.00']
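
The third result for the first page has two entries rather than three because a union is a node-set: a text node matched by both operands appears only once. Here is a minimal sketch of that behaviour. The page markup is not shown above, so the HTML fragment is hypothetical and assumes the special-price span is nested inside the price div.

```python
import lxml.html as LH

# Hypothetical markup: the special-price span sits inside the price div.
doc = LH.fromstring(
    "<html><body>"
    "<div class='proDetPrice'><s>Rs.13,299.00</s>"
    "<span class='productSpecialPrice'>Rs.11,800.00</span></div>"
    "</body></html>"
)

span_texts = doc.xpath("//span[@class='productSpecialPrice']//text()")
div_texts = doc.xpath("//div[@class='proDetPrice']//text()")
union_texts = doc.xpath(
    "//span[@class='productSpecialPrice']//text()"
    "|//div[@class='proDetPrice']//text()"
)

# The span's text node is matched by both operands but is kept once in the union.
print(len(span_texts), len(div_texts), len(union_texts))
```

So on a page that has a special price, the plain union still returns both prices. Picking only the special price needs an extra condition or a check in Python.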

Another way to get the same information is

"//*[@class='productSpecialPrice' or @class='proDetPrice']//text()"
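
If the goal is a true fallback (the special price when it exists, otherwise the regular price), one XPath 1.0 idiom is to guard the second operand with not(...), so that it only selects nodes when the first operand matches nothing. The following is a sketch under assumptions and not part of the original answer: the HTML fragments and the helper name extract_price are hypothetical stand-ins for the two product pages.

```python
import lxml.html as LH

# Hypothetical fragments standing in for the two product pages.
PAGE_WITH_SPECIAL = (
    "<html><body>"
    "<div class='proDetPrice'><s>Rs.13,299.00</s>"
    "<span class='productSpecialPrice'>Rs.11,800.00</span></div>"
    "</body></html>"
)
PAGE_WITHOUT_SPECIAL = (
    "<html><body><div class='proDetPrice'>Rs.7,000.00</div></body></html>"
)

# First operand: the special price.
# Second operand: the regular price, selected only when no special-price
# span exists anywhere in the document.
CONDITIONAL_XPATH = (
    "//span[@class='productSpecialPrice']//text()"
    " | "
    "//div[@class='proDetPrice']"
    "[not(//span[@class='productSpecialPrice'])]//text()"
)


def extract_price(html):
    """Return the non-empty text nodes selected by CONDITIONAL_XPATH."""
    doc = LH.fromstring(html)
    return [t.strip() for t in doc.xpath(CONDITIONAL_XPATH) if t.strip()]


print(extract_price(PAGE_WITH_SPECIAL))
print(extract_price(PAGE_WITHOUT_SPECIAL))
```

The guard is evaluated against the whole document, so this only fits pages that carry a single product price block.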
