python - Storing the result of an XPath query in a variable for further queries

Question

I am using Scrapy to crawl some websites. I am new to both Scrapy and XPath. This question is about XPath.

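As the title says, I want to store the selected nodes in a variable and then query further, but only within that variable, not against the whole HTML document. Let me explain what happens.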

Let the sample HTML page be:

<sample>
    <tag attribute="I NEED THIS">
        <common1>
            Area to be processed first 
        </common1>
        <common2>
            Area to be processed later
        </common2>  
    </tag>  

    <tag attribute="I DON'T NEED THIS">  
        <common1>
            Not interested in this part    
        </common1>
        <common2>
            Again not interested here
        </common2>
    </tag>
</sample>

Now I want to process the tag element whose attribute is "I NEED THIS".

So I do this:

hxs = HtmlXPathSelector(response)

needed = hxs.select('//sample/tag[@attribute="I NEED THIS"]')

Later, when I do the following:

common1 = needed.select('//common1')

common1 ends up containing both common1 elements present in the whole document, not just the one inside needed. I need some help here.

score 3 · Accepted Answer

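You need to use a relative XPath. A path that begins with // is evaluated from the document root even when it is called on a nested selector, so prefix it with a dot: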

.//common1

See "Working with relative XPaths" in the Scrapy documentation.
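
To see the scoping effect without running a crawl, here is a minimal sketch that uses the standard library's xml.etree.ElementTree as a stand-in for Scrapy selectors (an assumption made only so the example is self-contained; ElementTree supports just a subset of XPath, and the sample text is trimmed for brevity). The names SAMPLE, root, scoped and everything are illustrative. In the question's own code the equivalent call would be needed.select('.//common1').

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the question.
SAMPLE = """
<sample>
    <tag attribute="I NEED THIS">
        <common1>Area to be processed first</common1>
        <common2>Area to be processed later</common2>
    </tag>
    <tag attribute="I DON'T NEED THIS">
        <common1>Not interested in this part</common1>
        <common2>Again not interested here</common2>
    </tag>
</sample>
"""

root = ET.fromstring(SAMPLE)  # root is the <sample> element

# Store the selected node in a variable for later queries.
needed = root.find('tag[@attribute="I NEED THIS"]')

# The leading dot anchors the search at `needed`, so only its
# descendants are considered.
scoped = [el.text.strip() for el in needed.findall('.//common1')]

# The same path anchored at the document root matches every <common1>.
everything = [el.text.strip() for el in root.findall('.//common1')]

print(scoped)
print(everything)
```

The first list holds only the common1 text inside the needed tag, while the second holds the common1 text from both tag elements, which is the result the question was getting by accident.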

Answered 2013-10-11T09:14:05.413
