python - Why does the following XPath return an empty list?

Question

I am trying to extract some text and links from instapaper.com, so I am using the following code:

>>> import lxml.html as lh
>>> doc = lh.parse("http://www.instapaper.com/u/folder/1227370/programming")
>>> text = doc.xpath(".//*[@id='bookmark_list']/*/div[3]/a/text()")
>>> len(text)
0
>>> text
[]

As you can see, it returns an empty list, which means it could not find any text matching the XPath above.

Now, when I use the same XPath expression in Firebug/FirePath, it works fine.

[Screenshot: FirePath results for the XPath expression]

In the screenshot above you can see 40 matching nodes.

So my question is: why does the XPath expression above not work in Python/lxml?

Instapaper page source, as requested

score 5 · Accepted Answer

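There is no element with the ID bookmark_list in the page that lxml downloads. Maybe you have to be logged in: the browser sends your session cookies, while lh.parse fetches the URL anonymously and may get a different page.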
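
One way to check this yourself is to count the elements carrying that ID in the HTML your script actually receives. The following is only a sketch: the two HTML strings are made-up stand-ins for a logged-out and a logged-in response (not the real Instapaper markup), and count_by_id is a hypothetical helper name.

```python
import lxml.html as lh

def count_by_id(html_text, element_id):
    """Return how many elements in html_text carry the given id attribute."""
    doc = lh.fromstring(html_text)
    # $eid is an XPath variable; lxml fills it from the keyword argument.
    return len(doc.xpath("//*[@id=$eid]", eid=element_id))

# Hypothetical stand-ins for what an anonymous client and a logged-in
# browser might receive; the real pages will look different.
LOGGED_OUT = (
    "<html><body><form id='login_form'>"
    "<input name='username'/></form></body></html>"
)
LOGGED_IN = (
    "<html><body><div id='bookmark_list'>"
    "<div>item</div></div></body></html>"
)

logged_out_count = count_by_id(LOGGED_OUT, "bookmark_list")
logged_in_count = count_by_id(LOGGED_IN, "bookmark_list")
print(logged_out_count, logged_in_count)  # 0 1
```

If the count is 0 for the document you fetched, the XPath is fine and the problem is the page content, not lxml.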

Edit

Parsing the real HTML (the page source you posted) works:

doc = lh.parse("http://pastebin.com/raw.php?i=1WpFAfCt")
text = doc.xpath("//*[@id='bookmark_list']/*/div[3]/a/text()")
len(text) # => 40
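
Since you also want the links, the same path with @href instead of text() should give you the URLs. Here is a small offline sketch; the PAGE string is an invented fixture that only mimics the assumed structure (rows under bookmark_list whose third div holds the link), so adjust it to the real markup.

```python
import lxml.html as lh

# Invented fixture: two rows, each with the link in its third <div>.
PAGE = """
<html><body>
<div id="bookmark_list">
  <div class="row">
    <div>star</div>
    <div>controls</div>
    <div><a href="http://example.com/one">First article</a></div>
  </div>
  <div class="row">
    <div>star</div>
    <div>controls</div>
    <div><a href="http://example.com/two">Second article</a></div>
  </div>
</div>
</body></html>
""".strip()

doc = lh.fromstring(PAGE)

# Same expression as above for the link text ...
titles = doc.xpath("//*[@id='bookmark_list']/*/div[3]/a/text()")
# ... and the attribute axis for the link targets.
links = doc.xpath("//*[@id='bookmark_list']/*/div[3]/a/@href")

# Pair each title with its URL.
pairs = list(zip(titles, links))
for title, url in pairs:
    print(title, url)
```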
