I have this query, which extracts the posts that have been "liked" more than 5 times.
//div[@class="pin"]
[.//span[@class = "LikesCount"]
[substring-before(normalize-space(text())," ") > 5]]
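I would like to extract and store additional information for each post, such as the title, the img url, the like count, the repin count, ...
How can I extract all of them?
- Multiple XPath queries?
- Digging into the nodes of the resulting posts while iterating with PHP and PHP functions?
- ...
Here is a sample of the markup: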
<div class="pin">
<p class="description">gorgeous couch <a href="#">#modern</a></p>
[...]
<div class="PinHolder">
<a href="/pin/56787645270909880/" class="PinImage ImgLink">
<img src="http://media-cache-ec3.pinterest.com/upload/56787645270909880_d7AaHYHA_b.jpg"
alt="Krizia"
data-componenttype="MODAL_PIN"
class="PinImageImg"
style="height: 288px;">
</a>
</div>
<p class="stats colorless">
<span class="LikesCount">
22 likes
</span>
<span class="RepinsCount">
6 repins
</span>
</p>
[...]
</div>
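To make the second option concrete, here is a minimal sketch of iterating over the matched posts in PHP and running relative XPath expressions against each one. It assumes the page is parsed with `DOMDocument` and `DOMXPath`, that the `description` paragraph is what counts as the title, and that the array keys (`title`, `img`, `likes`, `repins`) are just illustrative names. The `[...]` parts of the sample markup are left out.

```php
<?php
// Trimmed copy of the sample markup (elided parts omitted).
$html = <<<'HTML'
<div class="pin">
  <p class="description">gorgeous couch <a href="#">#modern</a></p>
  <div class="PinHolder">
    <a href="/pin/56787645270909880/" class="PinImage ImgLink">
      <img src="http://media-cache-ec3.pinterest.com/upload/56787645270909880_d7AaHYHA_b.jpg"
           alt="Krizia" class="PinImageImg">
    </a>
  </div>
  <p class="stats colorless">
    <span class="LikesCount"> 22 likes </span>
    <span class="RepinsCount"> 6 repins </span>
  </p>
</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // real-world HTML is rarely valid; silence parser warnings
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Same filter as in the question, with the outer predicate closed.
$query = '//div[@class="pin"]'
       . '[.//span[@class="LikesCount"]'
       . '[substring-before(normalize-space(text()), " ") > 5]]';

$pins = [];
foreach ($xpath->query($query) as $pin) {
    // Each expression starts with "." so it is evaluated relative to the current pin.
    $pins[] = [
        // Assumption: the description text serves as the title.
        'title'  => $xpath->evaluate('normalize-space(.//p[@class="description"])', $pin),
        'img'    => $xpath->evaluate('string(.//img[@class="PinImageImg"]/@src)', $pin),
        'likes'  => (int) $xpath->evaluate(
            'substring-before(normalize-space(.//span[@class="LikesCount"]), " ")', $pin),
        'repins' => (int) $xpath->evaluate(
            'substring-before(normalize-space(.//span[@class="RepinsCount"]), " ")', $pin),
    ];
}

print_r($pins);
```

This keeps a single filtering query and moves the per-field extraction into the loop. Whether that beats several independent XPath queries probably depends on how many fields are needed, since separate queries would have to be matched back to the same post afterwards.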