
I am trying to scrape the following website:

   http://www.crowdrise.com/waterforpeople-SE

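If you look at the site, on the right-hand side, just above the black "Fundraise for this campaign" button, there is a statement that reads: 52% Raised of $20,000 Goal. That statement is what I am trying to extract.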

For the XPath expression, I tried:

  .//*[@id="thebody"]/div[6]/div/div/div[2]/div[2]/div[2]/div/p/span

But it did not work...

What is the correct XPath expression?

Thank you,


1 Answer


Try this:

> library(XML)
> doc <- htmlTreeParse('http://www.crowdrise.com/waterforpeople-SE', useInternalNodes = TRUE)
> xpathApply(doc, '//div[@class="grid1-4"]//p[@class="progressText"]')
[[1]]
<p class="progressText">
  <span>52% Raised of $20,000 Goal</span>
</p> 

attr(,"class")
[1] "XMLNodeSet"

Or, to get the text value directly:

> xpathApply(doc, '//div[@class="grid1-4"]//p[@class="progressText"]', xmlValue)
[[1]]
[1] "52% Raised of $20,000 Goal"
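
If you need the percentage and the goal as numbers rather than as one string, something like the following might work. This is only a sketch: the `html` string is a hypothetical stand-in for the relevant part of the page (so the example runs without network access), and the regular expressions assume the text always has the form "NN% Raised of $N,NNN Goal". It also uses `xpathSApply`, which simplifies the result to a character vector instead of a list.

```r
library(XML)

# Hypothetical stand-in for the relevant fragment of the page (assumption),
# built from the node shown in the output above.
html <- '<html><body>
  <div class="grid1-4">
    <p class="progressText"><span>52% Raised of $20,000 Goal</span></p>
  </div>
</body></html>'

# Parse the string as HTML into an internal document.
doc <- htmlParse(html, asText = TRUE)

# Same class-based XPath as above; xmlValue returns the node's text content.
txt <- xpathSApply(doc, '//div[@class="grid1-4"]//p[@class="progressText"]', xmlValue)
txt <- trimws(txt)

# Everything before the first "%" is the percentage raised.
pct <- as.numeric(sub("%.*$", "", txt))

# The digits and commas after "$" are the goal; drop the commas before converting.
goal <- as.numeric(gsub(",", "", sub("^.*\\$([0-9,]+).*$", "\\1", txt)))
```

Selecting by class attributes tends to be more robust than a long positional path such as `div[6]/div/div/...`, which can break whenever the page layout shifts or when the HTML that R downloads differs from what the browser renders.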
Answered 2013-10-19T20:33:48.793