
Apologies in advance. I'm sure this is simple, but I can't figure out what I'm doing wrong.

Among other things, this code..

study.name <- 'NLSY79'
library(XML)
library(httr)
sub.study <- paste0( "https://www.nlsinfo.org/investigator/servlet1?get=SUBSTUDIES&study=" , study.name )
study.html <- GET( sub.study )
content( study.html )
study.block <- htmlParse( study.html , asText = TRUE )

..gives me..

$children$html
<html>
 <body>
  <p>
   false
   <select id="thesubstudies" onchange="onSubstudyChanged(this);">
    <option value="-1" selected="selected">(Choose One)</option>
    <option value="343.06">NLSY79 (1979-2010)</option>
   </select>
  </p>
 </body>
</html>

I just want a quick (automated) way to extract the "343.06".

Thanks!


2 Answers


You can use xpathSApply to extract the elements you want:

xpathSApply(study.block, "//option")
# [[1]]
# <option value="-1" selected="selected">(Choose One)</option> 
# [[2]]
# <option value="343.06">NLSY79 (1979-2010)</option> 

and apply a function to them (xmlValue or xmlAttrs, depending on the context):

xpathSApply(study.block, "//option", function(u) xmlAttrs(u)["value"])
#   value    value 
#    "-1" "343.06" 
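
To see the difference between the two functions, here is a minimal offline sketch. It assumes the page body is the markup printed in the question, pasted into a string (`html.text`, `option.labels` and `option.values` are names made up for this illustration), so no request to nlsinfo.org is needed:

```r
library(XML)

# Assumption: this string mirrors the markup printed in the question;
# html.text is a made-up name used only for this illustration.
html.text <- '<html><body><p>false
<select id="thesubstudies" onchange="onSubstudyChanged(this);">
<option value="-1" selected="selected">(Choose One)</option>
<option value="343.06">NLSY79 (1979-2010)</option>
</select></p></body></html>'

doc <- htmlParse(html.text, asText = TRUE)

# xmlValue returns the text inside each <option> node (the visible label)
option.labels <- xpathSApply(doc, "//option", xmlValue)

# xmlAttrs returns every attribute of a node; keep only "value"
option.values <- xpathSApply(doc, "//option", function(u) xmlAttrs(u)["value"])
```

So xmlValue is the one to reach for when you want the label ("NLSY79 (1979-2010)"), and xmlAttrs when you want the id stored in the attribute.
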
answered 2013-09-30T17:27:58.710

You can also use xmlGetAttr:

xpathSApply(study.block, "//option", xmlGetAttr, "value")
[1] "-1"     "343.06"

Or, to skip the selected placeholder option:

xpathSApply(study.block, "//option[not(@selected)]", xmlGetAttr, "value")
[1] "343.06"
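
If the real page ever contains more than one drop-down, it may be safer to anchor the XPath on the `id` of the select element. A hedged sketch, again assuming the markup from the question pasted into a string (`html.text`, `substudy.id` and `substudy.num` are made-up names):

```r
library(XML)

# Assumption: same markup as printed in the question, held in a string
html.text <- '<html><body><p>false
<select id="thesubstudies" onchange="onSubstudyChanged(this);">
<option value="-1" selected="selected">(Choose One)</option>
<option value="343.06">NLSY79 (1979-2010)</option>
</select></p></body></html>'

doc <- htmlParse(html.text, asText = TRUE)

# Only look at options inside the select with id "thesubstudies",
# and skip the placeholder option that carries a selected attribute
substudy.id <- xpathSApply(
  doc,
  "//select[@id='thesubstudies']/option[not(@selected)]",
  xmlGetAttr, "value"
)

# xmlGetAttr returns a character string; convert only if a number is needed
substudy.num <- as.numeric(substudy.id)
```

If the id is going to be pasted back into another URL, keeping the character version avoids any change in formatting from the numeric conversion.
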
answered 2013-09-30T22:49:42.420