I have the following XML:
parsed <-
<div class="Matches">
<div class="Match">
<div class="MatchType">Singles Match</div>
<div class="MatchResults">
<a href="?id=2&nr=11408&name=Jason+Jordan">Jason Jordan</a> (w/<a href="?id=2&nr=2250&name=Seth+Rollins">Seth Rollins</a>) defeats <a href="?id=2&nr=257&name=Cesaro">Cesaro</a> (w/<a href="?id=2&nr=2641&name=Sheamus">Sheamus</a>) (13:15)</div>
</div>
<div class="Match">
<div class="MatchRecommended">[<span class="TextHighlight"><a href="?id=111&nr=9099">Recommended, Meltzer: ***3/4, CAGEMATCH users: <span class=" Rating Color7">7.17</span></a></span>]</div>
<div class="MatchType">
<a href="?id=5&nr=16">WWE Intercontinental Title</a> Match</div>
<div class="MatchResults">
<a href="?id=2&nr=9967&name=Roman+Reigns">Roman Reigns</a> (c) defeats <a href="?id=2&nr=676&name=Samoa+Joe">Samoa Joe</a> (24:50) </div>
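I am trying to extract the contents of the elements with class "MatchRecommended", and return NA for every "Match" node that has no "MatchRecommended" child.
I think I need to use xpathSApply and xmlChildren to extract the relevant data, but with the code below I only get NA: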
xpathSApply(parsed, "//*[(@class = 'Match')]", function(x) ifelse(is.null(xmlChildren(x)$a), NA, xmlAttrs(xmlChildren(x)$a, 'href')))
[1] NA NA NA NA NA NA NA
Ideally, the result would look like this:
[1] NA "Recommended, Meltzer: ***3/4, CAGEMATCH users: 7.17"
Any ideas on how to do this?
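For reference, here is a minimal sketch of one direction that might work, assuming the XML package. A likely reason the attempt above returns only NA is that xmlChildren(x)$a looks for a direct child named "a", while the direct children of each "Match" node are all div elements. The sketch instead runs a relative XPath query inside each "Match" node. The `html` string is a trimmed-down, hypothetical copy of the markup, and `get_recommended` is a helper name made up for illustration.

```r
library(XML)

# Trimmed-down, hypothetical copy of the markup from the question.
html <- paste0(
  '<div class="Matches">',
  '<div class="Match">',
  '<div class="MatchType">Singles Match</div>',
  '<div class="MatchResults">Jason Jordan defeats Cesaro (13:15)</div>',
  '</div>',
  '<div class="Match">',
  '<div class="MatchRecommended">[<span class="TextHighlight">',
  '<a href="?id=111">Recommended, Meltzer: ***3/4, CAGEMATCH users: ',
  '<span class=" Rating Color7">7.17</span></a></span>]</div>',
  '<div class="MatchType">WWE Intercontinental Title Match</div>',
  '<div class="MatchResults">Roman Reigns (c) defeats Samoa Joe (24:50)</div>',
  '</div>',
  '</div>'
)
parsed <- htmlParse(html, asText = TRUE)

# Hypothetical helper: text of the link inside the MatchRecommended div,
# or NA when the Match node has no such child.
get_recommended <- function(match_node) {
  # "./" keeps the query relative to this Match node
  rec <- xpathSApply(match_node, "./div[@class='MatchRecommended']//a", xmlValue)
  if (length(rec) == 0) NA_character_ else rec[[1]]
}

# One result per Match node, in document order
result <- sapply(getNodeSet(parsed, "//div[@class='Match']"), get_recommended)
print(result)
```

Querying the link rather than the div itself leaves out the surrounding square brackets, and xmlValue concatenates the text of the nested span, so the rating stays part of the string.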