I'm trying to retrieve the medals table from the Wikipedia page for the United States at the 2012 Summer Olympics.
library(rvest)
library(magrittr)
url <- "https://en.wikipedia.org/wiki/United_States_at_the_2012_Summer_Olympics"
xpath0 <- '//*[@id="mw-content-text"]/table[1]'
xpath1 <- '//*[@id="mw-content-text"]/table[2]'
xpath2 <- '//*[@id="mw-content-text"]/table[2]/tbody/tr/td[1]'
xpath3 <- '//*[@id="mw-content-text"]/table[2]/tbody/tr/td[1]/table'
tb <- url %>%
  html() %>%
  html_nodes(xpath=xpath0) %>%
  html_nodes("") %>%
  html_table()
Using xpath0 or xpath1 returns an error:
Error in parse_simple_selector(stream) :
Expected selector, got <EOF at 1>
xpath2 and xpath3 return empty lists.
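To rule out network issues, here is a minimal offline sketch of the two symptoms. It assumes a made-up inline table in place of the Wikipedia page (the HTML, its ids and its cell values are hypothetical stand-ins, not taken from the article) and uses `read_html()`, the current replacement for `html()`. It suggests, without confirming anything about the live page, that the empty `html_nodes("")` step and the `tbody` element in the XPath (which browsers add to the DOM but which may be absent from the raw HTML) could be involved.

```r
library(rvest)

# Hypothetical inline page standing in for the Wikipedia article
page <- read_html('
<div id="mw-content-text">
  <table>
    <tr><th>Medal</th><th>Count</th></tr>
    <tr><td>A</td><td>1</td></tr>
    <tr><td>B</td><td>2</td></tr>
  </table>
</div>')

# Symptom 1: an empty CSS selector is rejected by the selector parser.
# The message is captured instead of stopping the script.
empty_selector_msg <- tryCatch(
  {
    html_nodes(page, "")
    NA_character_                      # reached only if no error is raised
  },
  error = function(e) conditionMessage(e)
)
print(empty_selector_msg)

# Symptom 2: an XPath that goes through tbody matches nothing here,
# because this source contains no tbody element.
with_tbody <- html_nodes(
  page,
  xpath = '//*[@id="mw-content-text"]/table[1]/tbody/tr/td[1]'
)
print(length(with_tbody))

# Variant without tbody and without the empty selector step
tb <- page %>%
  html_nodes(xpath = '//*[@id="mw-content-text"]/table[1]') %>%
  html_table()
print(tb)
```

If the same pattern holds on the real page, dropping the `html_nodes("")` line and the `tbody` step from the XPath would be the first things to try.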
At the same time, I tried to use SelectorGadget (https://cran.r-project.org/web/packages/rvest/vignettes/selectorgadget.html) to point to the exact element. It gave me
//td[(((count(preceding-sibling::*) + 1) = 1) and parent::*)] | //*[contains(concat( " ", @class, " " ), concat( " ", "headerSortDown", " " ))]
and the error
Error in parse_simple_selector(stream) : Expected selector, got
I would really appreciate any help.
Joa