0

这是我创建的示例代码。我已经能够创建一个新链接,但我很困惑如何跟随链接并从跟随的链接中抓取数据。

library(tidyverse)
library(rvest)
library(xml2)

url<-"https://www.indeed.com/jobs?q=data%20analyst&l=San%20Diego%2C%20CA&vjk=0c2a6008b4969776"
page<-xml2::read_html(url)#function will read in the code from the webpage and break it down into different elements (<div>, <span>, <p>, etc.

#get job title
title<-page %>%
  html_nodes(".jobTitle") %>%
  html_text()
  
#get company Location
loc<-page %>%
  html_nodes(".companyLocation") %>%
  html_text()

#job snippet
page %>%
  html_nodes(".job-snippet") %>%
  html_text()

#Get link 
desc<- page %>%
  html_nodes("a[data-jk]") %>%
  html_attr("href") 

# Create combine link 
combined_link <- paste("https://www.indeed.com", desc, sep="")

我如何跟踪合并的链接并从新页面中抓取数据,是否可以在不使用函数的情况下做到这一点?

4

0 回答 0