
I need to extract the start year and the end year from a vector of values like these.

 yr <- c("June 2013 – Present (2 years 9 months)",
         "January 2012 – June 2013 (1 year 6 months)",
         "2006 – Present (10 years)",
         "2002 – 2006 (4 years)")


 yr
 June 2013 – Present (2 years 9 months)
 January 2012 – June 2013 (1 year 6 months)
 2006 – Present (10 years)
 2002 – 2006 (4 years)

I'm expecting output like this. Does anyone have a suggestion?

 start_yr  end_yr
 2013      2016
 2012      2013
 2006      2016
 2002      2006

2 Answers

x <- gsub("present", "2016", yr, ignore.case = TRUE)
x <- regmatches(x, gregexpr("\\d{4}", x))
start_yr <- sapply(x, "[[", 1)
end_yr <- sapply(x, "[[", 2)

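This stores the start and end years in two separate variables. If you would rather keep them in a single variable, edit the code to assign them to y$start_yr and y$end_yr instead.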
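A minimal sketch of that single-variable variant, assuming `y` is a data frame (the name is taken from the note above; the conversion to integer is an optional extra, not part of the original answer). As in the answer, "Present" is replaced with the hard-coded year 2016.

```r
yr <- c("June 2013 – Present (2 years 9 months)",
        "January 2012 – June 2013 (1 year 6 months)",
        "2006 – Present (10 years)",
        "2002 – 2006 (4 years)")

# Replace "Present" with a fixed year (2016 here, matching the answer above)
x <- gsub("present", "2016", yr, ignore.case = TRUE)

# Pull every run of four digits out of each string
m <- regmatches(x, gregexpr("\\d{4}", x))

# Collect the first and second match of each element into one data frame
y <- data.frame(
  start_yr = as.integer(sapply(m, "[[", 1)),
  end_yr   = as.integer(sapply(m, "[[", 2))
)
y
```

This assumes every element contains exactly two four-digit years once "Present" has been replaced; an element with fewer matches would make `"[["` fail with a subscript error.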

answered 2016-02-29T22:14:15.643

Another solution is to use stringr:

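library(stringr)
# The replacement is passed as a string; recent stringr versions may reject a numeric value
x <- str_replace(yr, "Present", "2016")
# simplify = TRUE returns a character matrix with one row per input string
DF <- as.data.frame(str_extract_all(x, "\\d{4}", simplify = TRUE),
                    stringsAsFactors = FALSE)
names(DF) <- c("start_yr", "end_yr")
DF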

You will get:

  start_yr end_yr
1     2013   2016
2     2012   2013
3     2006   2016
4     2002   2006
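Note that the extracted years are text, not numbers. If you need to compare or subtract them, a conversion along these lines may help. This is only a sketch: the `DF` below is a hand-written stand-in for the result above, so that the snippet runs on its own without stringr.

```r
# Stand-in for the DF produced above: both columns hold character strings
DF <- data.frame(start_yr = c("2013", "2012", "2006", "2002"),
                 end_yr   = c("2016", "2013", "2016", "2006"),
                 stringsAsFactors = FALSE)

# Convert every column to integer, keeping the data frame structure
DF[] <- lapply(DF, as.integer)
str(DF)
```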
answered 2016-02-29T22:57:00.593