假设DF
在最后的注释中以可重复的方式显示,删除最后一个斜杠之前的所有内容并转换为数字:
transform(DF, year = as.numeric(sub(".*/", "", `Period of coverage`)), check.names = FALSE)
给予:
Period of coverage year
1 1/1/2011 to 31/12/2011 2011
2 1/1/2010 to 31/12/2010 2010
3 1/1/2012 to 31/12/2012 2012
4 1/1/2010 to 31/12/2010 2010
5 1/1/2011 to 31/12/2011 2011
6 1/1/2012 to 31/12/2012 2012
7 1/1/2010 to 31/12/2010 2010
8 1/1/2010 to 31/12/2010 2010
9 1/1/2009 to 31/12/2009 2009
另一种可能性是将其转换为 Date 类,首先注意as.Date
忽略最后的垃圾:
to_year <- function(x, fmt) as.numeric(format(as.Date(x, fmt), "%Y"))
transform(DF, year = to_year(`Period of coverage`, "%d/%m/%Y"), check.names = FALSE)
笔记
Lines <- " Period of coverage
1/1/2011 to 31/12/2011
1/1/2010 to 31/12/2010
1/1/2012 to 31/12/2012
1/1/2010 to 31/12/2010
1/1/2011 to 31/12/2011
1/1/2012 to 31/12/2012
1/1/2010 to 31/12/2010
1/1/2010 to 31/12/2010
1/1/2009 to 31/12/2009"
DF <- read.csv(text = Lines, check.names = FALSE, as.is = TRUE)