This is head(both$stterm):

 stterm
1 2011-01-19
2 2012-01-19
3 2007-09-01
4 2011-09-01
5 2008-09-01
6 2013-09-01

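As I said, this is only part of the data set; I have 4021 observations. I want to create a new column in which each date is represented by a number, as shown below.

The new variable should be numeric (continuous).

I have tried as.Date, but I just got a column full of NULL.

It is important that 2008-09-01 = 8 and not 08.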

"2007-09-01"=7,
"2008-09-01"=8,
"2009-01-19"=9,
"2009-09-01"=9,
"2010-01-19"=10,
"2010-09-01"=10,
"2011-01-19"=11,
"2011-09-01"=11,
"2012-01-19"=12,
"2012-09-01"=12,
"2013-01-19"=13,
"2013-09-01"=13,
"2014-01-19"=14)

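So all I want to do is create a column that holds these numbers instead of the actual dates. The new variable will be called calenderyear.

I need tips on how to write this in R.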

4 Answers

You can do this as follows:

library(lubridate)
dat$year <- year(as.Date(dat$stterm))-2000

Result:

> dat
      stterm year
1 2011-01-19   11
2 2012-01-19   12
3 2007-09-01    7
4 2011-09-01   11
5 2008-09-01    8
6 2013-09-01   13

Data:

dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = " stterm
1 2011-01-19
2 2012-01-19
3 2007-09-01
4 2011-09-01
5 2008-09-01
6 2013-09-01")
Answered 2015-04-14T14:47:17.050

Try the lubridate package:

install.packages("lubridate")
library(lubridate)
year(ymd(both$stterm))-2000
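
The expression above only prints the result. A minimal sketch of storing it in the column the question asks for, assuming lubridate is installed and that stterm holds dates in year-month-day form (the small `both` data frame here is only a stand-in for the real data):

```r
library(lubridate)

# Stand-in for the real data set (assumption: stterm is character in YYYY-MM-DD form)
both <- data.frame(
  stterm = c("2011-01-19", "2012-01-19", "2007-09-01"),
  stringsAsFactors = FALSE
)

# ymd() parses the dates, year() extracts the four-digit year as a number
both$calenderyear <- year(ymd(both$stterm)) - 2000

str(both$calenderyear)
```

Subtracting 2000 only gives the intended values for years from 2000 onward, which holds for the dates shown in the question.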
Answered 2015-04-14T14:47:47.367

You can try this:

d <- as.Date(c("2007-09-01", "2008-09-01", "2009-01-19", "2009-09-01", "2010-01-19", "2010-09-01", "2011-01-19", "2011-09-01", "2012-01-19", "2012-09-01", "2013-01-19", "2013-09-01", "2014-01-19"), format="%Y-%m-%d")
sub("^0", "", sub("[[:digit:]]{2}([[:digit:]]{2}).*", "\\1", d))
 [1] "7"  "8"  "9"  "9"  "10" "10" "11" "11" "12" "12" "13" "13" "14"
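
Note that sub() returns character strings, while the question asks for a numeric variable. One possible adjustment, sketched here on a shortened input vector, is to convert the extracted digits with as.numeric(); the step that strips the leading zero then becomes unnecessary, because a number carries no leading zero:

```r
d <- as.Date(c("2007-09-01", "2008-09-01", "2010-01-19", "2014-01-19"))

# as.character() gives "2007-09-01", ...; the pattern keeps only
# the third and fourth digits of the year ("07", "08", ...)
yy <- sub("[[:digit:]]{2}([[:digit:]]{2}).*", "\\1", as.character(d))

# Converting to numeric turns "07" into 7
calenderyear <- as.numeric(yy)
calenderyear
```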
Answered 2015-04-14T14:48:27.990

You can do this in base R. First, reproduce a subset of the data set:

both <- data.frame( stterm=as.Date(c('2011-01-19','2012-01-19', '2007-09-01','2011-09-01','2008-09-01','2013-09-01')))

both
      stterm
1 2011-01-19
2 2012-01-19
3 2007-09-01
4 2011-09-01
5 2008-09-01
6 2013-09-01

both$calenderyear <- as.numeric(format(both$stterm,"%y"))
both
      stterm calenderyear
1 2011-01-19           11
2 2012-01-19           12
3 2007-09-01            7
4 2011-09-01           11
5 2008-09-01            8
6 2013-09-01           13
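
If the conversion is needed in more than one place, it could be wrapped in a small helper. The function name two_digit_year below is hypothetical, and the sketch assumes the input is either a Date vector or character strings in YYYY-MM-DD form:

```r
# Hypothetical helper: two-digit year as a number (e.g. 2008-09-01 -> 8)
two_digit_year <- function(x) {
  # Date input is returned unchanged; character input is parsed,
  # and strings that do not match the format become NA
  x <- as.Date(x, format = "%Y-%m-%d")
  # "%y" is the year without century ("08"); as.numeric() drops the leading zero
  as.numeric(format(x, "%y"))
}

two_digit_year(c("2008-09-01", "2013-09-01"))
```

With the full data set this would be used as both$calenderyear <- two_digit_year(both$stterm). Because "%y" discards the century, the result is only unambiguous when all dates fall within the same century, as they do in the question.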
Answered 2015-04-14T14:58:44.863