Two other approaches: ave and the plyr library. Example data:
df <- structure(
  list(id = c("A", "A", "A", "B", "B", "C", "C"),
       date = structure(c(10969, 10974, 10981, 15623, 15624, 12989, 12995),
                        class = "Date")),
  .Names = c("id", "date"), row.names = c(NA, -7L), class = "data.frame")
With ave, the dates must first be converted to numeric:
df$days_from_start <- ave(as.numeric(df$date), df$id, FUN = function(x) x-min(x))
This gives:
> df
id date days_from_start
1 A 2000-01-13 0
2 A 2000-01-18 5
3 A 2000-01-25 12
4 B 2012-10-10 0
5 B 2012-10-11 1
6 C 2005-07-25 0
7 C 2005-07-31 6
> str(df)
'data.frame': 7 obs. of 3 variables:
$ id : chr "A" "A" "A" "B" ...
$ date : Date, format: "2000-01-13" ...
$ days_from_start: num 0 5 12 0 1 0 6
With the plyr library:
library("plyr")
df <- ddply(df, .(id), mutate, days_from_start = date - min(date))
This gives:
> df
id date days_from_start
1 A 2000-01-13 0 days
2 A 2000-01-18 5 days
3 A 2000-01-25 12 days
4 B 2012-10-10 0 days
5 B 2012-10-11 1 days
6 C 2005-07-25 0 days
7 C 2005-07-31 6 days
> str(df)
'data.frame': 7 obs. of 3 variables:
$ id : chr "A" "A" "A" "B" ...
$ date : Date, format: "2000-01-13" ...
$ days_from_start:Class 'difftime' atomic [1:7] 0 5 12 0 1 0 6
.. ..- attr(*, "units")= chr "days"
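Note that the two results differ in type: the ave version is a plain numeric column, while the plyr version is a difftime with units "days". If a plain number is wanted in both cases, something like the following should work. This is only a sketch: the names days_num, days_diff and days_back are made up for illustration, and the difftime vector is built with as.difftime rather than plyr so that the example runs with base R alone.

```r
# Example data, rebuilt from date strings for readability.
df <- data.frame(
  id   = c("A", "A", "A", "B", "B", "C", "C"),
  date = as.Date(c("2000-01-13", "2000-01-18", "2000-01-25",
                   "2012-10-10", "2012-10-11",
                   "2005-07-25", "2005-07-31")),
  stringsAsFactors = FALSE
)

# Numeric day offsets per id, as in the ave approach.
days_num <- ave(as.numeric(df$date), df$id, FUN = function(x) x - min(x))

# A difftime vector of the kind the plyr approach produces
# (constructed directly here so no extra package is needed).
days_diff <- as.difftime(days_num, units = "days")

# Convert the difftime back to plain numbers, naming the unit explicitly
# so the result does not depend on whatever unit the difftime carries.
days_back <- as.numeric(days_diff, units = "days")

print(class(days_diff))
print(days_back)
```

Passing units explicitly to as.numeric is the safer habit, since a difftime can also be stored in secs, mins, hours or weeks depending on how it was created.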