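Use the base R function aggregate() with FUN = a simple custom function that returns a vector of two outputs, min() and max(), in a single step:
As you suggested, you can use aggregate(), but as shown below, a single call can compute both min() and max() for each patid group.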
# Read in your sample data, being careful to prevent dates from becoming factors
pdates <-
read.table( text="patid date
1302 2009-01-27
1302 2009-02-05
1302 2009-08-28
1670 2009-03-12
2073 2009-04-03
2073 2010-11-01
2073 2010-12-19
2073 2011-03-06",
header=TRUE,
stringsAsFactors=FALSE) # keep date strings from becoming factors!
aggregate(x   = pdates["date"],   # data frame with the column(s) to aggregate
          by  = pdates["patid"],  # passing a data frame with the named column "patid" preserves the column name in the output
          FUN = function(vdate) {
            c(start = min(vdate), end = max(vdate))
          }
)
patid date.start date.end
1 1302 2009-01-27 2009-08-28
2 1670 2009-03-12 2009-03-12
3 2073 2009-04-03 2011-03-06
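One detail the printed output hides: when FUN returns a vector of length two, aggregate() stores both values in a single matrix column named date, so the result has two columns rather than three. The sketch below is one possible way to flatten that column and convert the strings to Date objects. The names res, flat and days are illustrative and not part of the original answer, and it assumes the dates are ISO-formatted strings, which is also why min() and max() order them correctly as plain text.

```r
# Same sample data as above, rebuilt here so the snippet runs on its own
pdates <- data.frame(
  patid = c(1302, 1302, 1302, 1670, 2073, 2073, 2073, 2073),
  date  = c("2009-01-27", "2009-02-05", "2009-08-28", "2009-03-12",
            "2009-04-03", "2010-11-01", "2010-12-19", "2011-03-06"),
  stringsAsFactors = FALSE
)

res <- aggregate(x = pdates["date"], by = pdates["patid"],
                 FUN = function(vdate) c(start = min(vdate), end = max(vdate)))

ncol(res)            # 2: "patid" plus one matrix column "date"
is.matrix(res$date)  # TRUE

# Flatten the matrix column into ordinary columns
flat <- do.call(data.frame, c(res, stringsAsFactors = FALSE))
names(flat)          # "patid" "date.start" "date.end"

# Convert the ISO date strings to Date so date arithmetic works
flat$date.start <- as.Date(flat$date.start)
flat$date.end   <- as.Date(flat$date.end)

# Illustrative extra column: days between first and last date per patient
flat$days <- as.numeric(flat$date.end - flat$date.start)
```

After flattening, flat behaves like an ordinary data frame, so columns can be selected with flat$date.start instead of res$date[, "start"].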
Edit: alternatively, and more simply, use the very handy base R range() function:
aggregate( pdates["date"], by=pdates["patid"], range)
patid date.1 date.2
1 1302 2009-01-27 2009-08-28
2 1670 2009-03-12 2009-03-12
3 2073 2009-04-03 2011-03-06
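The date.1 and date.2 headers appear because range() returns an unnamed vector of length two. If more descriptive names are wanted, a minimal sketch is to label the columns of the matrix that aggregate() returns. The name res and the labels "start" and "end" are assumptions chosen to match the first example.

```r
# Same sample data as above, rebuilt here so the snippet runs on its own
pdates <- data.frame(
  patid = c(1302, 1302, 1302, 1670, 2073, 2073, 2073, 2073),
  date  = c("2009-01-27", "2009-02-05", "2009-08-28", "2009-03-12",
            "2009-04-03", "2010-11-01", "2010-12-19", "2011-03-06"),
  stringsAsFactors = FALSE
)

res <- aggregate(pdates["date"], by = pdates["patid"], FUN = range)

# res$date is a two-column character matrix without column names;
# naming its columns changes how the data frame prints
colnames(res$date) <- c("start", "end")

res$date[, "start"]  # earliest date per patid
res$date[, "end"]    # latest date per patid
```

Printing res afterwards should show date.start and date.end in place of date.1 and date.2.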