我有面板数据(主题/年份),我只想保留每年出现最多次数的主题。数据集很大,所以我使用的是 data.table 包。有没有比我在下面尝试过的更优雅的解决方案?
library(data.table)
DT <- data.table(SUBJECT=c(rep('John',3), rep('Paul',2),
rep('George',3), rep('Ringo',2),
rep('John',2), rep('Paul',4),
rep('George',2), rep('Ringo',4)),
YEAR=c(rep(2011,10), rep(2012,12)),
HEIGHT=rnorm(22),
WEIGHT=rnorm(22))
DT
DT[, COUNT := .N, by='SUBJECT,YEAR']
DT[, MAXCOUNT := max(COUNT), by='YEAR']
DT <- DT[COUNT==MAXCOUNT]
DT <- DT[, c('COUNT','MAXCOUNT') := NULL]
DT