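I'd like to get the unique rows from a data.table, given a subset of columns and a condition in i. What is the best way to go about it? ("Best" in terms of computing speed and short or readable syntax.)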
library(data.table)
set.seed(1)
jk <- data.table(c1 = sample(letters, 60, replace = TRUE),
                 c2 = sample(c(TRUE, FALSE), 60, replace = TRUE),
                 c3 = sample(letters, 60, replace = TRUE),
                 c4 = sample.int(10, 60, replace = TRUE)
                 )
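Say I want to find the unique combinations of c1 and c2 where c4 is 10. I can think of a few ways to do it, but I'm not sure which is optimal. Whether the columns to extract are keyed may also matter.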
## works but gives an extra column
jk[c4 >= 10, TRUE, keyby = list(c1,c2)]
## this removes extra column
jk[c4 >= 10, TRUE, keyby = list(c1,c2)][,V1 := NULL]
## this seems like it could work
## but no j-expression with a keyby throws an error
jk[c4 >= 10, , keyby = list(c1,c2)]
## using unique with .SD
jk[c4 >= 10, unique(.SD), .SDcols = c("c1","c2")]
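
One more candidate that could be compared with the attempts above, as a minimal sketch: select the two columns in j and call unique() on the result. This assumes a data.table version in which unique() considers all columns by default (1.9.8 or later). The name res is only illustrative, and the approach is not benchmarked here.

```r
library(data.table)

set.seed(1)
jk <- data.table(c1 = sample(letters, 60, replace = TRUE),
                 c2 = sample(c(TRUE, FALSE), 60, replace = TRUE),
                 c3 = sample(letters, 60, replace = TRUE),
                 c4 = sample.int(10, 60, replace = TRUE))

## filter in i, keep only c1 and c2 in j, then drop duplicate rows
res <- unique(jk[c4 >= 10, .(c1, c2)])
res
```

Unlike the keyby variants, this does not sort or key the result; rows come back in order of first appearance.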