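In the j argument of data.table, is there syntax that lets me refer to previously created variables within the same j expression? I'm thinking of something like Lisp's let* construct.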

library(data.table)
set.seed(22)
DT <- data.table(a = rep(1:5, each = 10),
                 b = sample(c(0, 1), 50, replace = TRUE))

DT[ ,
   list(attempts = .N,
        successes = sum(b),
        rate = successes / attempts),
   by = a]

This results in:

# Error in `[.data.table`(DT, , list(attempts = .N, successes = sum(b),  : 
#  object 'successes' not found

I understand why, but is there a different way to do this within j?

2 Answers

This does the trick:

DT[ , {
    list(attempts = attempts <- .N,
         successes = successes <- sum(b),
         rate = successes/attempts)
    },  by = a]
#    a attempts successes rate
# 1: 1       10         5  0.5
# 2: 2       10         6  0.6
# 3: 3       10         3  0.3
# 4: 4       10         5  0.5
# 5: 5       10         5  0.5
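
The inline assignments work because list() evaluates its arguments in order, left to right, so attempts and successes already exist by the time rate is computed. A minimal base R sketch of that behaviour, outside data.table and with made-up values for b (the vector here is an assumption for illustration, not the question's data):

```r
# Base R only: list() evaluates its arguments in order, left to right,
# so a name assigned in an earlier argument is visible to later ones.
b <- c(0, 1, 1, 0, 1)  # made-up 0/1 outcomes, for illustration only

res <- list(attempts  = attempts  <- length(b),
            successes = successes <- sum(b),
            rate      = successes / attempts)

str(res)
```

Note that at top level these assignments also create attempts and successes in the calling environment; inside j they are created where j is evaluated.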

FWIW, this closely related data.table feature request would make the syntax used in your question possible. Quoting from the linked page:

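Summary:

Iterative RHS of := (and `:=`(...)), and multiple := inside j = {...} syntax

Detailed description

e.g. DT[, `:=`( m1 = mean(a), m2 = sd(a), s = m1/m2 ), by = group]

where s can use the previous lhs names (using the word "iterative" tries to convey that).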

answered 2013-05-16T16:31:31.887

Try this:

DT[,
   {successes = sum(b);
    attempts  = .N;
    list(attempts = attempts,
         successes = successes,
         rate = successes / attempts)
   },
   by = a]

Or:

DT[,
   list(attempts = .N,
        successes = sum(b)),
   by = a][, rate := successes / attempts]
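
As a rough check that the two variants above agree, the sketch below runs both on the question's data and compares the resulting rate columns. The names res_braces and res_chain are hypothetical, introduced only for this comparison, and replace = TRUE is spelled out in full.

```r
library(data.table)

set.seed(22)
DT <- data.table(a = rep(1:5, each = 10),
                 b = sample(c(0, 1), 50, replace = TRUE))

# Variant 1: compute the intermediates inside a braced j expression,
# then return them together with the derived column.
res_braces <- DT[, {
  attempts  <- .N
  successes <- sum(b)
  list(attempts = attempts,
       successes = successes,
       rate = successes / attempts)
}, by = a]

# Variant 2: aggregate first, then add the derived column by reference
# in a chained call.
res_chain <- DT[, list(attempts = .N, successes = sum(b)), by = a][
  , rate := successes / attempts]

# Both approaches should yield the same per-group rates.
stopifnot(isTRUE(all.equal(res_braces$rate, res_chain$rate)))
```

The braced form keeps everything in one grouped pass; the chained form is arguably easier to read, at the cost of a second (cheap) step on the aggregated table.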
answered 2013-05-16T16:31:40.040