I'm stuck on the following problem. Suppose my data looks like this:
Num condition y
1 a 1
2 a 2
3 a 3
4 b 4
5 b 5
6 b 6
7 c 7
8 c 8
9 c 9
10 b 10
11 b 11
12 b 12
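I now want to do a calculation on the b rows (for example, the mean) that depends on which value is in the rows just before b, in this example a or c. Thanks for your help! Angelika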
Is this what you want?
# in order to separate between different runs of condition 'b',
# get length and value of runs of equal values of 'condition'
# as.character() guards against 'condition' being a factor, which rle() rejects
rl <- rle(x = as.character(df$condition))
df$run <- rep(x = seq_along(rl$lengths), times = rl$lengths)
# calculate sum of y, on data grouped by condition and run, and where condition is 'b'
aggregate(y ~ condition + run, data = df, subset = condition == "b", sum)
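If the goal is the mean of the b rows split by whatever condition came right before each run of b, the same rle() idea can be pushed one step further. This is only a sketch under assumptions: the data frame is rebuilt here from the question's example, and the names prev_run, prev and res are made up for illustration.

```r
# Example data from the question, rebuilt so the snippet runs on its own
df <- data.frame(
  Num = 1:12,
  condition = rep(c("a", "b", "c", "b"), each = 3),
  y = 1:12,
  stringsAsFactors = FALSE
)

# Runs of equal 'condition' values: a, b, c, b
rl <- rle(df$condition)

# Value of the run that precedes each run (NA for the very first run)
prev_run <- c(NA, head(rl$values, -1))

# Expand back to one entry per row ('prev' is a hypothetical column name)
df$prev <- rep(prev_run, times = rl$lengths)

# Mean of y over the 'b' rows, grouped by the preceding condition
res <- aggregate(y ~ prev, data = df, subset = condition == "b", FUN = mean)
res
#   prev  y
# 1    a  5
# 2    c 11
```

Swap mean for any other summary function as needed. Note that two separate b runs that both follow the same condition would be pooled into one group here.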
You can add a "lagged" condition column to your data frame (assumed here to be called DF):
> DF <- within(DF, lag_cond <- c(NA, head(as.character(condition), -1)))
Result:
Num condition y lag_cond
1 a 1 <NA>
2 a 2 a
3 a 3 a
4 b 4 a
5 b 5 b
6 b 6 b
7 c 7 b
8 c 8 c
9 c 9 c
10 b 10 c
11 b 11 b
12 b 12 b
Now you can identify the rows you want like this:
> DF[with(DF, condition=="b" & lag_cond %in% c("a","c")),]
Num condition y lag_cond
4 b 4 a
10 b 10 c
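The filter above only returns the first row of each b run. To get from there to a per-group mean, one possible route is to carry the lagged value forward over the whole run. A rough sketch, assuming the question's example data; starts, run, before, b_rows and means are illustrative names, not part of the original answer.

```r
# Example data from the question plus the lagged column from above
DF <- data.frame(
  Num = 1:12,
  condition = rep(c("a", "b", "c", "b"), each = 3),
  y = 1:12,
  stringsAsFactors = FALSE
)
DF <- within(DF, lag_cond <- c(NA, head(as.character(condition), -1)))

# A run starts where the condition differs from the lagged one
# (the first row, whose lag is NA, always starts a run)
starts <- is.na(DF$lag_cond) | DF$condition != DF$lag_cond
DF$run <- cumsum(starts)

# match() gives the first row of each run; its lag_cond is the
# condition that preceded the whole run
DF$before <- DF$lag_cond[match(DF$run, DF$run)]

# Mean of y over the 'b' rows, grouped by the preceding condition
b_rows <- DF[DF$condition == "b", ]
means <- tapply(b_rows$y, b_rows$before, mean)
means
#  a  c
#  5 11
```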