I want to find the difference between observed and unobserved cases, by case type:
set.seed(42)
df <- data.frame(type = factor(rep(c("A", "B", "C"), 2)),
                 observed = rep(c(TRUE, FALSE), 3),
                 val1 = sample(5:1, 6, replace = TRUE),
                 val2 = sample(1:5, 6, replace = TRUE),
                 val3 = sample(letters[1:5], 6, replace = TRUE))
#   type observed val1 val2 val3
# 1    A     TRUE    1    4    e
# 2    B    FALSE    1    1    b
# 3    C     TRUE    4    4    c
# 4    A    FALSE    1    4    e
# 5    B     TRUE    2    3    e
# 6    C    FALSE    3    4    a
The following code only works when there are two different types (e.g. levels(df$type) == c("A", "B")), but not for the df provided above:
library(dplyr)

# Attempt: group by type and observed, then difference each numeric column
df %>%
  group_by(type, observed) %>%
  summarise_if(is.numeric, funs(diff(., 1)))
The desired output is:
#   type val1 val2
#      A    0    0
#      B   -1   -2
#      C   -1    0
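
To make the target explicit: each value in the desired output is the unobserved value minus the observed value within a type (for type B, val1 is 1 - 2 = -1 and val2 is 1 - 3 = -2). Below is a minimal sketch of that calculation, not necessarily the best approach. It assumes dplyr 1.0 or later (across() and where() do not appear in my code above), exactly one observed and one unobserved row per type, and a data frame rebuilt by hand from the printed rows so that it does not depend on the random number generator.

```r
library(dplyr)

# Data copied by hand from the printed rows of df (assumption: the printout is
# what matters here, not the exact sample() calls that produced it).
df <- data.frame(
  type     = factor(rep(c("A", "B", "C"), 2)),
  observed = rep(c(TRUE, FALSE), 3),
  val1     = c(1, 1, 4, 1, 2, 3),
  val2     = c(4, 1, 4, 4, 3, 4),
  val3     = c("e", "b", "c", "e", "e", "a")
)

# Assumption: every type has exactly one observed and one unobserved row.
# Group by type only, then subtract the observed value from the unobserved one
# in every numeric column (val1 and val2; observed is logical, val3 is character).
res <- df %>%
  group_by(type) %>%
  summarise(across(where(is.numeric), ~ .x[!observed] - .x[observed]))

res
```

Indexing by observed instead of calling diff() keeps the sign of the result independent of the row order within each type, which matters here because type B lists its unobserved row first.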