
The MWE code below works as intended. In summary:

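  • The first data1 <- ... mutate(...) adds a new column "minusD", computed as (i) the current row's "plusB" value plus (ii) the prior row's "plusB" value if the id stays the same when moving from one row to the next (otherwise 0), and
  • The second data1 <- ... mutate(...) adds a "running_balance" column, which computes a cumsum() across all rows sharing the same id.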

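However, when I deploy this in the fuller code, running two data1 <- ... steps causes an error when another table that draws from the equivalent of this "data1" data frame is run. So how can these two steps be combined into one?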

Output with the calculations explained:

     id plusA plusB minusC minusD running_balance [explain calculations ...]
     1     3     5     10      5              -7   minusD = plusB, running bal = plusA + plusB - minusC - minusD
     2     4     5      9      5              -5   same formulas as above since id <> prior row id
     3     8     5      8      5               0   same formulas as above since id <> prior row id
     3     1     4      7      9             -11   since id = prior row id, minusD = plusB + prior row plusB, and running bal = running bal from prior row + plusA + plusB - minusC - minusD
     3     2     5      6      9             -19   same formulas as above since id = prior row id
     5     3     6      5      6              -2   minusD = plusB, running bal = plusA + plusB - minusC - minusD

MWE code:

data <- data.frame(id=c(1,2,3,3,3,5), 
                   plusA=c(3,4,8,1,2,3), 
                   plusB=c(5,5,5,4,5,6),
                   minusC = c(10,9,8,7,6,5))

library(dplyr)

data1<- subset(
  data %>% mutate(extra=case_when(id==lag(id)~lag(plusB),TRUE ~ 0)) %>%
    mutate(minusD=plusB+extra),
  select = -c(extra) # remove temporary calculation column 
) 

data1 <- data1 %>% group_by(id) %>% mutate(running_balance = cumsum(plusA + plusB - minusC - minusD))

1 Answer


You can keep chaining with %>% instead of creating temporary objects.

library(dplyr)

data %>% 
  mutate(extra=case_when(id==lag(id)~lag(plusB),TRUE ~ 0),
         minusD=plusB+extra) %>%
  group_by(id) %>%
  mutate(running_balance = cumsum(plusA + plusB - minusC - minusD)) %>%
  ungroup %>%
  select(-extra)

#     id plusA plusB minusC minusD running_balance
#  <dbl> <dbl> <dbl>  <dbl>  <dbl>           <dbl>
#1     1     3     5     10      5              -7
#2     2     4     5      9      5              -5
#3     3     8     5      8      5               0
#4     3     1     4      7      9             -11
#5     3     2     5      6      9             -19
#6     5     3     6      5      6              -2
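
As a possible variation on the chain above, the temporary "extra" column can be avoided by grouping first and letting lag() supply 0 for the first row of each group. This is only a sketch, and it rests on an assumption not stated in the question: rows with the same id are always adjacent, as they are in the sample data. If the same id could reappear after a different id, this version would treat those rows as one group, whereas the id == lag(id) test would not. The name "result" is just an illustrative choice.

```r
library(dplyr)

data <- data.frame(id     = c(1, 2, 3, 3, 3, 5),
                   plusA  = c(3, 4, 8, 1, 2, 3),
                   plusB  = c(5, 5, 5, 4, 5, 6),
                   minusC = c(10, 9, 8, 7, 6, 5))

result <- data %>%
  group_by(id) %>%
  mutate(
    # Within each id group, add the previous row's plusB;
    # the first row of a group has no previous row, so use 0.
    minusD = plusB + lag(plusB, default = 0),
    # minusD is already available to later expressions in the same mutate().
    running_balance = cumsum(plusA + plusB - minusC - minusD)
  ) %>%
  ungroup()

result
```

On the sample data this gives the same minusD and running_balance values as the output shown above, in a single mutate() call.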
Answered 2021-12-21T12:07:32.560