I have this table:

   record_id result date_start   date_end
1          1    pos                      
2          1        26/06/2019 28/06/2019
3          1        27/06/2019 29/06/2019
4          1        28/06/2019 30/06/2019
5          1        29/06/2019 01/07/2019
6          2    neg                      
7          2        01/07/2019 03/07/2019
8          2        02/07/2019 04/07/2019
9          2        03/07/2019 05/07/2019
10         2        04/07/2019 06/07/2019
11         2        05/07/2019 07/07/2019
12         3    pos                      
13         3        07/07/2019 09/07/2019
14         3        08/07/2019 10/07/2019

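I want to compute the date difference for each row, which is no problem. After that, I want to analyze the "pos" and "neg" groups separately. But on the rows that have dates, there is no value for result. This is data imported from REDCap, with repeating instruments. I use the tidyverse and I think dplyr could help; isn't this a case where I would need pivot_wider? I tried, but could not get it to work...

Thanks if anyone can help...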

1 Answer

Something like this, for example, to compute the mean date difference per group?

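library(tidyverse)
library(lubridate)
df %>% 
  # carry each record's result down onto its repeating-instrument rows
  fill(result, .direction = "down") %>% 
  # drop the rows that hold only the result and no dates
  filter(!is.na(date_start)) %>% 
  # parse the day/month/year strings into Date objects
  mutate(date_start = dmy(date_start),
         date_end = dmy(date_end)) %>% 
  group_by(result) %>% 
  summarise(mean_date_dif = mean(date_end - date_start))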

#`summarise()` ungrouping output (override with `.groups` argument)
## A tibble: 2 x 2
#  result mean_date_dif
#  <chr>  <drtn>       
#1 neg    2 days       
#2 pos    2 days 
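
If the per-row differences are needed rather than only the group means, the same idea can be sketched as below. This is a minimal sketch, not a definitive solution: `df_demo`, `per_row`, `pos_rows`, `neg_rows` and the `date_dif` column are hypothetical names introduced here, and `df_demo` is a shortened stand-in for the question's data. Filling inside `group_by(record_id)` is an extra precaution I am assuming is wanted, so that a result can never leak from one record into the next.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Shortened stand-in for the question's data (hypothetical name)
df_demo <- tibble::tribble(
  ~record_id, ~result,  ~date_start,    ~date_end,
          1L,   "pos",           NA,           NA,
          1L,      NA, "26/06/2019", "28/06/2019",
          1L,      NA, "29/06/2019", "01/07/2019",
          2L,   "neg",           NA,           NA,
          2L,      NA, "01/07/2019", "03/07/2019"
)

per_row <- df_demo %>%
  group_by(record_id) %>%                  # fill only within a record
  fill(result, .direction = "down") %>%
  ungroup() %>%
  filter(!is.na(date_start)) %>%           # keep only the rows with dates
  mutate(date_start = dmy(date_start),
         date_end   = dmy(date_end),
         # difference per row, as a plain number of days
         date_dif   = as.numeric(date_end - date_start, units = "days"))

# The two groups can then be analyzed separately
pos_rows <- filter(per_row, result == "pos")
neg_rows <- filter(per_row, result == "neg")
```

Keeping `date_dif` as a numeric column makes it straightforward to pass to further summaries or tests per group.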

Data

df <- tibble::tribble(
        ~record_id, ~result,  ~date_start,    ~date_end,
                1L,   "pos",           NA,           NA,
                1L,      NA, "26/06/2019", "28/06/2019",
                1L,      NA, "27/06/2019", "29/06/2019",
                1L,      NA, "28/06/2019", "30/06/2019",
                1L,      NA, "29/06/2019", "01/07/2019",
                2L,   "neg",           NA,           NA,
                2L,      NA, "01/07/2019", "03/07/2019",
                2L,      NA, "02/07/2019", "04/07/2019",
                2L,      NA, "03/07/2019", "05/07/2019",
                2L,      NA, "04/07/2019", "06/07/2019",
                2L,      NA, "05/07/2019", "07/07/2019",
                3L,   "pos",           NA,           NA,
                3L,      NA, "07/07/2019", "09/07/2019",
                3L,      NA, "08/07/2019", "10/07/2019"
        )
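
Note that `fill()` only replaces `NA`. The data above assumes the blank cells are `NA`; if the real import stores them as empty strings instead (an assumption, since the question does not show how the file was read), they would need converting first. A hedged sketch, with `raw` and `cleaned` as hypothetical names:

```r
library(dplyr)
library(tidyr)

# Hypothetical import where blank cells arrived as "" rather than NA
raw <- tibble::tibble(
  record_id  = c(1L, 1L, 2L, 2L),
  result     = c("pos", "", "neg", ""),
  date_start = c("", "26/06/2019", "", "01/07/2019"),
  date_end   = c("", "28/06/2019", "", "03/07/2019")
)

cleaned <- raw %>%
  # turn empty strings into NA so that fill() and is.na() behave as expected
  mutate(across(c(result, date_start, date_end), ~ na_if(.x, ""))) %>%
  fill(result, .direction = "down")
```

After this step the pipeline from the answer can be applied to `cleaned` unchanged.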
Answered 2020-06-13T10:01:56.087