
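I have a longitudinal dataset in which each subject appears more than once. Each row represents one patient admission. Every admission, regardless of subject, also has a unique "Key". I need to work out which admission is the "INDEX" admission, i.e. the first admission, so that I know which rows are subsequent RE-admissions. The variable to use is "Daystoevent"; the smallest number represents the INDEX admission. I would like to create a new variable based on this condition, without having to convert the data to wide format.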

The dataset looks like this:

Subject  Daystoevent  Key
A        5            rtwe
A        8            erer
B        3            tter
B        8            qgfb
A        2            sada
C        4            ccfw
D        7            mjhr
B        4            sdfw
C        1            srtg
C        2            xcvs
D        3            muyg

Some help would be greatly appreciated.


1 Answer


This may not be an elegant solution, but it gets the job done:

library(dplyr)

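df <- df %>%
  # Sort so that each subject's earliest admission comes first
  arrange(Subject, Daystoevent) %>%
  group_by(Subject) %>%
  mutate(
    # 0 marks the index admission; 1 provisionally marks every readmission
    Admission = if_else(Daystoevent == min(Daystoevent), 0, 1)
  ) %>%
  ungroup()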

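# Number the readmissions consecutively within each subject. Rows flagged 0
# (index admissions) are never overwritten, even when the previous subject
# has only one readmission.
for (i in seq_len(max(nrow(df) - 1, 0))) {
  if (df$Admission[i + 1] != 0) {
    df$Admission[i + 1] <- df$Admission[i] + 1
  }
}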

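# Relabel only the Admission column, so a Daystoevent of 0 is left untouched
df$Admission <- ifelse(df$Admission == 0, "index", as.character(df$Admission))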

df
# # A tibble: 11 x 4
#    Subject Daystoevent Key   Admission
#    <chr>         <dbl> <chr> <chr>    
#  1 A                 2 sada  index    
#  2 A                 5 rtwe  1        
#  3 A                 8 erer  2        
#  4 B                 3 tter  index    
#  5 B                 4 sdfw  1        
#  6 B                 8 qgfb  2        
#  7 C                 1 srtg  index    
#  8 C                 2 xcvs  1        
#  9 C                 4 ccfw  2        
# 10 D                 3 muyg  index    
# 11 D                 7 mjhr  1

Data:

df <- tibble(
  Subject = c("A", "A", "B", "B", "A", "C", "D", "B", "C", "C", "D"),
  Daystoevent = c(5, 8, 3, 8, 2, 4, 7, 4, 1, 2, 3),
  Key = c("rtwe", "erer", "tter", "qgfb", "sada", "ccfw", "mjhr", "sdfw", "srtg", "xcvs", "muyg")
)
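
The explicit loop can probably be avoided by numbering the rows within each subject. The following is only a sketch under some assumptions: it uses dplyr's `row_number()` on the same example data, and the names `result` and `AdmissionNo` are made up for illustration. If two admissions of one subject share the smallest Daystoevent, only the first of them after sorting is labelled "index", which may or may not be what you want.

```r
library(dplyr)

# Same example data as above
df <- tibble(
  Subject = c("A", "A", "B", "B", "A", "C", "D", "B", "C", "C", "D"),
  Daystoevent = c(5, 8, 3, 8, 2, 4, 7, 4, 1, 2, 3),
  Key = c("rtwe", "erer", "tter", "qgfb", "sada", "ccfw", "mjhr", "sdfw", "srtg", "xcvs", "muyg")
)

result <- df %>%
  arrange(Subject, Daystoevent) %>%
  group_by(Subject) %>%
  mutate(
    # Position of each admission within its subject, earliest first
    AdmissionNo = row_number(),
    # Label the earliest admission "index"; number the readmissions 1, 2, ...
    Admission = if_else(AdmissionNo == 1L, "index", as.character(AdmissionNo - 1L))
  ) %>%
  ungroup() %>%
  select(-AdmissionNo)  # drop the hypothetical helper column

result
```

Because the numbering is done per group, a subject with a single readmission cannot affect the rows of the next subject, and the data stays in long format.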
answered 2018-07-29T05:12:19.263