Hello,

Here is my reproducible example.

a=c(1,2,3,4,5,6,7,8)
b=c(1,1,0,0,0,"NA",0,"NA")
c=c(11,7,9,9,5,"NA",7,"NA")
d=c(2012,2011,2012,2014,2014,"NA",2011,"NA")
e=c(1,0,1,0,0,1,"NA","NA")
f=c(10,4,11,10,10,6,"NA","NA")
g=c(2014,2012,2010,2012,2013,2011,"NA","NA")
h=c(1,0,1,0,1,0,1,"NA")
i=c(2,12,12,6,8,11,3,"NA")
j=c(2011,2012,2011,2012,2012,2014,2012,"NA")
k=c(1,1,1,0,1,1,1,"NA")
l=c("11/1/2012","7/1/2012","11/1/2010",0 ,"8/1/2012","6/1/2012","3/1/2012","NA")

mydata = data.frame(a,b,c,d,e,f,g,h,i,j,k,l)
names(mydata) = c("id","test1","month1","year1","test2","month2","year2","test3","month3","year3","anytest","date")

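My goal is to search each row and find the first test column that equals 1. The new column I want to create is "anytest". This column is 1 if test1, test2, or test3 equals 1; if none of them does, it equals 0. NA values are ignored: if test1 and test2 are NA but test3 equals 0, then anytest equals 0. I think I have made progress with this code: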

anytestTRY = ifelse(rowSums(mydata[,c("test1","test2","test3")] == 1, na.rm=TRUE) > 0, 1, 0)

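But now I am at a crossroads, because my goal is to search each row for the first of test1, test2, or test3 that equals 1 and then report the month and year of that test. So if test1 equals 0, test2 is NA, and test3 equals 1, I want the column I create, named date, to hold month3 and year3 in a date format that can be analyzed. Thank you very much.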

1 Answer

a=c(1,2,3,4,5,6,7,8)
b=c(1,1,0,0,0,"NA",0,"NA")
c=c(11,7,9,9,5,"NA",7,"NA")
d=c(2012,2011,2012,2014,2014,"NA",2011,"NA")
e=c(1,0,1,0,0,1,"NA","NA")
f=c(10,4,11,10,10,6,"NA","NA")
g=c(2014,2012,2010,2012,2013,2011,"NA","NA")
h=c(1,0,1,0,1,0,1,"NA")
i=c(2,12,12,6,8,11,3,"NA")
j=c(2011,2012,2011,2012,2012,2014,2012,"NA")

mydata = data.frame(a,b,c,d,e,f,g,h,i,j)
names(mydata) = c("id","test1","month1","year1","test2","month2","year2","test3","month3","year3")
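A note on the data above: because "NA" is quoted, c() coerces each whole vector to character (and data.frame() may turn it into a factor in older R versions), which is why the columns are converted back to numeric before any comparison. A minimal sketch of that behaviour, using made-up vectors x and y that are not part of the original data:

```r
# Hypothetical example vectors, for illustration only
x <- c(1, 0, "NA")     # the quoted "NA" makes every element a character string
y <- c(1, 0, NA)       # an unquoted NA keeps the vector numeric

class(x)               # "character"
class(y)               # "numeric"

# Converting the character vector back: the string "NA" becomes a real NA
# (as.numeric() warns about the coercion, so the warning is suppressed here)
x_num <- suppressWarnings(as.numeric(as.character(x)))
is.na(x_num)           # FALSE FALSE TRUE
```

Writing NA without quotes when building the data would avoid the conversion step altogether.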


library(tidyverse)
library(lubridate)

mydata %>%
  mutate_all(~as.numeric(as.character(.))) %>%  # update columns to numeric
  group_by(id) %>%                              # for each id
  nest() %>%                                    # nest data
  mutate(date = map(data, ~case_when(.$test1==1 ~ ymd(paste0(.$year1,"-",.$month1,"-",1)),                # get date based on first test that is 1
                                     .$test2==1 ~ ymd(paste0(.$year2,"-",.$month2,"-",1)),
                                     .$test3==1 ~ ymd(paste0(.$year3,"-",.$month3,"-",1)))),
         anytest = map(data, ~as.numeric(case_when(sum(c(.$test1, .$test2, .$test3)==1, na.rm = TRUE) > 0 ~ "1",   # create anytest column, ignoring NA tests
                                                   sum(is.na(c(.$test1, .$test2, .$test3))) == 3 ~ "NA",  # all three tests missing
                                                   TRUE ~ "0")))) %>%
  unnest()                                                                                                 # unnest data

Returns:

# # A tibble: 8 x 12
#      id date     anytest test1 month1 year1 test2 month2 year2 test3 month3 year3
#   <dbl> <date>     <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl> <dbl>
# 1     1 2012-11-01     1     1     11  2012     1     10  2014     1      2  2011
# 2     2 2011-07-01     1     1      7  2011     0      4  2012     0     12  2012
# 3     3 2010-11-01     1     0      9  2012     1     11  2010     1     12  2011
# 4     4 NA             0     0      9  2014     0     10  2012     0      6  2012
# 5     5 2012-08-01     1     0      5  2014     0     10  2013     1      8  2012
# 6     6 2011-06-01     1    NA     NA    NA     1      6  2011     0     11  2014
# 7     7 2012-03-01     1     0      7  2011    NA     NA    NA     1      3  2012
# 8     8 NA            NA    NA     NA    NA    NA     NA    NA    NA     NA    NA
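
The same columns can also be computed without nesting. The following is a minimal base R sketch, not part of the pipeline above. It assumes the data are rebuilt with real NA values instead of the string "NA", and the helper names tests, months, years, first_hit and idx are made up for illustration. NA tests are ignored, so a row gets anytest = 1 whenever any non-missing test equals 1.

```r
# Same values as mydata above, but with unquoted NA (an assumption of this sketch)
mydata <- data.frame(
  id     = 1:8,
  test1  = c(1, 1, 0, 0, 0, NA, 0, NA),
  month1 = c(11, 7, 9, 9, 5, NA, 7, NA),
  year1  = c(2012, 2011, 2012, 2014, 2014, NA, 2011, NA),
  test2  = c(1, 0, 1, 0, 0, 1, NA, NA),
  month2 = c(10, 4, 11, 10, 10, 6, NA, NA),
  year2  = c(2014, 2012, 2010, 2012, 2013, 2011, NA, NA),
  test3  = c(1, 0, 1, 0, 1, 0, 1, NA),
  month3 = c(2, 12, 12, 6, 8, 11, 3, NA),
  year3  = c(2011, 2012, 2011, 2012, 2012, 2014, 2012, NA)
)

# One matrix per kind of column, with matching column order
tests  <- as.matrix(mydata[, c("test1", "test2", "test3")])
months <- as.matrix(mydata[, c("month1", "month2", "month3")])
years  <- as.matrix(mydata[, c("year1", "year2", "year3")])

# Position (1, 2 or 3) of the first test equal to 1 in each row; NA if there is none
first_hit <- apply(tests, 1, function(x) which(x == 1)[1])

# anytest: 1 if some test is 1, NA if all tests are missing, otherwise 0
mydata$anytest <- ifelse(!is.na(first_hit), 1,
                         ifelse(rowSums(!is.na(tests)) == 0, NA, 0))

# Matrix indexing picks the month and year of the first positive test;
# rows whose first_hit is NA produce NA
idx <- cbind(seq_len(nrow(tests)), first_hit)
mydata$date <- as.Date(paste(years[idx], months[idx], 1, sep = "-"),
                       format = "%Y-%m-%d")

mydata[, c("id", "anytest", "date")]
```

Working on matrices avoids the per-row nest/unnest step, which may matter for larger data, at the cost of relying on the test, month and year columns being listed in the same order.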
Answered 2018-08-20T11:41:14.543