
I have two data frames:

df1                                              
Column1        Column2        
A               id1             
B               id2             
C               id3             
B               id2             
D               id4             
A               id1             
C               id3

df2
Column1      Column2      Column3
X             m1            m2
A             m3            m4
A             m3            m4
Y             n1            n2
A             m3            m4
Z             p1            p2
X             m1            m2

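I want to merge df1 and df2 conditionally: if the value in Column1 of a df1 row is A, then Column2 and Column3 of df2 should be selectively merged into df1 based on Column2.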

So the final df1 would look like this:

df1                                                                           

Column1        Column2.1    Column1.2      Column2.2      Column3.2
A               id1             id1             m3            m4
B               id2                                   
C               id3         
B               id2         
D               id4         
A               id1             id1             m3            m4
C               id3         

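So far I have handled this by extracting only the rows of df1 that contain "A" in Column1, and then applying merge in a for loop to pull in the values from df2. Is there a way to use an if condition to perform this conditional merge between df1 and df2 directly?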

Here is the structure of df1 and df2:

df1 <- structure(list(Column1 = structure(c(1L, 2L, 3L, 2L, 4L, 1L, 
3L), .Label = c("A", "B", "C", "D"), class = "factor"), Column2 = structure(c(1L, 
2L, 3L, 2L, 4L, 1L, 3L), .Label = c("id1", "id2", "id3", "id4"
), class = "factor")), .Names = c("Column1", "Column2"), class = "data.frame", row.names = c(NA, 
-7L))


df2 <- structure(list(Column1 = structure(c(2L, 1L, 1L, 3L, 1L, 4L, 
2L), .Label = c("A", "X", "Y", "Z"), class = "factor"), Column2 = structure(c(1L, 
2L, 2L, 3L, 2L, 4L, 1L), .Label = c("m1", "m3", "n1", "p1"), class = "factor"), 
    Column3 = structure(c(1L, 2L, 2L, 3L, 2L, 4L, 1L), .Label = c("m2", 
    "m4", "n2", "p2"), class = "factor")), .Names = c("Column1", 
"Column2", "Column3"), class = "data.frame", row.names = c(NA, 
-7L))

1 Answer


If df1 and df2 are defined as above:

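library(sqldf)

# Correlated subqueries look up the df2 values for each df1 row; they yield
# NULL (NA in R) when a Column1 value does not occur in df2. The original
# "left join df2 on df1.Column1=df2.Column2" never matched any row with this
# data, so it is dropped here without changing the result.
final <- sqldf("
  select df1.Column1 as Column1,
         df1.Column2,
         (select distinct df2.Column2 from df2
           where df2.Column1 = df1.Column1) as Column2_2,
         (select distinct df2.Column3 from df2
           where df2.Column1 = df1.Column1) as Column3_2
    from df1")

# as.character() keeps ifelse() from returning integer factor codes
# if Column2 comes back as a factor.
Column1.2 <- ifelse(final$Column1 == "A", as.character(final$Column2), NA)

final <- cbind(final, Column1.2)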
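
For comparison, the same conditional lookup can probably be done in base R without sqldf. The following is only a sketch, not the method used above: it assumes the columns are stored as character rather than factor, that each key in df2 maps to a single set of values (if not, match() silently takes the first), and that only rows with Column1 equal to "A" should receive df2 values. The names lookup, idx and result are made up for illustration.

```r
# Rebuild the example data as character columns (assumption: not factors).
df1 <- data.frame(
  Column1 = c("A", "B", "C", "B", "D", "A", "C"),
  Column2 = c("id1", "id2", "id3", "id2", "id4", "id1", "id3"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Column1 = c("X", "A", "A", "Y", "A", "Z", "X"),
  Column2 = c("m1", "m3", "m3", "n1", "m3", "p1", "m1"),
  Column3 = c("m2", "m4", "m4", "n2", "m4", "p2", "m2"),
  stringsAsFactors = FALSE
)

# Drop duplicate rows so each key appears once in the lookup table.
lookup <- unique(df2)

# match() gives the position in `lookup` of each df1 key, or NA if absent.
idx <- match(df1$Column1, lookup$Column1)

# Only rows whose Column1 is "A" keep their match; all others become NA.
idx[df1$Column1 != "A"] <- NA

# Indexing with NA yields NA, so unmatched rows get empty df2 columns.
result <- data.frame(
  Column1   = df1$Column1,
  Column2.1 = df1$Column2,
  Column1.2 = ifelse(is.na(idx), NA, df1$Column2),
  Column2.2 = lookup$Column2[idx],
  Column3.2 = lookup$Column3[idx],
  stringsAsFactors = FALSE
)
print(result)
```

Because match() returns at most one position per row, result keeps the same seven rows in the same order as df1, which a plain merge() on Column1 would not guarantee when df2 contains repeated keys.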
answered 2012-12-19T13:12:40.910