16

如果我有两个数据框,例如:

df1 = data.frame(x=1:3,y=1:3,row.names=c('r1','r2','r3'))
df2 = data.frame(z=5:7,row.names=c('r5','r6','r7'))

(

R> df1
   x y
r1 1 1
r2 2 2
r3 3 3

R> df2
   z
r5 5
r6 6
r7 7

),我想按行名合并它们,保留所有内容(所以是外连接,或 all=T)。这样做:

merged.df <- merge(df1,df2,all=T,by='row.names')
R> merged.df
  Row.names  x  y  z
1        r1  1  1 NA
2        r2  2  2 NA
3        r3  3  3 NA
4        r5 NA NA  5
5        r6 NA NA  6
6        r7 NA NA  7

但我希望输入行名是输出数据帧(merged.df)中的行名。

我可以:

rownames(merged.df) <- merged.df[[1]]
merged.df <- merged.df[-1]

这有效,但似乎不优雅且难以记住。有人知道更清洁的方法吗?

4

2 回答 2

20

不确定是否更容易记住,但您可以使用transform.

transform(merge(df1,df2,by=0,all=TRUE), row.names=Row.names, Row.names=NULL)
#    x  y  z
#r1  1  1 NA
#r2  2  2 NA
#r3  3  3 NA
#r5 NA NA  5
#r6 NA NA  6
#r7 NA NA  7
于 2013-06-29T02:28:27.063 回答
2

From the help of merge:

If the matching involved row names, an extra character column called Row.names is added at the left, and in all cases the result has ‘automatic’ row names.

So it is clear that you can't avoid the Row.names column at least using merge. But maybe to remove this column you can subset by name and not by index. For example:

dd <- merge(df1,df2,by=0,all=TRUE) ## by=0 easier to write than row.names , 
                                   ## TRUE is cleaner than T

Then I use row.names to subset like this :

res <- subset(dd,select=-c(Row.names))
rownames(res) <- dd[,'Row.names']
  x  y  z
1  1  1 NA
2  2  2 NA
3  3  3 NA
4 NA NA  5
5 NA NA  6
6 NA NA  7
于 2013-06-29T02:01:50.537 回答