When I say unexpected, I mean unexpected to me. Let me illustrate with an example. We have two data.frames:

b1<-data.frame(a=c("a","b"),b=1:2)
b2<-data.frame(a=c("a","b"),c=1:2)

Merging them produces the following:

> merge(b1,b2)
  a b c
1 a 1 1
2 b 2 2

But when we have the data.frames

b1<-data.frame(a=c("a","a"),b=1:2)
b2<-data.frame(a=c("a","a"),c=1:2)

merging produces

> merge(b1,b2)
  a b c
1 a 1 1
2 a 1 2
3 a 2 1
4 a 2 2

whereas I expected

  a b c
  a 1 1
  a 2 2

Why are there two different results?

1 Answer

This is by design. Base merge matches rows on the key columns: those given in by, or, when by is not given, the columns the two data frames have in common (here a). In case 1, each value of a matched exactly one row in the other data frame, so there were no duplicates. But in case 2, it found two matches:

> b1$a %in% b2$a 
[1] TRUE TRUE  

for each a, and therefore returned every possible pairing (2 x 2 = 4 rows). See ?merge for more information. join in plyr has the option of keeping only the first match.

> library(plyr)
> join(b1,b2, match="first")
Joining by: a
  a b c
1 a 1 1
2 a 2 1
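
Note that match="first" still does not give the row-by-row pairing asked for in the question, since both rows pick up c = 1. If the intent is to pair the first "a" in b1 with the first "a" in b2, the second with the second, and so on, one possible approach in base R is to add an occurrence counter to each data frame and merge on both columns. This is only a sketch: the helper occurrence and the temporary column i are names made up for illustration, and it assumes that row order within each key is meaningful.

```r
b1 <- data.frame(a = c("a", "a"), b = 1:2)
b2 <- data.frame(a = c("a", "a"), c = 1:2)

# Plain merge pairs every matching row with every other: 2 x 2 = 4 rows.
all_pairs <- merge(b1, b2)

# Hypothetical helper: number the rows within each value of the key
# (1, 2, ... for the first, second, ... occurrence of that value).
occurrence <- function(key) ave(seq_along(key), key, FUN = seq_along)

b1$i <- occurrence(b1$a)
b2$i <- occurrence(b2$a)

# Merging on the key plus the counter makes each (a, i) pair unique,
# so every row of b1 matches at most one row of b2.
paired <- merge(b1, b2, by = c("a", "i"))
paired$i <- NULL  # drop the temporary counter
paired
```

With the data from the question, paired contains the two rows (a, 1, 1) and (a, 2, 2), while all_pairs has the four rows shown above.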
answered 2012-10-26T07:43:36.207