
I have a dataframe that looks like this:

 n <- c("foo","bar","qux","qux","bar")
 k <- c(100,200,300,400,500)
 z <- c("z","w","x","y","v")
 df1 <- data.frame(n,k,z)
 df1 
   n   k z
1 foo 100 z
2 bar 200 w
3 qux 300 x
4 qux 400 y
5 bar 500 v

Given a second data frame:

l <- c("k1","k2","k3","k4","k5")
n2 <- c("foo","bar","qux","qux","bar")  # name difference of (n2) is intentional
df2 <- data.frame(n2,l)
   n2 l
1 foo k1
2 bar k2
3 qux k3
4 qux k4
5 bar k5

I want to create a third data frame under the following condition:

Use df1 as the source, and for every row in df1 look up its n in the n2 column of df2 to pick the corresponding l value. As the expected output shows, when n2 contains duplicates the first match is used.

So in the end I'd like to have this:

  n   k z   call
1 foo 100 z k1
2 bar 200 w k2
3 qux 300 x k3
4 qux 400 y k3
5 bar 500 v k2

What's the way to do it?


1 Answer


I think you are looking for match.

match returns a vector of the positions of the (first) matches of its first argument in its second argument.
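To see what match does on its own before it is combined with cbind, here is a minimal sketch using the same vectors as in the question. The names idx and calls are only illustrative and do not appear in the original post.

```r
# Same vectors as in the question
n  <- c("foo", "bar", "qux", "qux", "bar")
n2 <- c("foo", "bar", "qux", "qux", "bar")
l  <- c("k1", "k2", "k3", "k4", "k5")

# For each element of n, the position of its FIRST occurrence in n2
idx <- match(n, n2)
idx
# [1] 1 2 3 3 2

# Use those positions to index l; duplicates in n2 never reach k4 or k5
calls <- l[idx]
calls
# [1] "k1" "k2" "k3" "k3" "k2"
```

Values of n that do not occur in n2 at all would give NA in idx, and therefore NA in the looked-up column.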

m <- df1
cbind(m,call=df2$l[match(df1$n ,df2$n2)])
    n   k z call
1 foo 100 z   k1
2 bar 200 w   k2
3 qux 300 x   k3
4 qux 400 y   k3
5 bar 500 v   k2

Another option is to use merge, but you then have to remove the duplicated rows:

hh <- merge(df1,df2,by.x='n',by.y='n2')
hh[!duplicated(hh[,1:3]),]
    n   k z  l
1 bar 200 w k2
3 bar 500 v k2
5 foo 100 z k1
6 qux 300 x k3
8 qux 400 y k3
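Note that the merge result above is sorted by n, and that dropping duplicates afterwards depends on the order in which merge emits the matching rows. A possible variation, offered only as a sketch, is to reduce df2 to its first row per n2 before merging and then restore the original order of df1. The helper names lookup, row and res are assumptions made for this example.

```r
df1 <- data.frame(n = c("foo", "bar", "qux", "qux", "bar"),
                  k = c(100, 200, 300, 400, 500),
                  z = c("z", "w", "x", "y", "v"))
df2 <- data.frame(n2 = c("foo", "bar", "qux", "qux", "bar"),
                  l  = c("k1", "k2", "k3", "k4", "k5"))

# Keep only the first row of df2 for each n2 value
lookup <- df2[!duplicated(df2$n2), ]

# Remember the original row order, because merge sorts by the key
df1$row <- seq_len(nrow(df1))

res <- merge(df1, lookup, by.x = "n", by.y = "n2")
res <- res[order(res$row), ]           # back to the order of df1
res$row <- NULL                        # drop the helper column
names(res)[names(res) == "l"] <- "call"
rownames(res) <- NULL
res
```

Because lookup has one row per key, the merge cannot multiply rows, so no duplicated() call is needed afterwards.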
answered 2013-06-25T10:40:18.443