r - Match on more than 2 conditions and return the corresponding value in R

Question

Hello, I have two data sets. The first is a set of indices:

ind1<-rep(c("E","W"), times=20)
ind2<-sample(100:150, 40)
y<-c(1:40)
index<-data.frame(cbind(ind1, ind2, y))

The second data set is the one that needs to be indexed.

x1<-sample(c("E","W","N"), 40, replace=TRUE)
x2<-sample(100:150, 40)
x3<-rep(0, times=40)
data<-data.frame(cbind(x1,x2,x3))

I would like to fill x3 by matching x1 and x2 in data against ind1 and ind2 in index, respectively, and returning the corresponding y.

index1<-split(index, index$ind1)
data1<-split(data, data$x1)
data1$E$x3<-match(data1$E$x2, index1$E$ind2)
data1$W$x3<-match(data1$W$x2, index1$W$ind2)

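It matches more or less the way I want, but it does not return y correctly. Which part did I get wrong? Thanks.

Also, is there a faster or smarter way to do this? I may have more conditions to match on. Initially I tried if else statements, but that didn't work.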

score 7 · Accepted Answer

merge(data, index, by.x=c("x1", "x2"), by.y=c("ind1", "ind2"), all.x=TRUE, all.y=FALSE)

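This gives you the x and y values for each matching combination of ind1 and ind2 with x1 and x2. All combinations of x1 and x2 are kept (even if the combination of ind1 and ind2 does not occur in index), but combinations of ind1 and ind2 that do not occur in data are dropped. As written, the solution keeps both the x3 and y values; if you would rather drop the placeholder x3 column, you can use merge(data[ ,-3], ... as suggested by @Ferdinand.kraft.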
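To make the effect of all.x=TRUE concrete, here is a minimal sketch on a small made-up data set (the toy values below are assumptions for illustration, not the random data from the question). The data frames are built with data.frame() directly so the numeric columns stay numeric.

```r
# Toy data, made up for illustration only.
index <- data.frame(ind1 = c("E", "W", "E"),
                    ind2 = c(101, 102, 103),
                    y    = c(1, 2, 3))
data  <- data.frame(x1 = c("E", "N", "W", "E"),
                    x2 = c(103, 101, 102, 150),
                    x3 = 0)

# Left join: keep every row of data and attach y wherever
# (x1, x2) equals (ind1, ind2); unmatched rows get y = NA.
merged <- merge(data, index,
                by.x = c("x1", "x2"), by.y = c("ind1", "ind2"),
                all.x = TRUE, all.y = FALSE)
print(merged)
```

Note that merge() does not keep the original row order of data, so if the order matters it may be easier to add a row id column before merging and sort on it afterwards.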

score 4 · Accepted Answer

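There are many ways to solve this, and the best one really depends on the characteristics of the data. Here is the most direct matching approach:

paste: The paste function lets you build a string from several pieces of data. If you are matching between data sets on columns whose values correspond exactly, you can simply paste the columns together and compare them directly with match, as follows: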

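new_data <- data

# Position in index of the row matching each (x1, x2) pair; NA if none
pos <- match(paste(data$x1, data$x2), paste(index$ind1, index$ind2))

# Look up y at the matched position; write 0 where there was no match
new_data$x3 <- ifelse(is.na(pos), 0, index$y[pos])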

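The match call here looks for exact matches between the x1+x2 pairs and the ind1+ind2 pairs and returns an integer giving the position of the index row that corresponds to each data row. If no match is found, it returns NA. By checking for NA in the ifelse call, we write zero for NA values and return the corresponding y value for any match.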
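A small worked example may help show what match returns. The toy data below is made up for illustration. The key point is that match gives the row position in index, not y itself, which is why the code in the question ended up with positions in x3; y has to be looked up at that position.

```r
# Toy data, made up for illustration only.
index <- data.frame(ind1 = c("E", "W", "E"),
                    ind2 = c(101, 102, 103),
                    y    = c(10, 20, 30))
data  <- data.frame(x1 = c("E", "N", "W"),
                    x2 = c(103, 101, 102))

# Build one string key per row from the columns to match on.
key_data  <- paste(data$x1, data$x2)        # "E 103" "N 101" "W 102"
key_index <- paste(index$ind1, index$ind2)  # "E 101" "W 102" "E 103"

# Row position in index for each row of data (NA when there is no match).
pos <- match(key_data, key_index)           # 3 NA 2

# Use the position to fetch y; fall back to 0 for unmatched rows.
data$x3 <- ifelse(is.na(pos), 0, index$y[pos])  # 30 0 20
print(data)
```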

score 4 · Accepted Answer

You can also use left_join() from the dplyr package:

require(dplyr)
left_join(data, index, by = c("x1" = "ind1", "x2" = "ind2"))


score 1 · Accepted Answer

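This question is related to "Matching two data.frames based on multiple columns".

You can use interaction or paste, as suggested by Dinre, to match on multiple columns.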

#Write the row number of index in x3 which matches
data$x3 <- match(interaction(data[c("x1", "x2")]), interaction(index[c("ind1","ind2")]))

#In case you want to return 0 instead of NA for nomatch
data$x3 <- match(interaction(data[c("x1", "x2")]), interaction(index[c("ind1","ind2")]), nomatch=0)

#Instead of >interaction< you could also use paste as already suggested by Dinre
data$x3 <- match(paste(data$x1, data$x2), paste(index$ind1, index$ind2))
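Since the question mentions that there may be more conditions to match on, the paste approach can be generalized to any number of key columns. The sketch below is one possible way to do that; match_rows is a hypothetical helper written for this example (not a base R function), and the toy data and the choice of "\r" as separator are assumptions. A separator that cannot occur in the data matters because the default space can make different rows collide: paste("a b", "c") and paste("a", "b c") both give "a b c".

```r
# Hypothetical helper (not part of base R): for each row of data, return the
# position of the first row of index whose key columns hold the same values.
match_rows <- function(data, index, data_cols, index_cols, sep = "\r") {
  stopifnot(length(data_cols) == length(index_cols))
  # Paste the chosen columns row-wise into a single string key.
  key <- function(df, cols) {
    do.call(paste, c(unname(as.list(df[cols])), sep = sep))
  }
  match(key(data, data_cols), key(index, index_cols))
}

# Toy data with three matching conditions, made up for illustration only.
index <- data.frame(ind1 = c("E", "W", "E"),
                    ind2 = c(101, 102, 101),
                    ind3 = c("a", "a", "b"),
                    y    = c(1, 2, 3))
data  <- data.frame(x1 = c("E", "E", "N"),
                    x2 = c(101, 101, 101),
                    x3 = c("b", "a", "a"))

pos <- match_rows(data, index,
                  data_cols  = c("x1", "x2", "x3"),
                  index_cols = c("ind1", "ind2", "ind3"))
data$y <- index$y[pos]   # NA where no row of index matches
print(data)
```

The helper only returns positions, so the caller decides what to pull from index and how to treat unmatched rows (NA here, or 0 via ifelse).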
