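I'm trying to solve a tricky R problem that I haven't been able to crack by Googling keywords. Specifically, I'm trying to take the subset of one data frame whose rows do not appear in another data frame. Here is an example: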
> test
      number    fruit     ID1  ID2
item1 "number1" "apples"  "22" "33"
item2 "number2" "oranges" "13" "33"
item3 "number3" "peaches" "44" "25"
item4 "number4" "apples"  "12" "13"
> test2
      number    fruit     ID1   ID2
item1 "number1" "papayas" "22"  "33"
item2 "number2" "oranges" "13"  "33"
item3 "number3" "peaches" "441" "25"
item4 "number4" "apples"  "123" "13"
item5 "number3" "peaches" "44"  "25"
item6 "number4" "apples"  "12"  "13"
item7 "number1" "apples"  "22"  "33"
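I have two data frames, test and test2, and the goal is to select every entire row of test2 that does not appear in test, even though some of the individual values in those rows may be the same.
The output I want would look like: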
item1 "number1" "papayas" "22"  "33"
item2 "number3" "peaches" "441" "25"
item3 "number4" "apples"  "123" "13"
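There could be an arbitrary number of rows or columns, but in my specific case one data frame is a direct subset of the other.
I've used R's subset(), merge(), and which() functions extensively, but I can't figure out how to combine them, if that is even possible, to get what I want.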
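For illustration, a merge()-based "anti-join" might look roughly like the sketch below. It assumes the two tables are held as ordinary data frames with character columns, with values copied from the printed tables above. The names test_df, test2_df, and the marker column in_test are made up for this example and are not part of the original data.

```r
# Hypothetical data-frame versions of the printed tables (all columns character).
test_df <- data.frame(
  number = c("number1", "number2", "number3", "number4"),
  fruit  = c("apples", "oranges", "peaches", "apples"),
  ID1    = c("22", "13", "44", "12"),
  ID2    = c("33", "33", "25", "13"),
  stringsAsFactors = FALSE
)
test2_df <- data.frame(
  number = c("number1", "number2", "number3", "number4", "number3", "number4", "number1"),
  fruit  = c("papayas", "oranges", "peaches", "apples", "peaches", "apples", "apples"),
  ID1    = c("22", "13", "441", "123", "44", "12", "22"),
  ID2    = c("33", "33", "25", "13", "25", "13", "33"),
  stringsAsFactors = FALSE
)

# Tag every row of test_df so matches can be recognised after the merge.
test_df$in_test <- TRUE

# Left join on all four shared columns: rows of test2_df without a full-row
# match in test_df end up with in_test = NA.
merged <- merge(test2_df, unique(test_df),
                by = c("number", "fruit", "ID1", "ID2"), all.x = TRUE)

# Keep only the unmatched rows and drop the helper column again.
result <- subset(merged, is.na(in_test), select = -in_test)
print(result)
```

Because the join is on every column, a row counts as "already present" only when all of its values match. Note that merge() sorts its output by the join columns by default, so the original row order of test2_df is not preserved.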
EDIT: Here is the R code I used to generate the two tables.
# Build test column-wise, then transpose; t() turns the data frame into a character matrix
test <- data.frame(c("number1", "apples", 22, 33), c("number2", "oranges", 13, 33),
                   c("number3", "peaches", 44, 25), c("number4", "apples", 12, 13))
test <- t(test)
rownames(test) <- c("item1", "item2", "item3", "item4")
colnames(test) <- c("number", "fruit", "ID1", "ID2")
# Build test2 the same way
test2 <- data.frame(c("number1", "papayas", 22, 33), c("number2", "oranges", 13, 33),
                    c("number3", "peaches", 441, 25), c("number4", "apples", 123, 13),
                    c("number3", "peaches", 44, 25), c("number4", "apples", 12, 13))
test2 <- t(test2)
rownames(test2) <- c("item1", "item2", "item3", "item4", "item5", "item6")
colnames(test2) <- c("number", "fruit", "ID1", "ID2")
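Since t() leaves test and test2 as character matrices rather than data frames, one possible approach is to compare whole rows as strings. The sketch below rebuilds equivalent matrices so that it runs on its own. The names key, key2, and result, and the tab separator, are assumptions made for illustration, and the separator is assumed never to occur in the data.

```r
# Rebuild matrices equivalent to the ones produced by the code above.
cols <- c("number", "fruit", "ID1", "ID2")
test <- matrix(c("number1", "apples",  "22", "33",
                 "number2", "oranges", "13", "33",
                 "number3", "peaches", "44", "25",
                 "number4", "apples",  "12", "13"),
               ncol = 4, byrow = TRUE,
               dimnames = list(paste0("item", 1:4), cols))
test2 <- matrix(c("number1", "papayas", "22",  "33",
                  "number2", "oranges", "13",  "33",
                  "number3", "peaches", "441", "25",
                  "number4", "apples",  "123", "13",
                  "number3", "peaches", "44",  "25",
                  "number4", "apples",  "12",  "13"),
                ncol = 4, byrow = TRUE,
                dimnames = list(paste0("item", 1:6), cols))

# Collapse each row into a single string key (tab-separated).
key  <- apply(test,  1, paste, collapse = "\t")
key2 <- apply(test2, 1, paste, collapse = "\t")

# Keep the rows of test2 whose key never occurs among the rows of test.
result <- test2[!(key2 %in% key), , drop = FALSE]

# Optionally renumber the row names to match the desired output.
rownames(result) <- paste0("item", seq_len(nrow(result)))
print(result)
```

This works for any number of columns, because the comparison is on the full row. drop = FALSE keeps the result a matrix even when only one row survives.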
Thanks in advance!