
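I want to match the items in two character vectors, A and B, to find out two things: 1) whether each item in vector A occurs in vector B (yes/no), and 2) which items in vector B do not occur in vector A.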

The two vectors look like this:

A <- c("i", "u", "I", "U", "E", "V", "@", "{", "$", "#", "Q", "1", "2", "3", "4", "5", "6", "7", "8", "9")
B <- c("1", "1", "1", "1", "#", "$", "$", "1", "2", "2", "1", "d", "d", "i", "i", "i", "i", "1", "3", "2", "2", "F", "2", "2", "2", "5", "5", "5", "@", "5", "6", "5", "z", "z", "S", "S")

I can partially answer my first question with this function:

test_match <- function(item_vector_A, item_vector_B){
  ifelse(item_vector_A == item_vector_B, print(1), print(0))
}

lapply(A, B, FUN = test_match) -> results

However, when I run it, I get a list containing every individual comparison the function makes:

lapply(A, B, FUN = test_match) -> results
results
[[1]]
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[[2]]
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[[3]]
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#etc.

How can I get a simple vector indicating whether each item in A occurs in B (1) or not (0), like this:

1 0 0 0 0 0 1 0 1 1 0 1 1 1 0 1 1 0 0 0 

I run into the same problem when I try to answer the second question:

test_non_match <- function(item_vector_A, item_vector_B){
  ifelse(item_vector_B == item_vector_A, print("*match*"), print(item_vector_B))
}
lapply(A, B, FUN = test_non_match) -> results2
results2
[[1]]
[1] "1" "1" "1" "1" "#"  "$" "$" "1" "2" "2" "1" "d" "d" "*match*" "*match*" "*match*" "*match*" "1" "3" "2" "2" "F" "2" "2" "2" "5" "5" "5" "@" "5" "6" "5" "z" "z" "S" "S"      
[[2]]
[1] "1" "1" "1" "1" "#" "$" "$" "1" "2" "2" "1" "d" "d" "i" "i" "i" "i" "1" "3" "2" "2" "F" "2" "2" "2" "5" "5" "5" "@" "5" "6" "5" "z" "z" "S" "S"
[[3]]
[1] "1" "1" "1" "1" "#" "$" "$" "1" "2" "2" "1" "d" "d" "i" "i" "i" "i" "1" "3" "2" "2" "F" "2" "2" "2" "5" "5" "5" "@" "5" "6" "5" "z" "z" "S" "S"

It lists the entire vector once for every item in A, whereas I want something like this:

[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] d
[1] d
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] F
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] *match*
[1] z
[1] z
[1] S
[1] S

Do I need to use a different kind of apply() function?


2 Answers


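In addition to the alternatives already suggested, you may also want to look at %chin%, a faster version of %in% for character vectors from the data.table package: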

library(data.table)  # provides %chin%
ifelse(B %chin% A, "*match*", B)
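
To check that %chin% is a drop-in replacement here, the result can be compared against base %in%. This is only a sketch: the requireNamespace() guard and the fallback to %in% when data.table is not installed are my own additions for illustration, and flagged is a made-up name.

```r
A <- c("i", "u", "I", "U", "E", "V", "@", "{", "$", "#", "Q",
       "1", "2", "3", "4", "5", "6", "7", "8", "9")
B <- c("1", "1", "1", "1", "#", "$", "$", "1", "2", "2", "1", "d", "d",
       "i", "i", "i", "i", "1", "3", "2", "2", "F", "2", "2", "2", "5",
       "5", "5", "@", "5", "6", "5", "z", "z", "S", "S")

# Use data.table's %chin% when the package is available,
# otherwise fall back to base R's %in% (assumption for illustration).
if (requireNamespace("data.table", quietly = TRUE)) {
  `%chin%` <- data.table::`%chin%`
  flagged <- ifelse(B %chin% A, "*match*", B)
} else {
  flagged <- ifelse(B %in% A, "*match*", B)
}

# Both operators should agree on character vectors.
identical(flagged, ifelse(B %in% A, "*match*", B))
```

For vectors as short as these the speed difference is unlikely to matter; %chin% is mainly of interest for large character vectors.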
Answered 2013-06-13T15:27:39.517

You can simply use %in% and test A %in% B and !(B %in% A). To reproduce the output in your question:

as.numeric(A %in% B)
 [1] 1 0 0 0 0 0 1 0 1 1 0 1 1 1 0 1 1 0 0 0
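
The second test, !(B %in% A), can be used in the same way to pull out the items of B that are missing from A. A minimal sketch, where not_in_A is just an illustrative name:

```r
A <- c("i", "u", "I", "U", "E", "V", "@", "{", "$", "#", "Q",
       "1", "2", "3", "4", "5", "6", "7", "8", "9")
B <- c("1", "1", "1", "1", "#", "$", "$", "1", "2", "2", "1", "d", "d",
       "i", "i", "i", "i", "1", "3", "2", "2", "F", "2", "2", "2", "5",
       "5", "5", "@", "5", "6", "5", "z", "z", "S", "S")

# Logical indexing keeps duplicates and the original order of B.
not_in_A <- B[!(B %in% A)]
not_in_A          # "d" "d" "F" "z" "z" "S" "S"

# If each missing item is only needed once, drop the duplicates.
unique(not_in_A)  # "d" "F" "z" "S"
setdiff(B, A)     # same distinct items, via base R's set difference
```

Which form is preferable depends on whether the repeated occurrences in B matter for the analysis.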

As Ferdinand.kraft suggested:

ifelse(B %in% A, "*match*", B)
 [1] "*match*" "*match*" "*match*" "*match*" "*match*" "*match*" "*match*" "*match*" "*match*" "*match*" "*match*" "d"       "d"       "*match*" "*match*" "*match*" "*match*" "*match*" "*match*"
[20] "*match*" "*match*" "F"       "*match*" "*match*" "*match*" "*match*" "*match*"  "*match*" "*match*" "*match*" "*match*" "*match*" "z"       "z"       "S"       "S"      
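
As for why the lapply() attempt in the question returned a full vector for every element of A: as far as I can tell, lapply() calls the function once per element of A and passes all of B along each time, and == is vectorised, so each call yields length(B) comparisons. The sketch below shows that behaviour and how collapsing each comparison with any() gives the same answer as %in%. The names per_item and hits are illustrative only.

```r
A <- c("i", "u", "I", "U", "E", "V", "@", "{", "$", "#", "Q",
       "1", "2", "3", "4", "5", "6", "7", "8", "9")
B <- c("1", "1", "1", "1", "#", "$", "$", "1", "2", "2", "1", "d", "d",
       "i", "i", "i", "i", "1", "3", "2", "2", "F", "2", "2", "2", "5",
       "5", "5", "@", "5", "6", "5", "z", "z", "S", "S")

# One call per element of A; each call compares that single item
# against every element of B, so each result has length(B) entries.
per_item <- lapply(A, function(a) a == B)
lengths(per_item)

# Collapse each comparison to a single TRUE/FALSE with any().
# vapply() returns a plain logical vector instead of a list.
hits <- vapply(A, function(a) any(a == B), logical(1), USE.NAMES = FALSE)

# This matches the vectorised one-liner.
identical(hits, A %in% B)
as.numeric(hits)
```

So a different apply() variant can be made to work, but %in% already does the whole job in one vectorised step.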
Answered 2013-06-13T11:34:30.330