r - Check whether particular subjects are present in another column and whether any are duplicated in R

Question

Here is the data. Example 1: complete

complete <- c("A", "B", "C","J", "C1", "L", "J2", "D", "M", "N")
lst1 <- c(NA, NA, NA, "A", "N", NA,"A", "C", "D", NA )
lst2 <- c(NA, NA, NA,"A", "L", NA, "C1", "J2", "J2", "B")
datf <- data.frame (complete, lst1, lst2, stringsAsFactors = FALSE)

Example 2: incomplete and duplicated

complete <- c("A", "B", "C","J", "C1", "L", "C", "D", "M", "N")
lst1 <- c(NA, NA, NA, "A", "N", NA,"A", "C", "D1", NA )
lst2 <- c(NA, NA, NA,"A", "L", NA, "C1", "J2", "J2", "B2")
datf2 <- data.frame (complete, lst1, lst2, stringsAsFactors = FALSE)

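I want to check: (1) whether every member of lst1 and lst2 appears at least once in complete. If one does not, the function should stop with a message saying that this "?" is present in lst1 or lst2 (whichever applies) but not in complete. My attempt, for Example 1: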

if (datf$lst1 %in%  datf$complete | datf$lst2 %in%  datf$complete) {
     stop ("the subject in lst1 or lst2 must be complete list ")} else {
     cat("I am fine")
     }

I am fineWarning message:
In if (datf$lst1 %in% datf$complete | datf$lst2 %in% datf$complete) { :
  the condition has length > 1 and only the first element will be used

Why does this message appear, and how can I suppress it?

Example 2:

if (datf2$lst1 %in%  datf2$complete | datf2$lst2 %in%  datf2$complete) {
     stop ("the subject in lst1 or lst2 must be complete list ")} else {
     cat("I am fine")
     }

Although this data contains errors, the message is the same:

I am fineWarning message:
In if (datf2$lst1 %in% datf2$complete | datf2$lst2 %in% datf2$complete) { :
  the condition has length > 1 and only the first element will be used

Is there also a way to include the unmatched names in the error message?

(2) whether any member of complete is duplicated.

Edit:

Expected answer:
Example 1: all members of lst1 and lst2 are also members of complete.

The expected message here is "I am fine".

Example 2:
B2 and J2 are members of lst2 but not of complete; D1 is a member of lst1 but not of complete.
complete has two C's, so C is duplicated.
The function should stop and print a message:

"B2 and J2 are members of lst2, but not in complete
 D1 is a member of lst1, but not in complete,
 check completeness"
"C is duplicated in complete"

score 1 · Accepted Answer

> datf$lst1 %in% datf$complete | datf$lst2 %in% datf$complete
 [1] FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE

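From ?'if': the argument of if is a length-one logical vector that is not NA. The condition above has length 10, so if uses only its first element (FALSE here), which is why "I am fine" is printed along with the warning.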
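Rather than suppressing the warning, the usual fix is to reduce the elementwise result to a single TRUE or FALSE with all() or any() before handing it to if. A minimal sketch using the Example 1 vectors; the names in_complete and ok are illustrative only. Note that in R 4.2.0 and later a condition of length greater than one is an error rather than a warning.

```r
complete <- c("A", "B", "C", "J", "C1", "L", "J2", "D", "M", "N")
lst1 <- c(NA, NA, NA, "A", "N", NA, "A", "C", "D", NA)

# Elementwise membership test: one TRUE/FALSE per row, so length 10.
in_complete <- lst1 %in% complete

# Drop the NA rows, then collapse to a single logical value with all().
ok <- all(in_complete[!is.na(lst1)])

if (ok) {
  cat("I am fine\n")
} else {
  stop("some subjects in lst1 are not in complete")
}
```

The NA rows are dropped first because NA %in% complete is FALSE, which would otherwise make all() fail even when every real subject is present.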

> na.omit(datf2$lst1)[!na.omit(datf2$lst1)%in%datf2$complete]
[1] "D1"
> na.omit(datf2$lst2)[!na.omit(datf2$lst2)%in%datf2$complete]
[1] "J2" "J2" "B2"

> datf2$complete[duplicated(datf2$complete)]
[1] "C"

The above should help you write a function that does what you describe.
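One way to combine these pieces is sketched below. The function name check_subjects and its arguments ref and cols are hypothetical, not part of the question or of any package, and the message wording is only an approximation of the one requested.

```r
# Example data from the question.
datf <- data.frame(
  complete = c("A", "B", "C", "J", "C1", "L", "J2", "D", "M", "N"),
  lst1 = c(NA, NA, NA, "A", "N", NA, "A", "C", "D", NA),
  lst2 = c(NA, NA, NA, "A", "L", NA, "C1", "J2", "J2", "B"),
  stringsAsFactors = FALSE
)
datf2 <- data.frame(
  complete = c("A", "B", "C", "J", "C1", "L", "C", "D", "M", "N"),
  lst1 = c(NA, NA, NA, "A", "N", NA, "A", "C", "D1", NA),
  lst2 = c(NA, NA, NA, "A", "L", NA, "C1", "J2", "J2", "B2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stops with one message listing every unmatched
# and duplicated name, or prints "I am fine" if there are none.
check_subjects <- function(df, ref = "complete", cols = c("lst1", "lst2")) {
  problems <- character(0)

  for (col in cols) {
    x <- df[[col]]
    values <- unique(x[!is.na(x)])                  # ignore NA entries
    unmatched <- values[!values %in% df[[ref]]]     # names absent from ref
    if (length(unmatched) > 0) {
      problems <- c(problems, sprintf("%s in %s but not in %s",
                                      paste(unmatched, collapse = ", "),
                                      col, ref))
    }
  }

  dups <- unique(df[[ref]][duplicated(df[[ref]])])  # repeated names in ref
  if (length(dups) > 0) {
    problems <- c(problems, sprintf("%s duplicated in %s",
                                    paste(dups, collapse = ", "), ref))
  }

  if (length(problems) > 0) {
    stop(paste(problems, collapse = "\n"), call. = FALSE)
  }
  cat("I am fine\n")
  invisible(TRUE)
}

check_subjects(datf)

# Capture the error message instead of halting, to show what it contains.
msg <- tryCatch(check_subjects(datf2), error = function(e) conditionMessage(e))
cat(msg, "\n")
```

Collecting all problems before calling stop() reports the missing names and the duplicates together, instead of halting at the first failed check.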
