I have a data table that looks like this:

ID    col1    col1.1    col2    col2.1    col3    col3.1
rat    AG       AB       AG       AC       AA       AA
cat    BB       GG       BB       CC       AB       AG
dog    --       AB       AG       GG       CC       GG

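I want to compare the columns pairwise (i.e. col1 vs col1.1, col2 vs col2.1, ...) and, depending on which condition a pair satisfies, add the result to a new column.

Let's say the conditions are as follows: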

if any base of one column matched with -- of another column, assign 0
if AG or AC of one column matched with AB of another column, assign 1
if AA of one column matched with AA of another column, assign 2
if BB of one column matched with GG or CC of another column, assign 3
if one does not match any of the condition above, assign 4

So the output would look like this:

ID    col1    col1.1  OUT1  col2    col2.1  OUT2  col3  col3.1   OUT3  
rat    AG       AB     1     AG       AC     1     AA     AA      2
cat    BB       GG     3     BB       CC     3     AB     AG      1
dog    --       AB     0     AG       GG     4     BB     GG      3

How can I compare two strings and add the result as a new column?

Thanks!

1 Answer

This should get you what you want, with a bit of rearranging:

fnpair <- function(a) {
  # a: character vector of length 2 holding one pair of values
  if (a[1] == "--" || a[2] == "--") return(0)
  if ((a[1] %in% c("AG", "AC") && a[2] == "AB") ||
      (a[2] %in% c("AG", "AC") && a[1] == "AB")) return(1)
  if (a[1] == "AA" && a[2] == "AA") return(2)
  if ((a[1] %in% c("GG", "CC") && a[2] == "BB") ||
      (a[2] %in% c("GG", "CC") && a[1] == "BB")) return(3)
  4  # no rule matched
}

 df1 <- read.table(text="ID    col1    col1.1    col2    col2.1    col3    col3.1
 rat    AG       AB       AG       AC       AA       AA
 cat    BB       GG       BB       CC       AB       AG
 dog    --       AB       AG       GG       CC       GG", header=TRUE)

 t( apply(df1[,2:7], 1, function(x) t( sapply(0:2, function(z) fnpair(x[2*z+c(1,2)]) ) ) ) )

#------------------

     [,1] [,2] [,3]
[1,]    1    4    2
[2,]    3    3    1
[3,]    0    4    4

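To assign the result to a new column, run this immediately afterwards (in an interactive session, .Last.value holds the most recent result):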

df1$newcol <- .Last.value
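
Because `.Last.value` depends on what was evaluated last at the console, a script may be better served by storing the scores explicitly. The following is only a sketch under a few assumptions: it repeats the answer's `fnpair` and `df1` so that it runs on its own, it swaps the `apply`/`sapply` combination for a `mapply` loop over column pairs, and the column names `OUT1` to `OUT3` are borrowed from the question's desired output. The helper names `n_pairs` and `ord`, and the `stringsAsFactors = FALSE` argument, are additions for illustration.

```r
# Same rules as the answer's fnpair, checked in order.
fnpair <- function(a) {
  if (a[1] == "--" || a[2] == "--") return(0)
  if ((a[1] %in% c("AG", "AC") && a[2] == "AB") ||
      (a[2] %in% c("AG", "AC") && a[1] == "AB")) return(1)
  if (a[1] == "AA" && a[2] == "AA") return(2)
  if ((a[1] %in% c("GG", "CC") && a[2] == "BB") ||
      (a[2] %in% c("GG", "CC") && a[1] == "BB")) return(3)
  4
}

df1 <- read.table(text = "ID col1 col1.1 col2 col2.1 col3 col3.1
rat AG AB AG AC AA AA
cat BB GG BB CC AB AG
dog -- AB AG GG CC GG", header = TRUE, stringsAsFactors = FALSE)

# Assumes the layout ID, then (colN, colN.1) pairs side by side.
n_pairs <- (ncol(df1) - 1) / 2

# Score each pair row by row and append the result as OUT1, OUT2, ...
for (k in seq_len(n_pairs)) {
  left  <- df1[[2 * k]]      # col1, col2, col3
  right <- df1[[2 * k + 1]]  # col1.1, col2.1, col3.1
  df1[[paste0("OUT", k)]] <- mapply(function(x, y) fnpair(c(x, y)),
                                    left, right, USE.NAMES = FALSE)
}

# Reorder so that each OUT column sits right after its pair.
ord <- c(1, unlist(lapply(seq_len(n_pairs), function(k) {
  c(2 * k, 2 * k + 1, 2 * n_pairs + 1 + k)
})))
df1 <- df1[, ord]
print(df1)
```

With the sample data, `OUT1`, `OUT2` and `OUT3` should hold the same values as the three columns of the matrix printed above, and the columns come out in the order the question asks for.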

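I don't think the hand calculation you showed follows your rules. For example, rat's col2 pair is AG vs AC: neither value is AB, so no rule matches and the score is 4, not 1. Also, dog's col3 value is CC in your input but BB in your expected output.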

Answered 2013-07-12T06:25:12.820