
I have a table of conditions and the formula to use for each condition:

MTR    CLD    AF    PI
AA     AA     AB    0.5/a      (cond1)
AA     AB     AB    0.5/a      (cond2)
AB     AA     AB    0.5/a      (cond3)
AA     AA     AA    1/a        (cond4)
AB     AA     AA    1/a        (cond5)
BB     AB     AA    1/a        (cond6)
AB     AB     AA    1/(a+b)    (cond7)
AB     AB     AB    1/(a+b)    (cond8)

If no condition matches, the result should be "NA".

 # table of conditions 
MTR <- c("AA",  "AA",   "AB",   "AA",   "AB",   "BB",   "AB",   "AB")
CLD <- c("AA",  "AB",   "AA",   "AA",   "AA",   "AB",   "AB",   "AB")
AF <- c("AB",   "AB",   "AB",   "AA",   "AA",   "AA",   "AA",   "AB")
PI <- c("0.5/a",    "0.5/a",    "0.5/a",    "1/a",  "1/a",  "1/a",  
      "1/(a+b)",    "1/(a+b)")

Here are the two datasets to apply it to:

# the dataset to be applied to 
dataf <- data.frame (MTR = c("AB", "BB", "AB", "BB", "AB", "AA"),
                     CLD= c("AA", "AB", "AA", "AB", "AB", "AB"),
                     AF = c("AA", "AB", "BB", "AB", "BB", "AB")
                     )


     MTR CLD AF
1  AB  AA AA 
2  BB  AB AB 
3  AB  AA BB
4  BB  AB AB
5  AB  AB BB
6  AA  AB AB

a = c(0.5, 0.4, 0.3, 0.5, 0.2, 0.4)
mapd <- data.frame(a = a, b = 1-a)

Edit: as suggested, I can combine the two data frames into one: newdf <- data.frame (dataf, mapd)

 MTR CLD AF   a   b
1  AB  AA AA 0.5 0.5
2  BB  AB AB 0.4 0.6
3  AB  AA BB 0.3 0.7
4  BB  AB AB 0.5 0.5
5  AB  AB BB 0.2 0.8
6  AA  AB AB 0.4 0.6

I think I could solve this with if/else, but there are many conditions and I'm not sure that is the only (or a good) way.

 # vectorized over rows; rows matching no condition fall through to NA
 PI <- ifelse(dataf$MTR == "AA" & dataf$CLD == "AA" & dataf$AF == "AB", 0.5 / mapd$a,
       ifelse(dataf$MTR == "AA" & dataf$CLD == "AB" & dataf$AF == "AB", 0.5 / mapd$a,
              # ............. and so on for the remaining conditions
              NA))

Is there any alternative?


2 Answers


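That looks overly complicated!

My suggestion is to turn the conditions table into a data frame with three columns. The first is the MTR, CLD and AF columns pasted together (so a typical entry would be "AB~AA~AB"). The other two, called COEFA and COEFAB, hold the coefficient multiplying 1/a and the coefficient multiplying 1/(a+b) in the PI expression. For example, "0.5/a" would have COEFA = 0.5 and COEFAB = 0, while "1/(a+b)" would have COEFA = 0 and COEFAB = 1.

To be clear, condition 1 would be represented as MTR_CLD_AF = "AA~AA~AB", COEFA = 0.5, COEFAB = 0.

Then, to determine which condition applies to each row of dataf, you just paste MTR, CLD and AF together and match the result against the MTR_CLD_AF column of the conditions data frame, which gives you COEFA and COEFAB for that row. The value you want for PI is then COEFA*(1/a) + COEFAB*(1/(a+b)).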
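
As a small illustration of the matching step just described, here is a minimal sketch on a made-up two-row subset of the conditions. The names cond_keys, coefa and row_keys are chosen here for illustration only. It relies on match() returning NA for a key it cannot find, which is what gives NA for rows with no applicable condition.

```r
# Pasted keys for two of the conditions (cond1 and cond5) and their COEFA values
cond_keys <- c("AA~AA~AB", "AB~AA~AA")
coefa     <- c(0.5, 1)

# Pasted keys for two data rows; the second one matches no condition
row_keys <- c("AB~AA~AA", "BB~AB~AB")

# Position of each row key within cond_keys, NA where the key is absent
idx <- match(row_keys, cond_keys)

# Coefficient per data row; indexing with NA yields NA
coefa[idx]
```

Because the NA propagates through the arithmetic, the final PI for an unmatched row is NA as well.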

Let me know if further explanation or code would help :)

Follow-up:

Here's a stab at the code I would use here:

### first, all your object definitions...

MTR <- c("AA",  "AA",   "AB",   "AA",   "AB",   "BB",   "AB",   "AB")
CLD <- c("AA",  "AB",   "AA",   "AA",   "AA",   "AB",   "AB",   "AB")
AF <- c("AB",   "AB",   "AB",   "AA",   "AA",   "AA",   "AA",   "AB")
PI <- c("0.5/a",    "0.5/a",    "0.5/a",    "1/a",  "1/a",  "1/a",  
      "1/(a+b)",    "1/(a+b)")

dataf <- data.frame (MTR = c("AB", "BB", "AB", "BB", "AB", "AA"),
                     CLD= c("AA", "AB", "AA", "AB", "AB", "AB"),
                     AF = c("AA", "AB", "BB", "AB", "BB", "AB")
                     )
a = c(0.5, 0.4, 0.3, 0.5, 0.2, 0.4)
mapd <- data.frame(a = a, b = 1-a)

### first create COEFA and COEFAB from PI (could automate but small 
### enough to do manually here)

COEFA <- c(0.5, 0.5, 0.5, 1, 1, 1, 0, 0)
COEFAB <- c(0, 0, 0, 0, 0, 0, 1, 1)

### then create conditions data frame as specified in my solution

cond = data.frame(MTR_CLD_AF = paste(MTR,CLD,AF,sep="~"), COEFA, COEFAB,
                  stringsAsFactors=FALSE)

### now put all the data in dataf and mapd into one object alldata to 
### keep things neat

alldata = data.frame(MTR = dataf$MTR, CLD = dataf$CLD, AF = dataf$AF, 
                     a = mapd$a, b=mapd$b, stringsAsFactors=FALSE)

### now go ahead and get COEFA and COEFAB for each row in alldata - first 
### do the match up (look in matcond to see this) then copy coef columns 
### over to alldata

matcond=cond[match(with(alldata, paste(MTR, CLD, AF, sep="~")),
                   cond$MTR_CLD_AF),]
alldata$COEFA = matcond$COEFA
alldata$COEFAB = matcond$COEFAB

### finally compute COEFA*(1/a) + COEFAB*(1/(a+b)) using the columns of 
### alldata, and store the answer in column called PI

alldata$PI = with(alldata, COEFA*(1/a) + COEFAB*(1/(a+b)))

### that's it! as noted elsewhere, the value will be NA if no matching
### condition exists
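
The same lookup can also be written without pasting the keys together, by joining on the three columns with merge(). This is a sketch of an alternative, not the approach used above. It assumes the same data as in the question, and row_id is a hypothetical helper column added here because merge() does not preserve the original row order.

```r
# Conditions table with the three key columns kept separate
cond <- data.frame(
  MTR    = c("AA", "AA", "AB", "AA", "AB", "BB", "AB", "AB"),
  CLD    = c("AA", "AB", "AA", "AA", "AA", "AB", "AB", "AB"),
  AF     = c("AB", "AB", "AB", "AA", "AA", "AA", "AA", "AB"),
  COEFA  = c(0.5, 0.5, 0.5, 1, 1, 1, 0, 0),
  COEFAB = c(0, 0, 0, 0, 0, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Data to apply the conditions to, as in the question
a <- c(0.5, 0.4, 0.3, 0.5, 0.2, 0.4)
alldata <- data.frame(
  MTR = c("AB", "BB", "AB", "BB", "AB", "AA"),
  CLD = c("AA", "AB", "AA", "AB", "AB", "AB"),
  AF  = c("AA", "AB", "BB", "AB", "BB", "AB"),
  a = a, b = 1 - a,
  stringsAsFactors = FALSE
)
alldata$row_id <- seq_len(nrow(alldata))  # hypothetical helper to restore order

# Left join: all.x = TRUE keeps unmatched rows, with NA coefficients
merged <- merge(alldata, cond, by = c("MTR", "CLD", "AF"), all.x = TRUE)
merged <- merged[order(merged$row_id), ]

# Same formula as above; NA coefficients give NA
merged$PI <- with(merged, COEFA * (1 / a) + COEFAB * (1 / (a + b)))
merged$PI
```

With this data only rows 1 and 6 match a condition (cond5 and cond2), so PI is 2 and 1.25 there and NA elsewhere.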
answered 2012-06-06T19:33:54.070

#data.frame with conditions and functions      
confun<-data.frame(c1=c("AA","AA","AB"),
                       c2=c("AA","AB","AA"),
                       c3=c("AB","AB","AB"),
                       fun=c("0.5/a","0.5/b","1/(a+b)"))
confun$fun<-as.character(confun$fun)

confun
  c1 c2 c3     fun
1 AA AA AB   0.5/a
2 AA AB AB   0.5/b
3 AB AA AB 1/(a+b)

#data        
test<-data.frame(c1=c("AA","AB"),c2=c("AB","AA"),c3=c("AB","AB"),a=c(2,3))
test$b<-1-test$a

test
  c1 c2 c3 a  b
1 AA AB AB 2 -1
2 AB AA AB 3 -2

fun<-function(c1,c2,c3) as.numeric(rownames(confun[paste(confun$c1,confun$c2,confun$c3)==paste(c1,c2,c3),]))    
test$i<-mapply(fun,test$c1,test$c2,test$c3)

fun2<-function(a,b,i) eval(parse(text=confun$fun[i]))

res<-mapply(fun2,test$a,test$b,test$i)

res
[1] -0.5  1.0

Quick and dirty.
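
The question asks for NA when no condition matches, which the quick version above does not cover, since the lookup returns an empty result for an unmatched row. Below is a minimal sketch of one way to handle that, assuming the same confun layout. The third data row and the helper name eval_formula are hypothetical additions for illustration, and match() is swapped in for the rownames lookup.

```r
# Conditions and formulas stored as strings, same layout as confun above
confun <- data.frame(
  c1  = c("AA", "AA", "AB"),
  c2  = c("AA", "AB", "AA"),
  c3  = c("AB", "AB", "AB"),
  fun = c("0.5/a", "0.5/b", "1/(a+b)"),
  stringsAsFactors = FALSE
)

# Data; the third row (BB, AB, AB) is made up and matches no condition
test <- data.frame(
  c1 = c("AA", "AB", "BB"),
  c2 = c("AB", "AA", "AB"),
  c3 = c("AB", "AB", "AB"),
  a  = c(2, 3, 4),
  stringsAsFactors = FALSE
)
test$b <- 1 - test$a

# Row index of the matching condition, NA when there is none
test$i <- match(paste(test$c1, test$c2, test$c3),
                paste(confun$c1, confun$c2, confun$c3))

# Evaluate the formula string for one row, with a and b supplied as a list
eval_formula <- function(a, b, i) {
  if (is.na(i)) return(NA_real_)
  eval(parse(text = confun$fun[i]), list(a = a, b = b))
}

res <- mapply(eval_formula, test$a, test$b, test$i)
res
```

The first two values agree with the result above (-0.5 and 1), and the unmatched third row gives NA.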

answered 2012-06-06T19:15:15.523