Here is my example dataset:
set.seed(123)
myd <- data.frame(sub = paste("S", 1:10, sep = ""),
                  P1 = sample(c(1, -1, 2, 0), 10, replace = TRUE),
                  P2 = sample(c(1, -1, 2, 0), 10, replace = TRUE),
                  I1 = sample(c(1, -1, 2, 0), 10, replace = TRUE),
                  I2 = sample(c(1, -1, 2, 0), 10, replace = TRUE),
                  I3 = sample(c(1, -1, 2, 0), 10, replace = TRUE),
                  I4 = sample(c(1, -1, 2, 0), 10, replace = TRUE),
                  I5 = sample(c(1, -1, 2, 0), 10, replace = TRUE),
                  I6 = sample(c(1, -1, 2, 0), 10, replace = TRUE))
myd
   sub P1 P2 I1 I2 I3 I4 I5 I6
1   S1 -1  0  0  0  1  1  2  0
2   S2  0 -1  2  0 -1 -1  1  2
3   S3 -1  2  2  2 -1  0 -1  2
4   S4  0  2  0  0 -1  1 -1  1
5   S5  0  1  2  1  1  2  0 -1
6   S6  1  0  2 -1  1  1 -1  1
7   S7  2  1  2  0  1  1  0 -1
8   S8  0  1  2  1 -1  0  0  2
9   S9  2 -1 -1 -1 -1  0  0 -1
10 S10 -1  0  1  1  0 -1 -1  1
Translation table for the incorrect values, conditioned on the values of P1 and P2 (-1 is a missing value):
Condition   P1   P2   Incorrect value
I            1    1   None
II           1    0   2
III          0    1   2
IV           2    0   2 or 0
V            0    2   2 or 0
VI           2    2   1 or 0
VII          1    2   0
VIII         2    1   0
# if either P1 or P2 is -1, all values become NA
IX          -1    0   NA
X            0   -1   NA
XI          -1   -1   NA
XII         -1    2   NA
XIII         2   -1   NA
XIV         -1    1   NA
XV           1   -1   NA
The following is short code for the translation table in data.frame format, except for conditions IV, V and VI, where I did not know how to enter the incorrect values because there are two of them:
ttable <- data.frame(P1 = c(1, 1, 0, 2, 0, 2, 1, 2, -1, 0, -1, -1, 2, -1, 1),
                     P2 = c(1, 0, 1, 0, 2, 2, 2, 1, 0, -1, -1, 2, -1, 1, -1),
                     # only the first incorrect value is entered for IV, V and VI
                     errort = c("None", 2, 2, 2, 2, 1, 0, 0, NA, NA, NA, NA, NA, NA, NA))
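One possible way to enter conditions IV, V and VI might be a list column, where each cell holds a vector of zero, one or two incorrect values. This is only a sketch: the names `ttable2` and `incorrect` are hypothetical, and the rows containing -1 are left out on the assumption that a -1 in P1 or P2 can be tested directly.

```r
# Sketch: translation table with a list column. `ttable2` and `incorrect`
# are hypothetical names, not part of the original code.
ttable2 <- data.frame(
  P1 = c(1, 1, 0, 2, 0, 2, 1, 2),
  P2 = c(1, 0, 1, 0, 2, 2, 2, 1)
)
ttable2$incorrect <- list(
  numeric(0),  # I:    no incorrect value
  2,           # II
  2,           # III
  c(2, 0),     # IV
  c(2, 0),     # V
  c(1, 0),     # VI
  0,           # VII
  0            # VIII
)

# Look up the incorrect values for one (P1, P2) pair, here condition V
ttable2$incorrect[[which(ttable2$P1 == 0 & ttable2$P2 == 2)]]
```

The list column keeps one row per condition, so the table still reads like the translation table above.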
For each row (S1 to S10), I would like to look up the values in the P1 and P2 columns and check the values in columns I1 to I6 against the translation table. For example:
  sub P1 P2 I1 I2 I3 I4 I5 I6
1  S1 -1  0  0  0  1  1  2  0
In this case one of P1 and P2 is -1, so all values will be NA.
Another case:
sub P1 P2 I1 I2 I3 I4 I5 I6
S4   0  2  0  0 -1  1 -1  1
Here P1 = 0 and P2 = 2, so I1 = incorrect, I2 = incorrect, I3 = NA, I4 = correct, I5 = NA, I6 = correct.
This may be written as:
sub P1 P2 I1 I2 I3 I4 I5 I6
S4   0  2  0  0 -1  1 -1  1
FALSE, FALSE, NA, TRUE, NA, TRUE
This matches condition (V): both 2 and 0 are incorrect, while 1 is correct and -1 is missing.
Another case: here P1 = 0 and P2 = 1, which matches condition (III) in the translation table, so the incorrect value is 2.
  sub P1 P2 I1 I2 I3 I4 I5 I6
5  S5  0  1  2  1  1  2  0 -1
FALSE, TRUE, TRUE, FALSE, TRUE, NA
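The two worked cases above could be reproduced with a small lookup, sketched below. The names `bad` and `flag_row` are hypothetical, and returning NA for a (P1, P2) pair that is not in the translation table is an assumption.

```r
# Sketch: flag each I value of one subject as correct (TRUE),
# incorrect (FALSE), or missing (NA). `bad` maps "P1 P2" to the
# incorrect values of that condition (hypothetical names).
bad <- list(
  "1 1" = numeric(0),               # I
  "1 0" = 2, "0 1" = 2,             # II, III
  "2 0" = c(2, 0), "0 2" = c(2, 0), # IV, V
  "2 2" = c(1, 0),                  # VI
  "1 2" = 0, "2 1" = 0              # VII, VIII
)

flag_row <- function(p1, p2, i_vals) {
  # Any -1 in P1 or P2 makes the whole row NA
  if (p1 == -1 || p2 == -1) return(rep(NA, length(i_vals)))
  b <- bad[[paste(p1, p2)]]
  # Assumption: pairs missing from the table also give NA
  if (is.null(b)) return(rep(NA, length(i_vals)))
  out <- !(i_vals %in% b)
  out[i_vals == -1] <- NA  # -1 in an I column is a missing value
  out
}

flag_row(0, 2, c(0, 0, -1, 1, -1, 1))  # S4: FALSE FALSE NA TRUE NA TRUE
flag_row(0, 1, c(2, 1, 1, 2, 0, -1))   # S5: FALSE TRUE TRUE FALSE TRUE NA
```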
I need to calculate the frequency of FALSE values for each row. I tried a lot of if-else statements, but they do not give the desired output. The code gets messy with so many of them, and I do not think it will be efficient for the large dataset I will be using.
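qcfun <- function(x) {
  # apply() passes each row as a character vector (because of `sub`),
  # so drop `sub` and convert the remaining values back to numeric
  x <- as.numeric(x[-1])
  p <- x[1:2]  # P1 and P2
  # count each code in I1..I6; the appended 2, 0, 1, -1 guarantee that all
  # four levels (-1, 0, 1, 2) are present, and subtracting 1 removes them again
  obs <- table(c(x[-(1:2)], 2, 0, 1, -1)) - 1
  if ((p[1] == 1 && p[2] == 0) || (p[1] == 0 && p[2] == 1)) {
    ov <- round(as.numeric(obs["2"] / sum(obs)), 2)  # conditions II, III: 2 is incorrect
  } else if ((p[1] == 1 && p[2] == 2) || (p[1] == 2 && p[2] == 1)) {
    ov <- round(as.numeric(obs["0"] / sum(obs)), 2)  # conditions VII, VIII: 0 is incorrect
  } else if (p[1] == 1 && p[2] == 1) {
    ov <- 0   # condition I: nothing is incorrect
  } else {
    ov <- NA  # -1 in P1 or P2; conditions IV, V and VI are not handled
  }
  ov
}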
out1 <- apply(myd, 1, qcfun)
tout1 <- table(out1)
tout1
Is there a quick / efficient way of doing this?
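A table-driven, vectorised version might look like the sketch below, which avoids the per-row if-else chain. The names `keys`, `bad1`, `bad2` and `freq_false` are hypothetical, the data are typed in from the printed example above, and dividing by the number of non-missing I values (rather than all six) is an assumption about the intended frequency.

```r
# Sketch only: all helper names here are hypothetical.
myd <- read.table(header = TRUE, text = "
sub P1 P2 I1 I2 I3 I4 I5 I6
S1 -1 0 0 0 1 1 2 0
S2 0 -1 2 0 -1 -1 1 2
S3 -1 2 2 2 -1 0 -1 2
S4 0 2 0 0 -1 1 -1 1
S5 0 1 2 1 1 2 0 -1
S6 1 0 2 -1 1 1 -1 1
S7 2 1 2 0 1 1 0 -1
S8 0 1 2 1 -1 0 0 2
S9 2 -1 -1 -1 -1 0 0 -1
S10 -1 0 1 1 0 -1 -1 1
")

# Conditions I to VIII as "P1 P2" keys with up to two incorrect values
keys <- c("1 1", "1 0", "0 1", "2 0", "0 2", "2 2", "1 2", "2 1")
bad1 <- c(NA, 2, 2, 2, 2, 1, 0, 0)      # first incorrect value
bad2 <- c(NA, NA, NA, 0, 0, 0, NA, NA)  # second incorrect value, if any

idx <- match(paste(myd$P1, myd$P2), keys)  # NA when P1 or P2 is -1
ivals <- as.matrix(myd[, paste0("I", 1:6)])
ivals[ivals == -1] <- NA                   # -1 is a missing value

# b1 and b2 have one entry per row and recycle down the columns
b1 <- bad1[idx]
b2 <- bad2[idx]
wrong <- (ivals == b1 & !is.na(b1)) | (ivals == b2 & !is.na(b2))
wrong[is.na(ivals)] <- NA   # missing I values stay missing
wrong[is.na(idx), ] <- NA   # whole row is NA if no condition matched

# Share of incorrect values among the non-missing I values of each row
freq_false <- rowMeans(wrong, na.rm = TRUE)
freq_false[is.nan(freq_false)] <- NA
freq_false <- round(freq_false, 2)

data.frame(sub = myd$sub, freq_false)
```

Because the comparison runs on the whole matrix at once, the same code should scale to more rows and more I columns without extra branches.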