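This follows on from an earlier question, in which I tried to check whether the parents are the correct parents for a given child genotype (see "Check whether a 'string' expression in a specified variable is contained in several other variables").
Now I want to see whether the child's genotype is a recessive genotype (the gene has two identical recessive values [alleles]). In this scenario the child is always affected by the disease and the parents are not (the child is the proband). I have tried to work out whether the parents and the child are homozygous, and I have more or less worked out whether the child matches the parents' genotypes, but from these two pieces of information I can't seem to determine whether the child is homozygous recessive.
Here is what I have so far (following the similar answer linked above):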
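To make the criterion concrete, here is a minimal sketch for a single trio. It is not from the original post: the function name `is_recessive_candidate` is hypothetical, and it assumes that unaffected parents of an affected child are heterozygous carriers of the allele the child has two copies of. Genotypes alone do not say which allele is recessive, so this only flags SNPs that are consistent with that model.

```r
# Hypothetical helper (an assumption, not the asker's code): classify one trio.
# Genotypes are expected as strings such as "A/G", with "/" between the alleles.
is_recessive_candidate <- function(mom, dad, child) {
  split_gt <- function(gt) strsplit(as.character(gt), "/", fixed = TRUE)[[1]]
  m <- split_gt(mom)
  d <- split_gt(dad)
  k <- split_gt(child)

  # The child must carry two copies of the same allele.
  child_hom <- k[1] == k[2]

  # Assumed model: an unaffected carrier parent is heterozygous
  # and carries the allele the child is homozygous for.
  mom_carrier <- m[1] != m[2] && k[1] %in% m
  dad_carrier <- d[1] != d[2] && k[1] %in% d

  child_hom && mom_carrier && dad_carrier
}

is_recessive_candidate("T/C", "T/C", "T/T")  # TRUE
is_recessive_candidate("A/A", "C/A", "A/A")  # FALSE
```

The second call returns FALSE because the mother is homozygous for the same allele as the child, which under the assumed model would make her affected too.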
homo <- read.table("/.../Family1a/Family1a_vcf.txt", sep="\t", header=TRUE)
d <- data.frame(list(mom = homo[c(1)],
                     dad = homo[c(2)],
                     child = homo[c(3)]
                     ), stringsAsFactors = FALSE)
check_homo <- function(x) {
  # homozygosity: split each genotype into its two alleles
  # (column 1 = mom, column 2 = dad, column 3 = child)
  m1 <- sapply(strsplit(as.character(d[,1]),"/"),function(x) x[1])
  m2 <- sapply(strsplit(as.character(d[,1]),"/"),function(x) x[2])
  d1 <- sapply(strsplit(as.character(d[,2]),"/"),function(x) x[1])
  d2 <- sapply(strsplit(as.character(d[,2]),"/"),function(x) x[2])
  c1 <- sapply(strsplit(as.character(d[,3]),"/"),function(x) x[1])
  c2 <- sapply(strsplit(as.character(d[,3]),"/"),function(x) x[2])
  mom_homo <- m1 == m2
  dad_homo <- d1 == d2
  child_homo <- c1 == c2
  homo_matrix_d <- matrix(c(dad_homo,child_homo), ncol=2, byrow=TRUE)
  homo_matrix_m <- matrix(c(mom_homo,child_homo), ncol=2, byrow=TRUE)
  homo_match_dc <- rowSums(homo_matrix_d)
  homo_match_mc <- rowSums(homo_matrix_m)
  # which ones equal parents
  fam <- strsplit(as.character(d[c(1, 2, 3)]), "/")
  names(fam) <- c("mom", "dad", "child")
  mom_query <- fam[["child"]] == fam[["mom"]]
  dad_query <- fam[["child"]] == fam[["dad"]]
  fam_matrix <- matrix(c(mom=mom_query, dad=dad_query), nrow=2)
  child_match_parents <- rowSums(fam_matrix)
  # if child doesn't match parents and child_homo = recessive
  # if child does equal parents, if homo_parent and homo_child then child = dominant
  child_rec <- ifelse((child_match_parents < 1 & child_homo == "TRUE"), "RECESSIVE", "OTHER")
  child_dom <- ifelse((child_match_parents != 0 & child_homo == "TRUE") & (mom_homo == "TRUE" | dad_homo == "TRUE"), "DOMINANT", "OTHER")
}
child_recessive_tmp <- data.frame(apply(x, 1, check_homo))
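The last two lines in the loop do not work. This whole approach may be wrong, so I don't mind discouraging responses. In short, I want a variable that says whether the child's genotype is homozygous recessive.
Edit:
Example data: one SNP per row.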
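One thing that may contribute to the unexpected results is how `matrix(..., byrow=TRUE)` lays out two concatenated vectors. The sketch below uses made-up logical values (an assumption for illustration, not output from the real file) to show that `byrow = TRUE` mixes values from different SNPs into one row, whereas `cbind` keeps one row per SNP.

```r
# Illustrative values only: homozygosity flags for three SNPs.
dad_homo   <- c(TRUE, FALSE, FALSE)
child_homo <- c(FALSE, TRUE, TRUE)

# byrow = TRUE fills the matrix row by row, so each row pairs
# neighbouring elements of the concatenated vector, not dad with child.
wrong <- matrix(c(dad_homo, child_homo), ncol = 2, byrow = TRUE)

# cbind keeps one row per SNP and one column per person.
right <- cbind(dad = dad_homo, child = child_homo)

rowSums(wrong)  # 1 0 2
rowSums(right)  # 1 1 1
```

With `cbind`, each row sum counts how many of the two people are homozygous at that SNP, which appears to be what the row sums were meant to express.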
       Mom  Dad  Child
rs1    A/A  G/G  A/G
rs2    T/C  T/C  T/T
rs3    A/A  C/A  A/A
.
.
.
rs100  G/C  A/G  C/A
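As a rough sketch of how a per-SNP variable could be built for data shaped like this, the snippet below works on whole columns instead of calling a function row by row. The data frame name `geno`, the helper `alleles`, and the column `child_status` are all assumptions introduced here, and the rule used (child homozygous, both parents heterozygous carriers of the child's allele) is one possible reading of "homozygous recessive" for an affected child with unaffected parents.

```r
# The four example rows shown above; `geno` is a hypothetical name.
geno <- read.table(text = "
Mom Dad Child
rs1 A/A G/G A/G
rs2 T/C T/C T/T
rs3 A/A C/A A/A
rs100 G/C A/G C/A
", header = TRUE, stringsAsFactors = FALSE)

# Split a genotype column into a two-column allele matrix (one row per SNP).
alleles <- function(gt) {
  do.call(rbind, strsplit(as.character(gt), "/", fixed = TRUE))
}
m <- alleles(geno$Mom)
d <- alleles(geno$Dad)
k <- alleles(geno$Child)

# Child carries two identical alleles.
child_hom <- k[, 1] == k[, 2]

# Assumed model: each unaffected parent is heterozygous and carries that allele.
mom_carrier <- m[, 1] != m[, 2] & (m[, 1] == k[, 1] | m[, 2] == k[, 1])
dad_carrier <- d[, 1] != d[, 2] & (d[, 1] == k[, 1] | d[, 2] == k[, 1])

geno$child_status <- ifelse(child_hom & mom_carrier & dad_carrier,
                            "RECESSIVE", "OTHER")
geno
```

On these four rows only rs2 is labelled RECESSIVE: the child is T/T and both parents are T/C. rs3 is labelled OTHER because the mother is A/A, the same genotype as the child.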