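I have a list of proteins and their interactors, and I would like to know what percentage of interactors is shared between different proteins.
My list of proteins and interactors looks like this: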
head(lista)
$`A1CF `
[1] " A1CF" " APOBEC1" " CUGBP2" " KHSRP" " SYNCRIP" " TNPO2"
$`A2LD1 `
[1] " A2LD1" " PRPSAP2" " RPL15" " TANC1"
$`A2M `
[1] " A2M" " ADAM19" " ADAMTS1" " AMBP" " ANXA6" " APOE" " APP" " B2M" " C11orf58" " CELA1" " CPB2" " CTSB" " CTSE"
[14] " F2" " HSPA5" " IL10" " IL1B" " KLK13" " KLK2" " KLK3" " KLK5" " KLKB1" " LCAT" " LEP" " LRP1" " MMP2"
[27] " MYOC" " NGF" " PAEP" " PDGFA" " PDGFB" " PLG" " SERPINA1" " SHBG" " SPACA3" " TGFBI"
$`AAAS `
[1] " AAAS" " ARHGAP1" " BANF1" " CCNG2" " EP300" " HMGA1" " KPNB1" " NUP107" " NUP133" " NUP153" " NUP155" " NUP160" " NUP188" " NUP205"
[15] " NUP210" " NUP214" " NUP35" " NUP37" " NUP43" " NUP50" " NUP54" " NUP62" " NUP85" " NUP88" " NUP93" " NUP98" " NUPL1" " NUPL2"
[29] " PLK4" " POM121C" " PSIP1" " RAE1" " RAN" " RANBP2" " SEH1L" " TARDBP" " TPR" " TTK" " XPO1"
$`AAGAB `
[1] " AAGAB" " AFTPH" " EIF3C" " UNC119"
$`AAK1 `
[1] " AAK1" " ACOX3" " ADAM28" " ALPK3" " AURKB" " AZI2" " BMP2K" " CABC1" " CAMK2G" " DCK" " DCTPP1" " EIF2AK1" " FAM83A"
[14] " FER" " FRYL" " GAPVD1" " GFPT1" " HIPK1" " JAK1" " KIAA0195" " KIAA0528" " LIMK2" " LSM14A" " MAP4K2" " MAP4K5" " MAPK6"
[27] " NEK11" " NQO2" " NUMB" " PDE4A" " PIP4K2C" " PKN3" " PRKAA1" " PTPN18" " SIK2" " SIK3" " SPEG" " TAOK1" " TAOK3"
[40] " TBK1" " TBKBP1" " TESK2" " TMX1" " TNK1" " ZAK"
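To get the percentage of interactors shared between proteins, I did the following:
I created a square matrix whose dimensions equal the length of lista: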
#Pre-allocate a 9794 x 9794 numeric matrix (9794 is the length of lista)
M <- matrix(NA_real_, nrow = 9794, ncol = 9794)
#A function to calculate the fraction of the interactors of x that are also interactors of y
dFun3 <- function(x,y){length(which(x%in%y))/length(x)};
#To create a matrix with the percentage of interactors shared among proteins (note that the matrix is non-symmetric: AxB differs from BxA, with A and B being proteins)
for (i in 1:length(lista))
{
for (j in 1:length(lista))
{
k = dFun3(lista[[i]], lista[[j]])
M[i,j] = k;
}
}
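The same matrix can probably be built without the double loop. The following is only a minimal sketch on a made-up three-protein list (`toy` is hypothetical, not the real `lista`), and it assumes that no interactor appears twice within one protein's vector. It encodes every protein as a 0/1 row over all interactors, so that a single matrix product counts the shared interactors for every pair at once.

```r
# Hypothetical toy data standing in for lista
toy <- list(A = c("A", "X", "Y", "Z"),
            B = c("B", "X", "Y"),
            C = c("C", "W"))

dFun3 <- function(x, y) length(which(x %in% y)) / length(x)

# Reference result: entry [i, j] is the share of i's interactors also found in j
M_loop <- sapply(toy, function(y) sapply(toy, function(x) dFun3(x, y)))

# Incidence matrix: one row per protein, one column per distinct interactor
all_ids <- unique(unlist(toy))
inc <- t(sapply(toy, function(p) as.numeric(all_ids %in% p)))

# shared[i, j] = number of interactors that proteins i and j have in common
shared <- tcrossprod(inc)

# Divide row i by the number of interactors of protein i
# (assumes no duplicated interactors within a protein)
M_fast <- shared / lengths(toy)

print(all.equal(unname(M_loop), unname(M_fast)))
```

On the full list the dense incidence matrix may become large (9794 rows times the number of distinct interactors), so a sparse representation might be needed; that has not been tried here.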
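Now I have a matrix that holds both the AxB and the BxA comparison. What I want to do next is compare the values for protein i and protein j: the idea is to compare AxB vs BxA, and if AxB is > 0.7 and BxA is < 0.7, remove protein A. My approach is a for loop like this: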
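#status[p] holds the mark for protein p (-1 or +1 means "remove"); M itself stays numeric
status <- rep(0, nrow(M))
for (i in 1:nrow(M))
{
for (j in 1:ncol(M))
{
if (M[i,j] > 0.7 && M[j,i] < 0.7) {status[i] <- -1}
if (M[j,i] > 0.7 && M[i,j] < 0.7) {status[j] <- 1}
}
}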
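The rule described above (remove A when AxB > 0.7 and BxA < 0.7) can also be written without any loop by comparing the matrix with its own transpose. This is a minimal sketch on a made-up 3 x 3 matrix (the values and the names `cond`, `to_remove` and `kept` are illustrative assumptions), not a tested solution for the full 9794 x 9794 case.

```r
# Hypothetical toy matrix: M[i, j] = share of i's interactors also found in j
M <- matrix(c(1.0, 0.8, 0.1,
              0.3, 1.0, 0.2,
              0.1, 0.9, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# t(M)[i, j] is M[j, i], so cond[i, j] is TRUE when
# i shares more than 70% with j while j shares less than 70% with i
cond <- M > 0.7 & t(M) < 0.7

# A protein is dropped if the condition holds against at least one partner
to_remove <- rowSums(cond) > 0
kept <- rownames(M)[!to_remove]

print(to_remove)
print(kept)
```

Here A and C are each mostly contained in B but not the other way round, so only B is kept. The diagonal never triggers the rule because M[i, i] is 1, which is not below 0.7.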
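With this approach, I intend to remove the proteins marked +1 or -1 in the comparison.
However, this approach takes a very long time... Any suggestions would be very welcome.
Thanks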