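I have one data frame with 3 columns (gene, variant type and sample) and a second one with two columns (pathway and genes). In the second one, I list the genes that belong to each pathway. I now want to create a new data frame with 4 columns (gene, variant type, sample and pathways) that shows the pathway or pathways in which each gene is present. Can anyone help me? Thanks in advance.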
1) First data frame (one row per mutation):
Hugo_Symbol Variant_Type Tumor_Sample_Barcode
1 ZAP70 SNP TCGA-E9-A1RC-01A-11D-A159-09
2) Second data frame (genes in each pathway):
structure(list(circuit_names = c("hsa04014__44", "hsa04014__33",
"hsa04014__37", "hsa04014__24", "hsa04014__26", "hsa04014__30"
), mutated = c("ZAP70,NF1,MAPK1,RAF1,CSF1R,RASGRP1,MAP2K1,MAP2K1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,NF1,PLCG1,PLCG1,PLCG1",
"ZAP70,NF1,AKT3,CSF1R,BAD,RASGRP1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,PIK3R5,NF1,BCL2L1,PLCG1,PLCG1,PLCG1,AKT3",
"ZAP70,NF1,AKT3,CSF1R,RASGRP1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,PIK3R5,NF1,PLCG1,PLCG1,PLCG1,FOXO4,AKT3",
"ZAP70,NF1,CSF1R,RGL2,RASGRP1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,NF1,PLCG1,PLCG1,PLCG1",
"ZAP70,NF1,CSF1R,RASGRP1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,NF1,PLCG1,PLCG1,PLCG1,PLCE1",
"ZAP70,NF1,CSF1R,RASGRP1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,RASGRF1,NF1,PLCG1,PLCG1,PLCG1,PLCE1"
)), row.names = c(NA, 6L), class = "data.frame")
3) I would like something like this:
structure(list(Hugo_Symbol = c("ZAP70", "TTN", "TTN", "PRKCD",
"PIK3CA", "TLR3"), Variant_Type = c("SNP", "SNP", "SNP", "SNP",
"SNP", "SNP"), Tumor_Sample_Barcode = c("TCGA-E9-A1RC-01A-11D-A159-09",
"TCGA-E9-A1RC-01A-11D-A159-09", "TCGA-E9-A1RC-01A-11D-A159-09",
"TCGA-E9-A1RC-01A-11D-A159-09", "TCGA-E9-A1RC-01A-11D-A159-09",
"TCGA-E9-A1RC-01A-11D-A159-09"), Pathways = c("hsa04014__44, hsa04014__33, hsa04014__37, hsa04014__24",
"hsa04530__11 20 16", "hsa04530__11 20 16", "hsa04722__37, hsa04722__35, hsa04722__33",
"hsa04151__25, hsa04151__37, hsa04151__73", "hsa04620__23")), row.names = c("6",
"8", "9", "11", "13", "16"), class = "data.frame")
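One way this could be approached is sketched below in base R. It assumes that `mutated` is always a comma-separated gene list and that genes are matched on `Hugo_Symbol`. The object names (`mutations`, `circuits`, `lookup`, `pathways_by_gene`) are made up for illustration, and the data are abridged from the dput output above.

```r
# First data frame: one row per mutation (abridged to two rows).
mutations <- data.frame(
  Hugo_Symbol = c("ZAP70", "TTN"),
  Variant_Type = c("SNP", "SNP"),
  Tumor_Sample_Barcode = rep("TCGA-E9-A1RC-01A-11D-A159-09", 2),
  stringsAsFactors = FALSE
)

# Second data frame: one row per circuit, genes as a comma-separated string
# (abridged; the repeated AKT3 mimics the duplicates in the real data).
circuits <- data.frame(
  circuit_names = c("hsa04014__44", "hsa04014__33", "hsa04014__24"),
  mutated = c("ZAP70,NF1,MAPK1,RAF1", "ZAP70,NF1,AKT3,AKT3", "ZAP70,NF1,RGL2"),
  stringsAsFactors = FALSE
)

# Split each gene string into a character vector (one list element per circuit).
gene_lists <- strsplit(circuits$mutated, ",", fixed = TRUE)

# Long lookup table: one row per (gene, circuit) pair, duplicate pairs removed.
lookup <- unique(data.frame(
  Hugo_Symbol = trimws(unlist(gene_lists)),
  circuit = rep(circuits$circuit_names, lengths(gene_lists)),
  stringsAsFactors = FALSE
))

# Collapse the circuits of each gene into a single comma-separated string.
# The result is a character vector named by gene.
pathways_by_gene <- vapply(
  split(lookup$circuit, lookup$Hugo_Symbol),
  paste, character(1), collapse = ", "
)

# Attach the collapsed string to every mutation row, keeping the row order.
# Genes that appear in no circuit get NA.
result <- mutations
result$Pathways <- unname(pathways_by_gene[result$Hugo_Symbol])

print(result)
```

In this abridged example `TTN` is in none of the listed circuits, so its `Pathways` value is `NA`. If one row per gene and pathway pair is preferred over a collapsed string, `merge(mutations, lookup, by = "Hugo_Symbol", all.x = TRUE)` on the long `lookup` table would be an alternative.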