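I have 7 datasets. Each dataset comes with two data frames: a metadata data frame, which contains a very important column showing who is a responder and who is a non-responder, and a data frame of cell types.
Example using dput: this is an example from one of the datasets. The first object is the cell data frame, the second is the metadata, which contains the information about drug benefit (Response / No Response):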
cells1 <- structure(c(8.10937548981953e-20, 0.095381661829093, 0.054868371418562,
0.0523687378840825, 0.0100173293159538, 0.0332395245437795, 3.37811149975583e-20,
0.048191378909587, 0.13314908462763, 0, 0.00612878313809124,
0, 0.00117409520254045, 1.33684197233784, 0.0701023734195797,
0.290756813286141, 0.349392264371762, 0.169367429138566, 0.00209460699328093,
0.205599458004829, 0.318048653115709, 4.21796249339787e-05, 0.00844407692255898,
0, 0.00613007026042523, 0.0300024082993193, 0.0405191646567986,
0.00654087887823056, 0.0111094954094255, 1.30617589099212e-19,
0.0398730537850546, 0.0390946117756341, 0.239413780024853, 2.07521807718399e-19,
0.00116980239850497, 0), .Dim = c(6L, 6L), .Dimnames = list(c("Adipocytes",
"B-cells", "Basophils", "CD4+ memory T-cells", "CD4+ naive T-cells",
"CD4+ T-cells"), c("Pt1", "Pt10", "Pt101", "Pt103", "Pt106",
"Pt11")))
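These datasets are about cancer treatment. The columns of cells1 are samples and the rows are cell types. It is the same in all 7 datasets: the rows are exactly identical, while the samples differ (so each dataset has a different number of samples). Some of the samples are responders and some are non-responders.
Metadata: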
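Because every dataset is said to share the same cell-type rows, that can be checked before the matrices are combined. The following is only a minimal sketch on made-up matrices: `cells_a`, `cells_b`, and their values are hypothetical stand-ins, not the real data.

```r
# Toy stand-ins for two of the seven cell matrices (hypothetical values)
cell_types <- c("Adipocytes", "B-cells", "Basophils")
cells_a <- matrix(1:6 / 10, nrow = 3,
                  dimnames = list(cell_types, c("Pt1", "Pt10")))
cells_b <- matrix(1:9 / 10, nrow = 3,
                  dimnames = list(cell_types, c("S1", "S2", "S3")))

cell_list <- list(cells_a = cells_a, cells_b = cells_b)

# TRUE only if every matrix has the same row names in the same order
same_rows <- all(vapply(
  cell_list,
  function(m) identical(rownames(m), rownames(cell_list[[1]])),
  logical(1)
))
stopifnot(same_rows)

# Column-bind: rows stay cell types, columns are the samples of all datasets
total_toy <- do.call(cbind, cell_list)
dim(total_toy)
```

If the row names differed in order, cbind would still run but silently misalign cell types, which is why the check comes first.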
Metadata <- structure(list(`Mutation Load` = c("NA", "75", "10", "21", "700",
"106"), `Neo-antigen Load` = c("NA", "33", "5", "5", "219", "67"
), `Neo-peptide Load` = c("NA", "56", "6", "11", "273", "187"
), `Cytolytic Score` = c("977.86911190000001", "65.840716889999996",
"1392.1422339999999", "1108.8620289999999", "645.54163300000005",
"602.6740413"), Benefit = c("No Response", "No Response", "Response",
"No Response", "No Response", "No Response")), row.names = c("Pt1",
"Pt10", "Pt101", "Pt103", "Pt106", "Pt11"), class = "data.frame")
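Goal: after joining the cell data frames (which I did with cbind), I have one big data frame with more than 1000 columns and only 38 rows. I now need to build two t-SNE plots: one that colors the samples by dataset (cells1, cells2, cells6, ...), and a second that colors the samples by response (Response / No Response).
My code: I tried coloring by dataset. I thought a list of sample names would be a good idea, but I got stuck there: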
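Both plots need one label per sample, in the same column order as the combined matrix. A minimal sketch of how such label vectors might be built follows. The matrices, metadata frames, and their values are hypothetical, and it assumes sample names are unique across datasets and present as row names in the metadata.

```r
# Hypothetical stand-ins for two cell matrices and their metadata
cell_types <- c("Adipocytes", "B-cells", "Basophils")
cells_a <- matrix(1:6 / 10, nrow = 3,
                  dimnames = list(cell_types, c("Pt1", "Pt10")))
cells_b <- matrix(1:9 / 10, nrow = 3,
                  dimnames = list(cell_types, c("S1", "S2", "S3")))
meta_a <- data.frame(Benefit = c("No Response", "Response"),
                     row.names = c("Pt1", "Pt10"))
meta_b <- data.frame(Benefit = c("Response", "No Response", "No Response"),
                     row.names = c("S1", "S2", "S3"))

cell_list <- list(cells_a = cells_a, cells_b = cells_b)
total <- do.call(cbind, cell_list)

# One dataset label per column (sample), in the same order as cbind
dataset <- factor(rep(names(cell_list),
                      times = vapply(cell_list, ncol, integer(1))))

# One response label per column, looked up by sample name
meta_all <- rbind(meta_a, meta_b)
response <- factor(meta_all[colnames(total), "Benefit"])

table(dataset, response)
```

Indexing the metadata by `colnames(total)` keeps the response labels aligned with the matrix columns even if the metadata rows are stored in a different order.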
## Combine the cell data frames (rows = cell types, columns = samples)
library(Rtsne)
Total_cells <- cbind(cells1, cells2, cells6, cells7, cells9, cells12, cells15)
## Sample names per dataset, intended for coloring the t-SNE by dataset
Mylist <- list(df1 = colnames(cells1), df2 = colnames(cells2),
               df6 = colnames(cells6), df7 = colnames(cells7),
               df9 = colnames(cells9), df12 = colnames(cells12),
               df15 = colnames(cells15))
## Rtsne expects samples in rows, hence t()
tsne_out <- Rtsne(t(Total_cells), perplexity = 15)
## Stuck here: col needs one color per sample, but Mylist is a list of name vectors
plot(tsne_out$Y, col = Mylist, pch = 15)
legend("topright",
       legend = unique(Mylist), cex = 0.5,
       fill = palette("default"),
       border = NA, box.col = NA)
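For reference, one possible way to get both colorings is to pass `col` an integer vector derived from a factor with one entry per sample. This is only a sketch under stated assumptions: the two matrices and the response labels below are randomly generated placeholders, not the real data, and `check_duplicates = FALSE` is set just so the toy data cannot trip the duplicate check.

```r
library(Rtsne)
set.seed(1)

# Placeholder data: 38 cell types, two made-up datasets with 30 and 40 samples
cell_types <- paste0("type", 1:38)
cells_a <- matrix(runif(38 * 30), nrow = 38,
                  dimnames = list(cell_types, paste0("A", 1:30)))
cells_b <- matrix(runif(38 * 40), nrow = 38,
                  dimnames = list(cell_types, paste0("B", 1:40)))
cell_list <- list(cells_a = cells_a, cells_b = cells_b)
total <- do.call(cbind, cell_list)

# One label per sample; response is random here purely for illustration
dataset  <- factor(rep(names(cell_list),
                       times = vapply(cell_list, ncol, integer(1))))
response <- factor(sample(c("Response", "No Response"),
                          ncol(total), replace = TRUE))

# Rtsne wants samples in rows, so transpose the combined matrix
tsne_out <- Rtsne(t(total), perplexity = 15, check_duplicates = FALSE)

# Draw the same embedding twice, colored by a different factor each time
plot_by <- function(groups, title) {
  plot(tsne_out$Y, col = as.integer(groups), pch = 15,
       xlab = "t-SNE 1", ylab = "t-SNE 2", main = title)
  legend("topright", legend = levels(groups),
         fill = seq_along(levels(groups)),
         cex = 0.5, border = NA, box.col = NA)
}

par(mfrow = c(1, 2))
plot_by(dataset, "By dataset")
plot_by(response, "By response")
```

Because `as.integer()` on a factor returns the level index, the point colors and the legend entries stay in the same order. With the real data, `response` would come from the Benefit column of the metadata rather than from `sample()`.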
Please let me know if any additional information is needed.