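I am using the ggplot2 package in R to plot functional categories by count. As the figure below shows, the categories are sorted by the class they belong to and, within each class, by protein count.
Here is part of the data set I am working with: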
GO_Category                                       protein_count  Class
aromatic amino acid family metabolic process      24             Amino acid metabolism
glutamine family amino acid metabolic process     14             Amino acid metabolism
aspartate family amino acid metabolic process     10             Amino acid metabolism
glutamine family amino acid biosynthetic process  9              Amino acid metabolism
branched-chain amino acid metabolic process       8              Amino acid metabolism
peptidyl-lysine modification to hypusine          4              Amino acid metabolism
ornithine metabolic process                       3              Amino acid metabolism
single-organism carbohydrate metabolic process    125            Carbohydrate metabolism
carbohydrate biosynthetic process                 55             Carbohydrate metabolism
pentose metabolic process                         7              Carbohydrate metabolism
mannose metabolic process                         3              Carbohydrate metabolism
organelle organization                            101            Cellular components
ribonucleoprotein complex biogenesis              41             Cellular components
plastid organization                              35             Cellular components
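To make the question reproducible, the rows shown above can be rebuilt as a data frame. This is only a sketch covering the excerpt, not the full data set; I call the object df to match the plotting code, and I am assuming plain character columns rather than factors.

```r
# Rebuild the excerpt as a data frame (only the 14 rows shown in the question).
df <- data.frame(
  GO_Category = c(
    "aromatic amino acid family metabolic process",
    "glutamine family amino acid metabolic process",
    "aspartate family amino acid metabolic process",
    "glutamine family amino acid biosynthetic process",
    "branched-chain amino acid metabolic process",
    "peptidyl-lysine modification to hypusine",
    "ornithine metabolic process",
    "single-organism carbohydrate metabolic process",
    "carbohydrate biosynthetic process",
    "pentose metabolic process",
    "mannose metabolic process",
    "organelle organization",
    "ribonucleoprotein complex biogenesis",
    "plastid organization"
  ),
  protein_count = c(24, 14, 10, 9, 8, 4, 3, 125, 55, 7, 3, 101, 41, 35),
  # 7 amino acid rows, 4 carbohydrate rows, 3 cellular component rows
  Class = rep(
    c("Amino acid metabolism", "Carbohydrate metabolism", "Cellular components"),
    times = c(7, 4, 3)
  ),
  stringsAsFactors = FALSE  # assumption: keep text columns as character
)

str(df)
```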
Here is the code I am using in R:
library(ggplot2)

# Order the categories by class, then by protein count within each class
nameorder <- df$GO_Category[order(df$Class, df$protein_count)]
df$GO_Category <- factor(df$GO_Category, levels = nameorder)

ggplot(data = df, aes(x = GO_Category, y = protein_count, fill = GO_Category)) +
  geom_bar(color = "black", stat = "identity", width = 0.5, position = position_dodge(.5)) +
  coord_flip() +            # draw the bars horizontally
  guides(fill = "none") +   # hide the fill legend (fill = FALSE is deprecated in recent ggplot2)
  ylab("Protein Association Count") + xlab("Gene Ontology Category") +
  theme(panel.grid.minor.y = element_blank(), panel.grid.major.y = element_blank(),
        axis.text.y = element_text(colour = "#999999")) +
  theme(panel.background = element_blank()) +
  theme(text = element_text(size = 10)) +
  geom_text(aes(label = protein_count), size = 3, hjust = -0.5)  # count label at the end of each bar
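What I would like to do is facet the groups by their Class identifier while keeping the structure of the y-axis. My attempts so far have produced rather ugly plots that seem to repeat the category labels on the y-axis of every facet.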
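For reference, here is a minimal sketch of the kind of faceted layout I am after. It is an untested guess rather than a known solution, and it makes several assumptions: ggplot2 3.3.0 or later (so geom_col() can draw horizontal bars without coord_flip(), and expansion() is available), facet_grid() with free y scales and free panel heights so each class shows only its own categories, and a six-row subset of the excerpt so the snippet runs on its own.

```r
library(ggplot2)

# Assumption: a six-row subset of the excerpt, just to keep the sketch self-contained.
df <- data.frame(
  GO_Category = c(
    "aromatic amino acid family metabolic process",
    "ornithine metabolic process",
    "single-organism carbohydrate metabolic process",
    "pentose metabolic process",
    "organelle organization",
    "plastid organization"
  ),
  protein_count = c(24, 3, 125, 7, 101, 35),
  Class = rep(
    c("Amino acid metabolism", "Carbohydrate metabolism", "Cellular components"),
    each = 2
  ),
  stringsAsFactors = FALSE
)

# Same ordering as before: by class, then by protein count within each class.
nameorder <- df$GO_Category[order(df$Class, df$protein_count)]
df$GO_Category <- factor(df$GO_Category, levels = nameorder)

p <- ggplot(df, aes(x = protein_count, y = GO_Category, fill = GO_Category)) +
  geom_col(color = "black", width = 0.5) +
  geom_text(aes(label = protein_count), size = 3, hjust = -0.5) +
  # One panel per class; free y scales drop categories that belong to other
  # classes, and free space keeps the bar thickness equal across panels.
  facet_grid(rows = vars(Class), scales = "free_y", space = "free_y") +
  # Leave room on the right so the count labels are not clipped.
  scale_x_continuous(expand = expansion(mult = c(0, 0.1))) +
  guides(fill = "none") +
  xlab("Protein Association Count") + ylab("Gene Ontology Category") +
  theme(panel.background = element_blank(),
        strip.text.y = element_text(angle = 0))  # horizontal class labels

p
```

If coord_flip() has to stay, my understanding is that the free scale would need to be specified for the pre-flip axis instead, but I have not verified that.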