I have a data frame (dat2):

> summary(dat2)
     combs             label                   Groups    
 Min.   :    1.00   Length:21172       (0,1]      :1573  
 1st Qu.:    4.00   Class :character   (1,5]      :5777  
 Median :    9.00   Mode  :character   (5,12]     :5632  
 Mean   :   86.46                      (12,30]    :4061  
 3rd Qu.:   24.00                      (30,100]   :2976  
 Max.   :49280.00                      (100,5e+04]:1153 

I have gathered some code from Stack Overflow to create a bar plot with 4 facets that shows percentages.

ggplot(dat2,aes(x=Groups)) + 
  stat_bin(aes(n=nrow(dat2), y=..count../n)) +
  scale_y_continuous(formatter = "percent") + 
  facet_wrap(~ label)

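The problem is that I want to reset the counter for each facet, so that the counts for each label are divided by the number of rows in that specific label rather than by the overall total.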

1 Answer

Count the number of observations for each label and add it to your dataset:

# Simulate example data: 4 labels, 3 groups, 10000 observations
nLabel <- 4
nGroups <- 3
nObs <- 10000
dataset <- data.frame(label = factor(sample(nLabel, nObs, prob = runif(nLabel), replace = TRUE)))
library(plyr)
# Draw Groups separately within each label so the group mix differs by label
dataset <- ddply(dataset, .(label), function(x){
  data.frame(Groups = sample(nGroups, nrow(x), prob = runif(nGroups), replace = TRUE))
})
# Store, on every row, the number of observations in that row's label
dataset$nLabel <- ave(dataset$Groups, by = dataset$label, FUN = length)
dataset$Groups <- factor(dataset$Groups)
library(ggplot2)
library(scales)
# Divide each bar's count by the per-label total instead of the overall total
ggplot(dataset, aes(x = Groups)) +
  geom_histogram(aes(n = nLabel, y = ..count.. / n)) +
  facet_wrap(~label, scales = "free") +
  scale_y_continuous(labels = percent)
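
The plot above relies on passing an extra `n` aesthetic through the stat, which may behave differently across ggplot2 versions. As an alternative, here is a minimal sketch that computes the within-label proportions before plotting and draws them with `geom_col()`. The simulated data, the seed, and the names `counts` and `pct` are assumptions for illustration only; they stand in for `dat2`, which is not available here.

```r
# Simulated stand-in for dat2 (assumption: only `label` and `Groups` matter)
set.seed(1)
dataset <- data.frame(
  label  = factor(sample(4, 10000, replace = TRUE)),
  Groups = factor(sample(3, 10000, replace = TRUE))
)

# Count rows for every label/Groups combination
counts <- as.data.frame(
  table(label = dataset$label, Groups = dataset$Groups),
  responseName = "count"
)

# Divide each count by the total of its own label,
# so the bars inside every facet sum to 100%
counts$pct <- counts$count / ave(counts$count, counts$label, FUN = sum)

# Draw the plot only if ggplot2 and scales are installed
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("scales", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(counts, aes(x = Groups, y = pct)) +
    geom_col() +
    facet_wrap(~ label) +
    scale_y_continuous(labels = scales::percent)
  print(p)
}
```

Computing `pct` up front keeps the normalization visible in the data itself, so it can be checked (for example with `tapply(counts$pct, counts$label, sum)`) independently of how the plot is drawn.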
answered 2012-05-03T09:55:17.923