2

我正在尝试创建一个包含多个变量的堆积条形图,但我遇到了两个问题:

1)我似乎无法让旋转的 y 轴显示百分比而不是计数。

2)我想根据“非常同意”响应的百分比对变量(desc)进行排序。

这是我到目前为止的一个例子:

require(scales)
require(ggplot2)
require(reshape2)

# create data frame
  my.df <- data.frame(replicate(10, sample(1:4, 200, rep=TRUE)))
  my.df$id <- seq(1, 200, by = 1)

# melt
  melted <- melt(my.df, id.vars="id")

# factors
  melted$value <- factor(melted$value, 
                         levels=c(1,2,3,4),
                         labels=c("strongly disagree", 
                                  "disagree", 
                                  "agree", 
                                  "strongly agree"))
# plot
  ggplot(melted) + 
    geom_bar(aes(variable, fill=value, position="fill")) +
    scale_fill_manual(name="Responses",
                      values=c("#EFF3FF", "#BDD7E7", "#6BAED6",
                               "#2171B5"),
                      breaks=c("strongly disagree", 
                               "disagree", 
                               "agree", 
                               "strongly agree"),
                      labels=c("strongly disagree", 
                               "disagree", 
                               "agree", 
                               "strongly agree")) +
    labs(x="Items", y="Percentage (%)", title="my title") +
    coord_flip()

我要感谢几个人帮助我走到这一步。以下是 Google 提供的众多页面中的一些:

http://www.r-bloggers.com/fumblings-with-ranked-likert-scale-data-in-r/

创建堆叠条形图,其中每个堆栈都缩放为总和为 100%

sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] reshape2_1.2.2  ggplot2_0.9.2.1 scales_0.2.2   

loaded via a namespace (and not attached):
 [1] colorspace_1.2-0    dichromat_1.2-4     digest_0.6.0        grid_2.15.0         gtable_0.1.1        HH_2.3-23          
 [7] labeling_0.1        lattice_0.20-10     latticeExtra_0.6-24 MASS_7.3-22         memoise_0.1         munsell_0.4        
[13] plyr_1.7.1          proto_0.3-9.2       RColorBrewer_1.0-5  rstudio_0.97.237    stringr_0.6.1       tools_2.15.0       
4

2 回答 2

4

由于您使用的是李克特数据,您可能需要考虑likert()包 HH 中的函数。(希望可以将您指向另一个方向,因为已经有一个很好的答案来解决您原来的 ggplot2 方法。)

正如人们可能希望的那样,likert()以一种与李克特相称的方式进行情节,并尽量减少斗争。PositiveOrder=TRUE将按照项目在正方向上延伸的距离对项目进行排序。该ReferenceZero参数将允许您通过中性项目的中间进行零中心(下面不需要,但在此处显示)。并将as.percent=TRUE计数转换为百分比并在边距中列出实际计数(除非我们将其关闭)。

library(reshape2)
library(HH)

# create data as before
my.df <- data.frame(replicate(10, sample(1:4, 200, rep=TRUE)))
my.df$id <- seq(1, 200, by = 1)

# melt() and dcast() with reshape2 package
melted <- melt(my.df,id.var="id", na.rm=TRUE)
summd <- dcast(data=melted,variable~value, length) # note: length()
                                                   # not robust if NAs present

# give names to cols and rows for likert() to use
names(summd) <- c("Question", "strongly disagree", 
                              "disagree", 
                              "agree", 
                              "strongly agree")
rownames(summd) <- summd[,1]  # question number as rowname
summd[,1] <- NULL             

# plot
likert(summd,
       as.percent=TRUE,       # automatically scales
       main = NULL,           # or give "title",
       xlab = "Percent",      # label axis
       positive.order = TRUE, # orders by furthest right
       ReferenceZero = 2.5,   # zero point btwn levels 2&3
       ylab = "Question",     # label for left side
       auto.key = list(space = "right", columns = 1,
                     reverse = TRUE)) # make positive items on top of legend

在此处输入图像描述

于 2012-12-28T03:19:43.387 回答
3

对于 (1)
要获得百分比,您必须创建一个data.framefrom melted。至少这是我能想到的方式。

# 200 is the total sum always. Using that to get the percentage
require(plyr)
df <- ddply(melted, .(variable, value), function(x) length(x$value)/200 * 100)

然后提供计算出的百分比,如下所示weightsgeom_bar

ggplot(df) + 
geom_bar(aes(variable, fill=value, weight=V1, position="fill")) +
scale_fill_manual(name="Responses",
                  values=c("#EFF3FF", "#BDD7E7", "#6BAED6",
                           "#2171B5"),
                  breaks=c("strongly disagree", 
                           "disagree", 
                           "agree", 
                           "strongly agree"),
                  labels=c("strongly disagree", 
                           "disagree", 
                           "agree", 
                           "strongly agree")) +
labs(x="Items", y="Percentage (%)", title="my title") +
coord_flip()

我不太明白(2)。你想(a)计算relative percentages(参考为“强烈同意”吗?或者(b)你是否希望情节总是显示“强烈同意”,然后“同意”等。你可以通过只是完成(b)重新排序 df 中的因子,

df$value <- factor(df$value, levels=c("strongly agree", "agree", "disagree", 
                 "strongly disagree"), ordered = TRUE)

Edit:您可以按照您需要的顺序重新排列级别variablevalue顺序,如下所示:

variable.order <- names(sort(daply(df, .(variable), 
                     function(x) x$V1[x$value == "strongly agree"] ), 
                     decreasing = TRUE))
value.order <- c("strongly agree", "agree", "disagree", "strongly disagree")
df$variable <- factor(df$variable, levels = variable.order, ordered = TRUE)
df$value <- factor(df$value, levels = value.order, ordered = TRUE)
于 2012-12-28T01:49:11.657 回答