year <- c(2000:2014)
group <- c("A","A","A","A","A","A","A","A","A","A","A","A","A","A","A",
         "B","B","B","B","B","B","B","B","B","B","B","B","B","B","B",
         "C","C","C","C","C","C","C","C","C","C","C","C","C","C","C")
value <- sample(1:5, 45, replace=TRUE)

df <- data.frame(year,group,value)
df$value[df$value==1] <- NA

   year group value
1  2000     A    NA
2  2001     A     2
3  2002     A     2
...
11 2010     A     2
12 2011     A     3
13 2012     A     5
14 2013     A    NA
15 2014     A     3
16 2000     B     2
17 2001     B     3
...
26 2010     B    NA
27 2011     B     5
28 2012     B     4
29 2013     B     3
30 2014     B     5
31 2000     C     5
32 2001     C     4
33 2002     C     3
34 2003     C     4
...
44 2013     C     5
45 2014     C     3

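Above is a sample data frame for my question. Each group (A, B, or C) has values from 2000 to 2014, but in some years the value for some groups may be missing.

I would like to draw a plot like this:

The x-axis is year

The y-axis is group (i.e. A, B, and C should appear as the y-axis labels)

A bar or line shows the availability of a value for each group

If the value is NA, the bar is not drawn at that time point. ggplot2 is preferred if possible.

Can anyone help? Thank you.

I think my description was confusing. I'm expecting a figure like the one below, but with year on the x-axis. The bar or line shows whether a value is available for a given group in a given year.

For group A in the sample data frame, we have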

2012 A 5
2013 A NA
2014 A 3

So nothing should be drawn for group A at 2013, and then a point appears for group A at 2014.

[image]


1 Answer


You can use geom_errorbar with no range (geom_errorbarh for the horizontal version). Then just subset to the complete.cases (or use !is.na(df$value)).

library(ggplot2)

set.seed(10)

year <- c(2000:2014)
group <- c("A","A","A","A","A","A","A","A","A","A","A","A","A","A","A",
       "B","B","B","B","B","B","B","B","B","B","B","B","B","B","B",
       "C","C","C","C","C","C","C","C","C","C","C","C","C","C","C")
value <- sample(1:5, 45, replace=TRUE)

df <- data.frame(year,group,value)
df$value[df$value==1] <- NA

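# Keep only the rows where a value is available
no_na_df <- df[complete.cases(df), ]

# xmin == xmax gives each error bar zero width, so every available
# year is drawn as a short vertical tick on its group's row
ggplot(no_na_df, aes(x=year, y = group)) + 
    geom_errorbarh(aes(xmax = year, xmin = year), size = 2)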
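
As a small check on the subsetting step: complete.cases(df) drops rows with an NA in any column, while !is.na(df$value) looks at value only. For this data frame the two should agree, because value is the only column that can contain NA. A minimal sketch; the names by_complete and by_value are mine, not part of the answer.

```r
set.seed(10)

# Same sample data as above, built with rep() for brevity
df <- data.frame(
  year  = rep(2000:2014, times = 3),
  group = rep(c("A", "B", "C"), each = 15),
  value = sample(1:5, 45, replace = TRUE)
)
df$value[df$value == 1] <- NA

# Two ways to drop the years without a value (illustrative names)
by_complete <- df[complete.cases(df), ]  # no NA in any column
by_value    <- df[!is.na(df$value), ]    # no NA in `value` only

# Only `value` contains NA here, so both subsets hold the same rows
print(identical(by_complete, by_value))
```

If other columns could contain NA as well, !is.na(df$value) is probably the safer choice, since it only removes rows where the plotted variable is missing.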


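Edit: To get a continuous bar, you can use this slightly unappealing approach. The group data has to be given a numeric representation so that the bars can be given a width. After that, we can make the scale present the variable as discrete again.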
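
The numeric representation relies on group being a factor. Since R 4.0.0, data.frame() leaves strings as character vectors by default, so it may help to see the difference first. A minimal sketch; the variable names are illustrative only.

```r
group <- c("A", "B", "C")

# A factor stores one integer code per level, so this yields 1, 2, 3
codes_from_factor <- as.numeric(factor(group))

# A plain character vector has no codes; the conversion yields NA
# (R also emits a coercion warning, silenced here)
codes_from_character <- suppressWarnings(as.numeric(group))

print(codes_from_factor)
print(codes_from_character)
```

Converting the column with factor() before calling as.numeric() therefore keeps the approach working on current R versions.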

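# group must be a factor for as.numeric() and levels() to work
# (R >= 4.0 no longer converts strings to factors in data.frame())
df$group <- factor(df$group)
df$group_n <- as.numeric(df$group)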

no_na_df <- df[complete.cases(df), ]

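# Each available year becomes a rectangle one year wide,
# centred on the numeric position of its group
ggplot(no_na_df, aes(xmin=year-0.5, xmax=year+0.5, y = group_n)) + 
    geom_rect(aes(ymin = group_n-0.1, ymax = group_n+0.1)) +
    # Label the numeric positions with the group names again
    scale_y_discrete(limits = levels(df$group))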
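
A different route to continuous bars, offered only as a sketch and not as the method above: collapse consecutive available years into runs, then draw one segment per run with geom_segment, which avoids the numeric group column. The names ok, new_run, and runs are my own, and the plotting step assumes ggplot2 is installed.

```r
set.seed(10)

# Same sample data as above, built with rep() for brevity
df <- data.frame(
  year  = rep(2000:2014, times = 3),
  group = rep(c("A", "B", "C"), each = 15),
  value = sample(1:5, 45, replace = TRUE)
)
df$value[df$value == 1] <- NA

# Rows with a value, sorted so consecutive years sit next to each other
ok <- df[!is.na(df$value), ]
ok <- ok[order(ok$group, ok$year), ]

# A new run starts at the first row, when a year is skipped,
# or when the group changes
new_run <- c(TRUE, diff(ok$year) != 1 | ok$group[-1] != ok$group[-nrow(ok)])

# A run ends on the row just before the next run starts (or on the last row)
run_end <- c(new_run[-1], TRUE)

# One row per run: group, first year, last year
runs <- data.frame(
  group = ok$group[new_run],
  start = ok$year[new_run],
  end   = ok$year[run_end]
)
print(runs)

# Build the plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(runs, aes(y = group)) +
    geom_segment(aes(x = start - 0.5, xend = end + 0.5, yend = group),
                 size = 3) +
    labs(x = "year")
}
```

Printing p draws the plot. Padding each run by half a year on both sides mirrors the year-0.5 and year+0.5 limits used with geom_rect, so a single isolated year still shows up as a short bar.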


answered 2015-10-16T07:00:54.057