
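I have rle() class objects created for each individual ID in my dataset, and I now want to plot them in separate histograms that show the frequency of the various run lengths, to get an idea of their distribution, but I can't seem to figure out how to do this.

I obtained a list of rle() class objects by running the rle() function on the data for each ID, using the following code: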

list.runs<-dlply(data.1, .(ID), function(x) rle(x$flights))

But this makes it impossible to transfer the data into a data frame, because rle() objects cannot be coerced to a data frame. So I unclassed them:

list.runs<-dlply(data.1, .(ID), function(x) unclass(rle(x$flights)))

But I still cannot put these data into a data frame, because the lists have different lengths.

runs<-ldply(do.call(data.frame,list.runs))

Error in function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
arguments imply differing number of rows: 14, 13
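To make the cause of this error concrete, here is a minimal base R sketch. The `toy` data frame and the names `runs_by_id`, `n_runs` and `res` are made up for illustration and are not part of the original data; the point is only that each ID yields a different number of runs, so the pieces cannot sit side by side as columns.

```r
# Toy data (hypothetical, much smaller than data.1)
toy <- data.frame(
  ID      = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  flights = c(1, 1, 0, 1, 1, 0, 0, 0, 1)
)

# One unclassed rle() result per ID: a plain list with $lengths and $values
runs_by_id <- lapply(split(toy$flights, toy$ID), function(f) unclass(rle(f)))

# Number of runs per ID; these differ between IDs
n_runs <- sapply(runs_by_id, function(r) length(r$lengths))

# Binding the pieces column-wise fails because the row counts differ;
# tryCatch() captures the error message instead of stopping the script
res <- tryCatch(
  do.call(data.frame, runs_by_id),
  error = function(e) conditionMessage(e)
)
```

Stacking the pieces row-wise (one row per run, with an ID column) avoids the problem, since rows do not need to line up across IDs.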

Question: how do I plot a histogram of the lengths values for each individual ID?

Data (simplified):

> dput(data.1)
structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), flights = c(1, 1, 1, 
1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 
0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 
1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 
1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 
1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1)), .Names = c("ID", "flights"
), row.names = c(NA, -100L), class = "data.frame")

2 Answers


I'm not sure exactly what you want to do, but here is one way to go about it:

require(plyr)
list.runs <- ddply(data.1, .(ID), function(x) {
    rr <- rle(x$flights)
    data.frame(freq=rr$lengths, xvar=seq_along(rr$lengths))
})

require(ggplot2)
ggplot(data = list.runs, aes(x = factor(xvar), y = freq)) + 
        geom_bar(stat = "identity", aes(fill=factor(ID))) + 
          facet_wrap( ~ ID, ncol=2)

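This gives one panel per ID, with one bar per run in order of occurrence and the bar height equal to the run length.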

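Edit: following the OP's comment: you can also get this directly from the same data. In fact, you don't need to generate "xvar" for what you are asking. From list.runs: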

# geom_bar() counts the rows for each x value by default,
# so no weight aesthetic is needed here
ggplot(data = list.runs, aes(x = factor(freq))) + 
     geom_bar(aes(fill=factor(ID))) + 
     facet_wrap( ~ ID, ncol=2)

This gives one panel per ID showing how many runs there are of each length.
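If plyr and ggplot2 are not available, the same per-ID frequencies can be sketched in base R. This is only an illustration on made-up data: `toy`, `len_by_id` and `freq_by_id` are hypothetical names, and `barplot()` on a `table()` stands in for the faceted ggplot2 chart.

```r
# Toy data (hypothetical, much smaller than data.1)
toy <- data.frame(
  ID      = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  flights = c(1, 1, 0, 1, 1, 0, 0, 0, 1)
)

# Run lengths for each ID
len_by_id <- lapply(split(toy$flights, toy$ID), function(f) rle(f)$lengths)

# How often each run length occurs, per ID
freq_by_id <- lapply(len_by_id, table)

# One bar chart per ID, drawn side by side
op <- par(mfrow = c(1, length(freq_by_id)))
for (id in names(freq_by_id)) {
  barplot(freq_by_id[[id]], main = paste("ID", id),
          xlab = "run length", ylab = "count")
}
par(op)  # restore the previous plotting layout
```

Note that this counts runs of 0s and runs of 1s together, exactly as the ggplot2 version above does.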

answered 2013-03-12T14:14:22.737

I think @Arun's method of going straight to the data.frame in a ddply call is the way to go, but just to show one way of how you could go from your list.runs object to a useful data.frame:

df.summary <- ldply(list.runs,function(x,...) do.call(data.frame,x))

library(ggplot2)
ggplot(df.summary, aes(factor(lengths),values)) + 
  geom_bar(stat = "identity", aes(fill=factor(ID))) + 
  facet_wrap( ~ ID, ncol=2)
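Because a data frame like this keeps the `values` column next to `lengths`, it is also possible to count only the runs of one value. The sketch below assumes that only runs of `flights == 1` are of interest, which the question does not state; `toy`, `runs_df`, `flight_runs` and `flight_tab` are hypothetical names, and the data frame is rebuilt in base R so the example runs on its own.

```r
# Toy data (hypothetical, much smaller than data.1)
toy <- data.frame(
  ID      = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  flights = c(1, 1, 0, 1, 1, 0, 0, 0, 1)
)

# One row per run, with the ID it belongs to
pieces <- lapply(split(toy, toy$ID), function(d) {
  r <- rle(d$flights)
  data.frame(ID = d$ID[1], lengths = r$lengths, values = r$values)
})
runs_df <- do.call(rbind, pieces)

# Assumption: keep only the runs where flights == 1
flight_runs <- runs_df[runs_df$values == 1, ]

# Cross-tabulate: rows are IDs, columns are run lengths, cells are counts
flight_tab <- xtabs(~ ID + lengths, data = flight_runs)
```

Each row of `flight_tab` can then be passed to `barplot()`, or `flight_runs` can be used as the data for the ggplot2 calls shown in the answers.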


answered 2013-03-12T14:20:24.003