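I have a CSV file with multiple rows per unique id, and I need to reshape it so that each id occupies a single row of a data frame. After reading the file in, I have this initial data frame: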

id  week   v1  v2
01  week1  3   2
01  week2  5   2
01  week3  2   3
02  week1  1   2
02  week2  5   5
03  week1  4   1
03  week2  4   3
03  week3  4   2
[etc...]

I want to extract all instances of v1 for a given id, so I first get all the unique ids:

uniqid<-unique(data$id)

then loop over them, with i running over 1:length(uniqid):

temp <- subset(data,data$id==uniqid[i])

and pull each week's data into a temporary variable:

week1 <- temp$v1[temp$week == "week1"]

so that I can rebuild the data frame with rbind:

output <- rbind(output,data.frame(ID=uniqid[i],week1,week2,week3))

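My problem is that some ids, for example id=02, have no week3, so rbind breaks. It seems the week3 variable is never created; it does not show up as NA. How can I test whether the variable was created and set it to NA (or 0) so that rbind does not fail? Or is there a completely different, more efficient way to do this?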

2 Answers

In base R, you can use reshape:

> reshape(mydf, direction = "wide", idvar="id", timevar="week")
  id v1.week1 v2.week1 v1.week2 v2.week2 v1.week3 v2.week3
1  1        3        2        5        2        2        3
4  2        1        2        5        5       NA       NA
6  3        4        1        4        3        4        2

If you want to drop the "v2" column from the output, you can either remove it before reshaping the data or drop it inside the reshape call:

> reshape(mydf, direction = "wide", idvar="id", timevar="week", drop="v2")
  id v1.week1 v1.week2 v1.week3
1  1        3        5        2
4  2        1        5       NA
6  3        4        4        4
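
For completeness, here is a self-contained sketch of the first option mentioned above: removing "v2" before reshaping instead of using the drop argument. The data frame mydf is rebuilt by hand from the question's sample rows, and the renaming step at the end is an optional addition for illustration, not something the reshape call requires.

```r
# Rebuild the sample data from the question (ids stored as numbers).
mydf <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3, 3),
  week = c("week1", "week2", "week3", "week1", "week2",
           "week1", "week2", "week3"),
  v1   = c(3, 5, 2, 1, 5, 4, 4, 4),
  v2   = c(2, 2, 3, 2, 5, 1, 3, 2),
  stringsAsFactors = FALSE
)

# Keep only the columns that are needed, then reshape to one row per id.
# Weeks missing for an id (week3 for id 2) come out as NA.
wide <- reshape(mydf[, c("id", "week", "v1")],
                direction = "wide", idvar = "id", timevar = "week")

# Optional: strip the "v1." prefix so the columns read week1, week2, week3.
names(wide) <- sub("^v1\\.", "", names(wide))
wide
```

If zeros are preferred over NA, wide[is.na(wide)] <- 0 replaces them after the reshape.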
answered 2013-10-22T13:15:58.803

You can use the recast function from the reshape2 package.

DF
##   id  week v1 v2
## 1  1 week1  3  2
## 2  1 week2  5  2
## 3  1 week3  2  3
## 4  2 week1  1  2
## 5  2 week2  5  5
## 6  3 week1  4  1
## 7  3 week2  4  3
## 8  3 week3  4  2


require(reshape2)
temp <- recast(DF, id ~ week, measure.var = "v1")
result <- temp$data
row.names(result) <- temp$labels[[1]]$id
colnames(result) <- temp$labels[[2]]$week
result
##   week1 week2 week3
## 1     3     5     2
## 2     1     5    NA
## 3     4     4     4

Or, as @AnandaMahto suggested, simply use dcast:

dcast(DF, id ~ week, value.var = "v1")
##   id week1 week2 week3
## 1  1     3     5     2
## 2  2     1     5    NA
## 3  3     4     4     4
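
The question also asked about getting 0 rather than NA for missing weeks. A minimal sketch, assuming the same DF as above (rebuilt by hand here so the snippet runs on its own) and using dcast's fill argument; the name result0 is only illustrative:

```r
library(reshape2)

# Sample data from the question, rebuilt by hand.
DF <- data.frame(
  id   = c(1, 1, 1, 2, 2, 3, 3, 3),
  week = c("week1", "week2", "week3", "week1", "week2",
           "week1", "week2", "week3"),
  v1   = c(3, 5, 2, 1, 5, 4, 4, 4),
  v2   = c(2, 2, 3, 2, 5, 1, 3, 2),
  stringsAsFactors = FALSE
)

# fill = 0 is used for id/week combinations that are absent from the data
# (here: id 2, week3), instead of the default NA.
result0 <- dcast(DF, id ~ week, value.var = "v1", fill = 0)
result0
```

Whether 0 is appropriate depends on the data: it hides the difference between "no observation" and an observed value of zero, so NA may be the safer default.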
answered 2013-10-22T12:52:55.967