我有一个如下所示的data.frame。
toolid startdate enddate stage
abc 1-Jan-13 5-Jan-13 production
abc 6-Jan-13 10-Jan-13 down
xyz 3-Jan-13 8-Jan-13 production
xyz 9-Jan-13 15-Jan-13 down
我想将 data.frame 转换为以下格式。我正在尝试将列'startdate'
和'enddate'
从上面的 data.frame 组合成一个名为'date'
下面的列。我拥有的原始数据在toolids
许多阶段有几千行。我已经找到了一种使用 SQL 的方法,但更喜欢 R 解决方案。我已经开始融化数据,如下面的代码所示。
toolid date stage
abc 1-Jan-13 production
abc 2-Jan-13 production
abc 3-Jan-13 production
abc 4-Jan-13 production
abc 5-Jan-13 production
abc 6-Jan-13 down
abc 7-Jan-13 down
abc 8-Jan-13 down
abc 9-Jan-13 down
abc 10-Jan-13 down
xyz 3-Jan-13 production
xyz 4-Jan-13 production
xyz 5-Jan-13 production
xyz 6-Jan-13 production
xyz 7-Jan-13 production
xyz 8-Jan-13 production
xyz 9-Jan-13 down
xyz 10-Jan-13 down
xyz 11-Jan-13 down
xyz 12-Jan-13 down
xyz 13-Jan-13 down
xyz 14-Jan-13 down
xyz 15-Jan-13 down
R代码
startdate=c('1-Jan-13','6-Jan-13','3-Jan-13','9-Jan-13')
enddate=c('5-Jan-13', '10-Jan-13', '8-Jan-13', '15-Jan-13')
toolid=c('abc', 'abc', 'xyz', 'xyz')
stage=c('production', 'down', 'production', 'down')
data=data.frame(toolid,startdate,enddate,stage)
require(reshape2)
newdata=melt(data,id.vars=c('toolid','stage'))
更新:来自@Ananda Mahto答案的应对代码并添加几行代码以提供数据透视表类型的输出
## Convert "startdate" and "enddate" to date objects
data$startdate <- as.Date(data$startdate, format="%d-%b-%y")
data$enddate <- as.Date(data$enddate, format="%d-%b-%y")
## Use `seq` to create the date sequence, and manually recreate
## your dataframe. `do.call(rbind, ...) to put it back together
ddd=do.call(rbind, lapply(sequence(nrow(data)), function(x) {
data.frame(toolid = data$toolid[x],
date = seq(data$startdate[x], data$enddate[x], by = 1),
stage = data$stage[x])
}))
ddd
toolid date stage
1 abc 2013-01-01 production
2 abc 2013-01-02 production
3 abc 2013-01-03 production
4 abc 2013-01-04 production
5 abc 2013-01-05 production
6 abc 2013-01-06 down
7 abc 2013-01-07 down
8 abc 2013-01-08 down
9 abc 2013-01-09 down
10 abc 2013-01-10 down
11 xyz 2013-01-03 production
12 xyz 2013-01-04 production
13 xyz 2013-01-05 production
14 xyz 2013-01-06 production
15 xyz 2013-01-07 production
16 xyz 2013-01-08 production
17 xyz 2013-01-09 down
18 xyz 2013-01-10 down
19 xyz 2013-01-11 down
20 xyz 2013-01-12 down
21 xyz 2013-01-13 down
22 xyz 2013-01-14 down
23 xyz 2013-01-15 down
ddd1=dcast(ddd,date~stage)
ddd1
date down production
1 2013-01-01 0 1
2 2013-01-02 0 1
3 2013-01-03 0 2
4 2013-01-04 0 2
5 2013-01-05 0 2
6 2013-01-06 1 1
7 2013-01-07 1 1
8 2013-01-08 1 1
9 2013-01-09 2 0
10 2013-01-10 2 0
11 2013-01-11 1 0
12 2013-01-12 1 0
13 2013-01-13 1 0
14 2013-01-14 1 0
15 2013-01-15 1 0