r - Creating a dummy variable

Question

This is what my data looks like:

         Attribute        Time     V1 V2 V3 V4
1 pmEulRlcUserPacketThp 2013-04-30 12 51 34 17
2 pmEulRlcUserPacketThp 2013-04-30 84 28 17 10
3 pmEulRlcUserPacketThp 2013-04-30 11 43 28 15
4 pmEulRlcUserPacketThp 2013-04-30 80 26 17 91
5 pmEulRlcUserPacketThp 2013-04-26 10 41 25 13
6 pmEulRlcUserPacketThp 2013-04-25 97 35 23 12

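I want to create a dummy column "t" that takes the same value for rows with the same date: for example, 1 for 2013-04-30, 2 for 2013-04-26, and 3 for 2013-04-25. The data set is large, so a solution with little manual work would be very helpful. The data I need looks like this: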

         Attribute        Time     t V1 V2 V3 V4
1 pmEulRlcUserPacketThp 2013-04-30 1 12 51 34 17
2 pmEulRlcUserPacketThp 2013-04-30 1 84 28 17 10
3 pmEulRlcUserPacketThp 2013-04-30 1 11 43 28 15
4 pmEulRlcUserPacketThp 2013-04-30 1 80 26 17 91
5 pmEulRlcUserPacketThp 2013-04-26 2 10 41 25 13
6 pmEulRlcUserPacketThp 2013-04-25 3 97 35 23 12

score 2 · Accepted Answer

Assuming your data.frame is called dfr, try:

dfr$t <- as.numeric(as.factor(dfr$Time))
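One caveat worth checking: as.factor sorts its levels, so with dates stored as character strings the earliest date gets code 1, which is the reverse of the numbering shown in the question. A minimal sketch, assuming Time is a character column and that the numbering should follow the order in which each date first appears (the toy dfr below is rebuilt from the question's Time values only):

```r
# Toy data mirroring the question's Time column (the name dfr follows the answer).
dfr <- data.frame(
  Time = c("2013-04-30", "2013-04-30", "2013-04-30",
           "2013-04-30", "2013-04-26", "2013-04-25"),
  stringsAsFactors = FALSE
)

# Factor levels are sorted, so the earliest date (2013-04-25) gets code 1.
t_sorted <- as.numeric(as.factor(dfr$Time))

# Number each date by the position of its first appearance instead.
t_first <- match(dfr$Time, unique(dfr$Time))

dfr$t <- t_first
```

Here t_sorted is 3 3 3 3 2 1, while t_first is 1 1 1 1 2 3, which matches the layout requested in the question.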

answered 2013-08-22T08:26:46.143

score 0 · Accepted Answer

I can't tell if you are looking for just as.factor or if you need some kind of cumulative count of consecutive dates, in which case you could do this...

df$t <- cumsum( c( 1 , ! head(df$Time,-1) == tail(df$Time,-1) ) )

#              Attribute       Time V1 V2 V3 V4 t
#1 pmEulRlcUserPacketThp 2013-04-30 12 51 34 17 1
#2 pmEulRlcUserPacketThp 2013-04-30 84 28 17 10 1
#3 pmEulRlcUserPacketThp 2013-04-30 11 43 28 15 1
#4 pmEulRlcUserPacketThp 2013-04-30 80 26 17 91 1
#5 pmEulRlcUserPacketThp 2013-04-26 10 41 25 13 2
#6 pmEulRlcUserPacketThp 2013-04-25 97 35 23 12 3

We compare consecutive values of the Time column against one another to see if they are the same. Using the ! operator we get FALSE if they are the same and TRUE if they are different. We can then cumsum this to get our result (with an initial 1 to start the ball rolling).
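The intermediate steps can be spelled out as follows. This is only an illustrative sketch: the vector names Time, changed, run_id, x and x_runs are made up for the example, and != is used as an equivalent of negating ==.

```r
Time <- c("2013-04-30", "2013-04-30", "2013-04-30",
          "2013-04-30", "2013-04-26", "2013-04-25")

# TRUE wherever a value differs from the one before it (length n - 1).
changed <- head(Time, -1) != tail(Time, -1)
# FALSE FALSE FALSE TRUE TRUE

# Prepend 1 for the first row, then accumulate the change flags.
run_id <- cumsum(c(1, changed))
# 1 1 1 1 2 3

# Note: this numbers consecutive runs, so a value that reappears later
# starts a new run rather than reusing its earlier number.
x <- c("a", "a", "b", "a")
x_runs <- cumsum(c(1, head(x, -1) != tail(x, -1)))
# 1 1 2 3
```

If equal dates should always share one number regardless of position, a run-based counter like this is probably not what is wanted; it fits best when the data are already grouped by date, as in the question.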
