I have a data frame as follows:
df <- data.frame(as.date=c("14/06/2016","15/06/2016","16/06/2016","17/06/2016","18/06/2016","19/06/2016","20/06/2016","21/06/2016","22/06/2016","23/06/2016",
"24/06/2016","04/07/2016","05/07/2016","06/07/2016","07/07/2016","08/07/2016","09/07/2016","10/07/2016","11/07/2016","12/07/2016",
"13/07/2016","14/07/2016","15/07/2016","17/07/2016","18/07/2016","19/07/2016","20/07/2016","21/07/2016","22/07/2016","01/08/2016",
"02/08/2016","03/08/2016","04/08/2016","05/08/2016","06/08/2016","07/08/2016","08/08/2016","09/08/2016","10/08/2016","11/08/2016",
"12/08/2016","13/08/2016","14/08/2016","15/08/2016","16/08/2016","17/08/2016","18/08/2016","19/08/2016","20/08/2016","21/08/2016",
"22/08/2016","23/08/2016","24/08/2016","25/08/2016","26/08/2016","27/08/2016","28/08/2016","29/08/2016","30/08/2016","31/08/2016",
"01/09/2016","02/09/2016","03/09/2016","04/09/2016","05/09/2016","06/09/2016","07/09/2016","08/09/2016","09/09/2016","10/09/2016",
"11/09/2016","12/09/2016","13/09/2016","14/09/2016","15/09/2016","16/09/2016","17/09/2016","18/09/2016","19/09/2016","20/09/2016"),
wear=c("0","55","0","0","0","0","8","8","15","25","30","37","43","49","52","52","55","57","57","61","67","69","2","2","7",
"10","13","14","16","16","19","22","22","24","25","26","29","29","33","34","34","36","38","44","45","48","50","55",
"56","58","0","4","0","4","4","6","9","9","12","14","16","17","25","25","33","36","44","46","48","52","55","59",
"8","9","9","12","24","33","36","44"))
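The data is an example of the wear rate of a metal part on a machine. Wear increases over time and then drops to 0, which indicates an event or a change.
The problem I have is that the wear value does not always drop to 0. As the data shows, there are 2 variables:
as.date = the date over time, wear = the metal wear on the part over time
The ranges between the changes are: 55-0, 69-2, 58-0, 59-8
When wear drops from a large number to 0 it is easy to code. I use the following code to mark the changes and add new variables called Status & id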
{
# Creates 2 new columns, Status & id
df$wear <- as.numeric(as.character(df$wear)) # wear was entered as text, convert it to numbers
df$Status <- 0 # fills in column Status with all zeros
df$Status[df$wear == 0] <- 1 # fill in 1s when wear = 0
prop.table(table(df$Status)) # proportion of rows with each Status value
df$id <- 1 # fills in column id with 1s
for (i in 2:nrow(df)) {
  if (df$Status[i-1] == 0) {
    df$id[i] <- df$id[i-1] # no event on the previous row: keep the same id
  } else {
    df$id[i] <- df$id[i-1] + 1 # previous row was an event: start a new id
  }
}
}
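This works fine when the wear value drops to 0, but not otherwise. In the data example the wear drops occur over the ranges 55-0, 69-2, 58-0, 59-8, and in the real data set the wear value sometimes drops to a negative number. I am not sure of the correct way to handle this. I tried binning and grouping the data, but without success.
This is a sample of the data. The real data set has more than 100 events, mostly with wear dropping to 0, but 10-20 of them drop to a negative value or to a value < 10.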
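One direction that might work, shown only as a minimal sketch: instead of testing for wear == 0, flag a change wherever wear falls sharply compared with the previous row. The threshold min_drop (20 here) is an assumption picked for this sample, not something taken from the real data set, and the names step and min_drop are introduced only for illustration. Unlike the loop above, this puts Status = 1 on the row where the drop lands and starts the new id on that same row.

```r
# wear values copied from the sample data above, as numbers
wear <- c(0, 55, 0, 0, 0, 0, 8, 8, 15, 25, 30, 37, 43, 49, 52, 52, 55, 57, 57, 61, 67, 69, 2, 2, 7,
          10, 13, 14, 16, 16, 19, 22, 22, 24, 25, 26, 29, 29, 33, 34, 34, 36, 38, 44, 45, 48, 50, 55,
          56, 58, 0, 4, 0, 4, 4, 6, 9, 9, 12, 14, 16, 17, 25, 25, 33, 36, 44, 46, 48, 52, 55, 59,
          8, 9, 9, 12, 24, 33, 36, 44)

# Assumed threshold: a fall of at least this much counts as a change
min_drop <- 20

# Change in wear relative to the previous row; the first row has no predecessor, so use 0
step <- c(0, diff(wear))

# Status is 1 on the row where wear has just fallen by min_drop or more
Status <- as.integer(step <= -min_drop)

# id starts at 1 and increases by 1 at every flagged row
id <- cumsum(Status) + 1

# List the flagged rows together with the value before and after the drop
flagged <- which(Status == 1)
drops <- data.frame(row = flagged, from = wear[flagged - 1], to = wear[flagged])
print(drops)
```

Because the test looks at the size of the fall rather than the value it lands on, a drop to a negative value or to a small positive value is treated the same way as a drop to 0. On this sample it flags the four drops 55-0, 69-2, 58-0 and 59-8, and ignores the small 4-0 dip. Whether a single fixed threshold suits all 100+ events in the real data is something I have not checked.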