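I want to classify data in R in a particular way.
Suppose I have the following data (a plain integer vector rather than a true data frame):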
> data = sample(1:500, 5000, replace = TRUE)
To classify this data, I create the following classes:
> data.cl = cut(data, breaks = c(seq(0,100,by=10), 200, 350, 480, 500))
> table(data.cl)
data.cl
(0,10] (10,20] (20,30] (30,40] (40,50]
102 80 87 113 117
(50,60] (60,70] (70,80] (80,90] (90,100]
101 89 95 106 104
(100,200] (200,350] (350,480] (480,500]
1002 1492 1318 194
If I want 0 to be included, I just need to add include.lowest = TRUE:
> data.cl = cut(data, breaks = c(seq(0,100,by=10), 200, 350, 480, 500),
+ include.lowest = TRUE)
> table(data.cl)
data.cl
[0,10] (10,20] (20,30] (30,40] (40,50]
102 80 87 113 117
(50,60] (60,70] (70,80] (80,90] (90,100]
101 89 95 106 104
(100,200] (200,350] (350,480] (480,500]
1002 1492 1318 194
In this example that makes no difference, because 0 does not occur in this data at all. But if it did, say 4 times, the class [0,10] would contain 106 elements instead of 102:
> data.cl = cut(data, breaks = c(seq(0,100,by=10), 200, 350, 480, 500),
+ include.lowest = TRUE)
> table(data.cl)
data.cl
[0,10] (10,20] (20,30] (30,40] (40,50]
106 80 87 113 117
(50,60] (60,70] (70,80] (80,90] (90,100]
101 89 95 106 104
(100,200] (200,350] (350,480] (480,500]
1002 1492 1318 194
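To make that effect reproducible, here is a minimal sketch with made-up values. The toy vector x and the breaks brks are assumptions for illustration only and are not part of my data:

```r
# Toy data (made up): 0 occurs four times
x <- c(0, 0, 0, 0, 3, 10, 15)
brks <- c(0, 10, 20)

# Default: intervals are (a,b], so 0 lies outside (0,10] and becomes NA
open_first <- cut(x, breaks = brks)
print(table(open_first, useNA = "ifany"))

# include.lowest = TRUE closes the first interval on the left: [0,10]
closed_first <- cut(x, breaks = brks, include.lowest = TRUE)
print(table(closed_first, useNA = "ifany"))
```

Without include.lowest the four zeros are counted as NA. With it, they are added to the first class.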
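There is another option for changing the class limits. The default for cut() is right = TRUE, which gives intervals that are closed on the right. If you change it to right = FALSE, you get: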
> data.cl = cut(data, breaks = c(seq(0,100,by=10), 200, 350, 480, 500),
+ include.lowest = TRUE, right = FALSE)
> table(data.cl)
data.cl
[0,10) [10,20) [20,30) [30,40) [40,50)
92 81 87 111 118
[50,60) [60,70) [70,80) [80,90) [90,100)
103 89 94 103 103
[100,200) [200,350) [350,480) [480,500]
1003 1497 1320 199
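include.lowest now effectively acts as "include.highest". This comes at the cost of changing the class limits: some classes now contain a different number of members, because their boundaries have shifted slightly.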
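A small sketch of how right = FALSE and include.lowest interact, again with made-up values (x and brks are hypothetical, chosen so that every value sits on or near a break):

```r
# Toy data (made up): values on the breaks plus one non-integer
x <- c(0, 10, 480, 499.5, 500)
brks <- c(0, 10, 480, 500)

# right = FALSE: intervals are [a,b), so the top value 500 becomes NA
left_closed <- cut(x, breaks = brks, right = FALSE)
print(left_closed)

# Adding include.lowest = TRUE closes the LAST interval on the right: [480,500]
both_closed <- cut(x, breaks = brks, right = FALSE, include.lowest = TRUE)
print(both_closed)
```

So right = FALSE alone does exclude 500, but it also moves every other boundary value (10, 480) into the next class up.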
But what if I want a classification like the following, where the last class is open on the right,
> data.cl = cut(data, breaks = c(seq(0,100,by=10), 200, 350, 480, 500))
> table(data.cl)
data.cl
(0,10] (10,20] (20,30] (30,40] (40,50]
102 80 87 113 117
(50,60] (60,70] (70,80] (80,90] (90,100]
101 89 95 106 104
(100,200] (200,350] (350,480] (480,500)
1002 1492 1318 194
so that 500 is excluded as well? What should I do?
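Of course, one could say: "Just write data.cl = cut(data, breaks = c(seq(0,100,by=10), 200, 350, 480, 499))
instead of data.cl = cut(data, breaks = c(seq(0,100,by=10), 200, 350, 480, 500)),
because you are dealing with integers."
True, but what if that is not the case and I am working with floating-point numbers instead? How do I exclude 500 then?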
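To illustrate why the 499 trick breaks down for non-integer data, here is a short sketch with made-up values (the vector x is hypothetical):

```r
# Toy non-integer data (made up) close to the upper limit
x <- c(480.5, 499.2, 499.9, 500)

# Upper break 500: the value 500 is still included in (480,500]
with_500 <- cut(x, breaks = c(480, 500))
print(with_500)

# Upper break 499: 500 is dropped, but 499.2 and 499.9 become NA as well
with_499 <- cut(x, breaks = c(480, 499))
print(with_499)
```

Neither call gives me the class (480,500) that I am after: the first keeps 500, and the second loses every value between 499 and 500.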