I'm very new to R, and I'm trying to split a continuous variable into two classes. Suppose I have the following:

y = c(6.3, 6.2, 6.2, 5.5, 6.9, 6.8, 5.3, 5.3, 5.4, 5.2, 7.2, 7.1, 8.1, 8.2, 8.2, 7.4, 6.7, 7.2, 7.9, 8.0, 6.5, 6.6, 6.5, 7.2, 7.2, 6.8, 6.7)
cuts = cut(y, breaks=2)
cuts
[1] (5.197,6.7] (5.197,6.7] (5.197,6.7] (5.197,6.7] (6.7,8.203] (6.7,8.203] (5.197,6.7] (5.197,6.7] (5.197,6.7] (5.197,6.7] (6.7,8.203]
[12] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203] (5.197,6.7] (5.197,6.7]
[23] (5.197,6.7] (6.7,8.203] (6.7,8.203] (6.7,8.203] (6.7,8.203]
Levels: (5.197,6.7] (6.7,8.203]

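I'm particularly interested in the value 6.7 at the end of the vector. Why does 6.7 fall into the interval (6.7, 8.203] rather than (5.197, 6.7]? As I understand it, 6.7 should not belong to (6.7, 8.203]. Am I missing something? Thanks for your help!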

Edit:

As pointed out in the comments, 6.7 is actually stored as 6.7000000000000001776:

options(digits=20);
y
 [1] 6.2999999999999998224 6.2000000000000001776 6.2000000000000001776 5.5000000000000000000 6.9000000000000003553 6.7999999999999998224
 [7] 5.2999999999999998224 5.2999999999999998224 5.4000000000000003553 5.2000000000000001776 7.2000000000000001776 7.0999999999999996447
[13] 8.0999999999999996447 8.1999999999999992895 8.1999999999999992895 7.4000000000000003553 6.7000000000000001776 7.2000000000000001776
[19] 7.9000000000000003553 8.0000000000000000000 6.5000000000000000000 6.5999999999999996447 6.5000000000000000000 7.2000000000000001776
[25] 7.2000000000000001776 6.7999999999999998224 6.7000000000000001776
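
A rough sketch of what may be going on, based on the documented behaviour of `cut()` when `breaks` is a single number (the range of the data is divided into equal-length pieces): the middle breakpoint is itself computed in floating point, so it need not be identical to the stored value of 6.7, and the default labels show it rounded to only a few significant digits. The names `rng` and `mid` are mine, and `mid` only approximates the internal breakpoint rather than reproducing it exactly.

```r
y <- c(6.3, 6.2, 6.2, 5.5, 6.9, 6.8, 5.3, 5.3, 5.4, 5.2, 7.2, 7.1, 8.1, 8.2,
       8.2, 7.4, 6.7, 7.2, 7.9, 8.0, 6.5, 6.6, 6.5, 7.2, 7.2, 6.8, 6.7)

rng <- range(y)

# Midpoint of the data range, computed in floating point.
# Assumption: this approximates the middle breakpoint cut() derives internally.
mid <- rng[1] + diff(rng) / 2

# Print both values with 17 significant digits to compare them.
print(mid, digits = 17)
print(6.7, digits = 17)

# Exact comparison of the two doubles.
mid == 6.7

# dig.lab controls how many digits are used when formatting the labels,
# so a larger value shows the breakpoints with less rounding.
levels(cut(y, breaks = 2, dig.lab = 17))
```

Comparing the two printed values, together with the less rounded labels, should show whether the breakpoint sits slightly below the stored 6.7.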

A follow-up question:

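I want to save the interval ranges for later reference, because I need to check which interval new elements fall into. So imagine I have the intervals (5.197,6.7] (6.7,8.203] produced by cut, and now I get a new element x = 6.7 and want to check which interval it falls into. When I check 5.197 < x <= 6.7, it falls into the first interval, whereas the 6.7 in my original vector fell into the second interval.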

Is cuts = cut(y, breaks=2, dig.lab=17) really the way to make the two elements fall into the same interval?
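
One possible alternative, offered only as a sketch: instead of reading the limits back from the printed labels, store the numeric breakpoints themselves and pass the same vector to `cut()` for both the original data and any new value. The helper `make_breaks` is hypothetical; it follows the documented recipe for a single-number `breaks` (equal-length pieces over the range, with the outer limits moved out by 0.1% of the range), but it is not guaranteed to match the internal breakpoints bit for bit. Consistency here comes from reusing one saved vector, not from matching the internals.

```r
y <- c(6.3, 6.2, 6.2, 5.5, 6.9, 6.8, 5.3, 5.3, 5.4, 5.2, 7.2, 7.1, 8.1, 8.2,
       8.2, 7.4, 6.7, 7.2, 7.9, 8.0, 6.5, 6.6, 6.5, 7.2, 7.2, 6.8, 6.7)

# Hypothetical helper: build n equal-width intervals over range(x),
# widening the two outer limits by 0.1% of the range.
make_breaks <- function(x, n) {
  rng <- range(x)
  pad <- diff(rng) / 1000
  b <- seq(rng[1], rng[2], length.out = n + 1)
  b[1] <- rng[1] - pad
  b[n + 1] <- rng[2] + pad
  b
}

# Save the numeric breakpoints (not the rounded labels) for later use.
saved_breaks <- make_breaks(y, 2)

# Bin the original data and a new value with the very same breakpoints;
# labels = FALSE returns integer interval codes instead of a factor.
old_bins <- cut(y, breaks = saved_breaks, labels = FALSE)
new_bin <- cut(6.7, breaks = saved_breaks, labels = FALSE)

# The new 6.7 gets the same interval code as the 6.7 stored in y[27].
identical(new_bin, old_bins[27])
```

Because both calls compare against the same stored doubles, an equal input value lands in the same interval regardless of how the labels are rounded for display.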
