我正在尝试使用 R 函数 quantcut() 将数值变量重新编码为具有对应于分位数的水平的因子。例如:
> X
[1] 6 4 9 6 1 2 5 3 5 7 10 7 2 7 7 5 6 6 3 4 6 4 2 7 6 7
[27] 4 3 5 3 7 6 8 12 4 4 0 1 7 6 7 4 7 1 1 1 2 3 3 1 1 6
[53] 5 3 1 1 1 3 3 3 1 1 3 1 1 1 3 3 0 1 3 1 8 5 3 0 0 2
[79] 1 3 8 0 1 4 1 1 1 1 1 1 3 2 1 4 1 5 5 12 7 2 6 6 2 6
[105] 0 1 4 1 4 0 7 3 2 1 1 8 5 5 3 0 5 6 2 4 2 2 2 6 4 2
[131] 2 2 2 6 8 5 1 2 8 3 2 7 4 6 6 6 7 5 1 5 5 6 1 4 4 5
[157] 6 2 4 7 2 4 10 6 3 5 2 2 6 6 2 4 5 7 4 5 11 6 6 8 2 4
[183] 4 6 12 16 9 7 14 13 11 5 5 2 2 7 7 6 4 3 4 3 5 4 5 7 9 4
[209] 3 12 4 4 4 8 7 6 1 3 6 7 5 5 6 9 6 4 7 8 5 6 3 6 4 7
[235] 3 3 4 7 5 7 5 9 5 8 3 4 3 2 5 2 4 3 8 4 2 2 1 5 3 5
[261] 8 5 6 4 5 1 1 2 6 2 7 2 4 4 3 3 4 10 5 6 10 2 5 5 0 1
[287] 6 2 5 4 6 6 9 5 5 6 3 8 1 5 1 8 5 2 5 2 4 2 4 4
bins=10
labels = 1:bins
library(gtools)
x2 = quantcut(X, q = seq(0, 1, by=1/bins), labels=labels)
我收到错误消息:“cut.default 中的错误(x[!flag],breaks = newquant,include.lowest = TRUE,:'breaks' 不是唯一的”。我认为这是因为分位数中有联系,但是quantcut 的文档专门显示了一个示例,说明该函数如何通过使用更少的间隔来处理关系。无论我是否指定标签参数,都会发生错误。
任何建议将不胜感激。
编辑:这是输入变量 X 的代码:
X = c(6L, 4L, 9L, 6L, 1L, 2L, 5L, 3L, 5L, 7L, 10L, 7L, 2L, 7L, 7L,
5L, 6L, 6L, 3L, 4L, 6L, 4L, 2L, 7L, 6L, 7L, 4L, 3L, 5L, 3L, 7L,
6L, 8L, 12L, 4L, 4L, 0L, 1L, 7L, 6L, 7L, 4L, 7L, 1L, 1L, 1L,
2L, 3L, 3L, 1L, 1L, 6L, 5L, 3L, 1L, 1L, 1L, 3L, 3L, 3L, 1L, 1L,
3L, 1L, 1L, 1L, 3L, 3L, 0L, 1L, 3L, 1L, 8L, 5L, 3L, 0L, 0L, 2L,
1L, 3L, 8L, 0L, 1L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 2L, 1L, 4L,
1L, 5L, 5L, 12L, 7L, 2L, 6L, 6L, 2L, 6L, 0L, 1L, 4L, 1L, 4L,
0L, 7L, 3L, 2L, 1L, 1L, 8L, 5L, 5L, 3L, 0L, 5L, 6L, 2L, 4L, 2L,
2L, 2L, 6L, 4L, 2L, 2L, 2L, 2L, 6L, 8L, 5L, 1L, 2L, 8L, 3L, 2L,
7L, 4L, 6L, 6L, 6L, 7L, 5L, 1L, 5L, 5L, 6L, 1L, 4L, 4L, 5L, 6L,
2L, 4L, 7L, 2L, 4L, 10L, 6L, 3L, 5L, 2L, 2L, 6L, 6L, 2L, 4L,
5L, 7L, 4L, 5L, 11L, 6L, 6L, 8L, 2L, 4L, 4L, 6L, 12L, 16L, 9L,
7L, 14L, 13L, 11L, 5L, 5L, 2L, 2L, 7L, 7L, 6L, 4L, 3L, 4L, 3L,
5L, 4L, 5L, 7L, 9L, 4L, 3L, 12L, 4L, 4L, 4L, 8L, 7L, 6L, 1L,
3L, 6L, 7L, 5L, 5L, 6L, 9L, 6L, 4L, 7L, 8L, 5L, 6L, 3L, 6L, 4L,
7L, 3L, 3L, 4L, 7L, 5L, 7L, 5L, 9L, 5L, 8L, 3L, 4L, 3L, 2L, 5L,
2L, 4L, 3L, 8L, 4L, 2L, 2L, 1L, 5L, 3L, 5L, 8L, 5L, 6L, 4L, 5L,
1L, 1L, 2L, 6L, 2L, 7L, 2L, 4L, 4L, 3L, 3L, 4L, 10L, 5L, 6L,
10L, 2L, 5L, 5L, 0L, 1L, 6L, 2L, 5L, 4L, 6L, 6L, 9L, 5L, 5L,
6L, 3L, 8L, 1L, 5L, 1L, 8L, 5L, 2L, 5L, 2L, 4L, 2L, 4L, 4L)