
My dataset is y. It has an ID column and a Sales column. I want to add a third column containing each employee's percentile based on their sales.

The formula for the percentile is:

Percentile Employee(i) = (Number of employees with less sales)/(Total employees-1)

Thanks


2 Answers


Using your formula, consider the following solution, shown on fake data:

# fake data
y <- data.frame(
  # 20 fake ids
  id = seq(1, 20),
  # 20 fake sales between 10000 and 15000
  sales = runif(20, 10000, 15000))

# total number of employees
emp_cnt <- length(y$id)
# rank the sales; with ties.method = "min", tied employees share the lowest rank,
# so rank - 1 is the number of employees with strictly lower sales
y$rank <- rank(y$sales, ties.method = "min")
# subtract one from each rank and divide by emp_cnt minus one
y$percentile <- (y$rank - 1) / (emp_cnt - 1)
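
As a sanity check, the rank-based shortcut can be compared against a literal reading of the formula in the question. The sketch below is only illustrative: the helper name `percentile_direct` and the small `sales` vector are made up for this example and are not part of the answer.

```r
# Hypothetical helper: literal implementation of the question's formula,
# (number of employees with strictly lower sales) / (total employees - 1).
percentile_direct <- function(sales) {
  n_lower <- vapply(sales, function(s) sum(sales < s), integer(1))
  n_lower / (length(sales) - 1)
}

# Made-up sales figures, including a tie, to exercise ties.method = "min"
sales <- c(300, 100, 200, 200, 400)

direct  <- percentile_direct(sales)
by_rank <- (rank(sales, ties.method = "min") - 1) / (length(sales) - 1)

# Both approaches should agree, ties included
all.equal(direct, by_rank)
```

The direct version compares every employee with every other one, so it is quadratic in the number of rows; the `rank()` version should scale better on large data sets.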
Answered 2013-09-14T22:14:10.377

Use this:

within(y[order(y$sales), ], p <- with(rle(sales), rep(c(0, head(cumsum(lengths), -1)), lengths))/(length(ID)-1))

Sample output:

   ID sales         p
4   4     3 0.0000000
6   6     3 0.0000000
11 11     3 0.0000000
19 19     3 0.0000000
20 20     3 0.0000000
3   3     4 0.2631579
13 13     4 0.2631579
17 17     4 0.2631579
18 18     4 0.2631579
2   2     5 0.4736842
8   8     5 0.4736842
10 10     5 0.4736842
12 12     5 0.4736842
16 16     5 0.4736842
9   9     6 0.7368421
5   5     7 0.7894737
7   7     7 0.7894737
15 15     7 0.7894737
1   1     8 0.9473684
14 14     8 0.9473684

Data used:

   ID sales
1   1     8
2   2     5
3   3     4
4   4     3
5   5     7
6   6     3
7   7     7
8   8     5
9   9     6
10 10     5
11 11     3
12 12     5
13 13     4
14 14     8
15 15     7
16 16     5
17 17     4
18 18     4
19 19     3
20 20     3
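
For readers who find the one-liner dense, the same computation can be unpacked step by step. This is only a sketch of what the expression does, rebuilt on the data shown above; the intermediate names `y_sorted`, `runs` and `n_lower` are introduced here for illustration.

```r
# Data copied from the table above
y <- data.frame(
  ID    = 1:20,
  sales = c(8, 5, 4, 3, 7, 3, 7, 5, 6, 5, 3, 5, 4, 8, 7, 5, 4, 4, 3, 3)
)

# 1. Sort by sales so that equal values sit next to each other
y_sorted <- y[order(y$sales), ]

# 2. Run-length encode: one run per distinct sales value
runs <- rle(y_sorted$sales)

# 3. For each run, count the employees that come before it,
#    i.e. those with strictly lower sales
n_lower <- c(0, head(cumsum(runs$lengths), -1))

# 4. Repeat that count for every employee in the run and apply the formula
y_sorted$p <- rep(n_lower, runs$lengths) / (nrow(y_sorted) - 1)

y_sorted
```

Sorting first is what makes `rle()` usable here: it only merges adjacent equal values, so on unsorted data the runs would not correspond to groups of tied employees.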
Answered 2013-09-14T22:29:22.490