Suppose I have a data frame df1 like this:
+------+------+--------+
| ID   | Type | Points |
+------+------+--------+
| DJ45 | A    | 69.2   |
| DJ45 | F    | 60.8   |
| DJ45 | C    | 2.9    |
| DJ46 | B    | 22.7   |
| DJ46 | D    | 18.7   |
| DJ46 | A    | 16.1   |
| DJ47 | E    | 67.2   |
| DJ47 | C    | 63.1   |
| DJ47 | F    | 16.7   |
| DJ48 | D    | 8.4    |
+------+------+--------+
I want a result that gives the top 2 Type values for each ID (ranked by Points), in the following format:
Desired output:
+------+---------+---------+
| ID | Type1 | Type2 |
+------+---------+---------+
| DJ45 | A | F |
| DJ46 | B | D |
| DJ47 | E | C |
| DJ48 | D | NA |
+------+---------+---------+
I tried:
df1 %>%
group_by(ID) %>%
top_n(2,wt=Points) %>%
mutate(val = paste("Type", row_number())) %>%
filter(row_number()<=2) %>%
select(-Points) %>%
spread(val, Type)
But I get the following result instead:
Actual output:
+------+------+--------+---------+
| ID |Points|Type1 | Type2 |
+------+------+--------+---------+
| DJ45 | 69.2 | A | NA |
| DJ45 | 60.8 | NA | F |
| DJ46 | 22.7 | B | NA |
| DJ46 | 18.7 | NA | D |
| DJ47 | 67.2 | E | NA |
| DJ47 | 63.1 | NA | C |
| DJ48 | 8.4 | D | NA |
+------+------+--------+---------+
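
For reference, the actual output still contains the Points column, which suggests Points was still present when spread() ran. spread() treats every remaining column as a row identifier, so two rows with different Points values cannot collapse into one row per ID. The following is a minimal sketch of one possible variant, not a definitive fix: it rebuilds df1 from the table above, sorts explicitly by Points (top_n() does not reorder rows), drops Points before spreading, and uses paste0() so the column names come out as Type1 and Type2 without a space. The name `result` is introduced here only for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the df1 table in the question
df1 <- data.frame(
  ID     = c("DJ45", "DJ45", "DJ45", "DJ46", "DJ46", "DJ46",
             "DJ47", "DJ47", "DJ47", "DJ48"),
  Type   = c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D"),
  Points = c(69.2, 60.8, 2.9, 22.7, 18.7, 16.1, 67.2, 63.1, 16.7, 8.4),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  arrange(ID, desc(Points)) %>%                  # highest Points first within each ID
  group_by(ID) %>%
  filter(row_number() <= 2) %>%                  # keep at most two rows per ID
  mutate(val = paste0("Type", row_number())) %>% # "Type1", "Type2" (no space)
  ungroup() %>%
  select(-Points) %>%                            # Points must be gone before spreading
  spread(val, Type)                              # one row per ID; missing cells become NA

print(result)
```

With this data, each ID should end up on a single row, and DJ48, which has only one Type, gets NA in Type2.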