r - Rank by category in R

Question

I have a data frame whose rows I want to rank within each Category according to PCC.

> head(newdf)
            ItemId    Category PCC
1       5063660193 Go to Gifts   2
2   24154563660193 Go to Gifts   1
2.1 24154563660193   All Gifts   1
3   26390063660193 Go to Gifts   3
3.1 26390063660193   All Gifts   3
4         18700100 Go to Gifts   1

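I initially thought of using the sqldf package for this, but unfortunately one of sqldf's dependencies (tcltk) is not available for R version 3.0.2.

With sqldf, a call similar to the following should do the job: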

# ranking by category
rank <- sqldf("select 
                 nf.ItemId,
                 nf.Category,
                 nf.PCC,
                 rank() over(Partition by nf.Category order by nf.PCC, nf.ItemId, nf.Category) as Ranks

               from 
                 newdf as nf

               order by 
                 nf.Category,
                 Ranks")

Do you know of any alternative I could use?

score 2 · Accepted Answer

Here are just a few of the different ways to do this:

dat <- read.table(text = "            ItemId    Category PCC
       5063660193 'Go to Gifts'   2
   24154563660193 'Go to Gifts'   1
 24154563660193   'All Gifts'   1
   26390063660193 'Go to Gifts'   3
 26390063660193   'All Gifts'   3
         18700100 'Go to Gifts'   1",header = TRUE,sep = "")

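# plyr: split by Category, add a rank column, recombine
library(plyr)
ddply(dat,.(Category),transform, val = rank(PCC))

# dplyr: group by Category, then add the rank within each group
library(dplyr)
mutate(group_by(dat,Category),val = rank(PCC))

# data.table: key on Category and add the rank column by reference
library(data.table)
dat1 <- data.table(dat)
setkey(dat1,Category)
dat1[,val := rank(PCC),by = key(dat1)]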

Also, I can load sqldf just fine on R 3.0.2, so I'm not sure what your problem is.

These all use rank. See ?rank and its ties.method argument to customize the ranking to your exact needs.
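To make the effect of ties.method concrete, here is a minimal sketch using base R's ave() on the same sample data. The column names rank_avg, rank_min and rank_first are made up for illustration, and ave() is just one more grouping option, not something the approaches above rely on. Within 'Go to Gifts' two rows share PCC = 1, so the three methods give different results.

```r
# Same sample data as in the answer above
dat <- read.table(text = "ItemId Category PCC
5063660193 'Go to Gifts' 2
24154563660193 'Go to Gifts' 1
24154563660193 'All Gifts' 1
26390063660193 'Go to Gifts' 3
26390063660193 'All Gifts' 3
18700100 'Go to Gifts' 1", header = TRUE)

# Default ("average"): tied rows share the mean of the ranks they span
dat$rank_avg <- ave(dat$PCC, dat$Category, FUN = rank)

# "min": tied rows share the lowest rank, comparable to SQL RANK()
dat$rank_min <- ave(dat$PCC, dat$Category,
                    FUN = function(x) rank(x, ties.method = "min"))

# "first": ties are broken by order of appearance, so ranks are unique
dat$rank_first <- ave(dat$PCC, dat$Category,
                      FUN = function(x) rank(x, ties.method = "first"))

dat
```

If ties should instead be broken by ItemId, as in the ORDER BY of the sqldf query in the question, one option is to sort the data frame by ItemId before ranking with ties.method = "first".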
