
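Suppose I want to work with hospital Medicare data showing the price of procedures by hospital and county. My data frame is called df, with columns price, procedure, and county. If I wanted to find the minimum and maximum price of each procedure by county, I could do something like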

library(plyr)
mostexpensive <- ddply(df, c('county', 'procedure'), function(x) x[which(x$price == max(x$price)), ])

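to get a table showing the hospital with the most expensive price for each procedure in each county. I can then see how many times each hospital is listed: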

summary(mostexpensive$hospital)

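For the last step, I would like to add a column to the original df data frame that is TRUE if the row is the most expensive and FALSE otherwise, but I can't figure out how to get a logical vector out of the plyr function. Thanks.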


2 Answers


Posting reproducible code would be useful. Try this anyway,

For the summary

pricey <- ddply(df, c('county', 'procedure'), summarise, most = max(price), less = min(price))

and for the logical indexing

testing <- ddply(df, c('county','procedure'), mutate, expensive = price == max(price))
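
As a rough illustration of what the two calls return, here is a minimal sketch on a made-up data frame. The hospital names, counties and prices are invented for the example, and it assumes plyr is installed.

```r
library(plyr)

# Hypothetical toy data; hospital names and prices are invented
df <- data.frame(
    hospital  = c("H1", "H2", "H3", "H4"),
    county    = c("A", "A", "B", "B"),
    procedure = c("x", "x", "x", "x"),
    price     = c(10, 20, 5, 7)
)

# One row per county/procedure group with its highest and lowest price
pricey <- ddply(df, c('county', 'procedure'), summarise,
                most = max(price), less = min(price))

# All original rows plus a logical column flagging the group maximum
testing <- ddply(df, c('county', 'procedure'), mutate,
                 expensive = price == max(price))
```

Note that if several rows in a group tie for the maximum price, each of them gets expensive = TRUE.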
answered 2013-05-21T00:18:38.670

It is easier to get an answer if you provide a reproducible example. Keep that in mind the next time you ask for help on SO.

That being said, you can use the transform function to add a new column to your existing data.

The first step is to create a toy data set.

set.seed(123)
df <- data.frame(
    county = sample(LETTERS[1:3], size = 20, replace = TRUE),
    procedure = sample(c(1, 2), size = 20, replace = TRUE),
    price = rpois(20, 10)
)

str(df)
## 'data.frame':    20 obs. of  3 variables:
##  $ county   : Factor w/ 3 levels "A","B","C": 1 3 2 3 3 1 2 3 2 2 ...
##  $ procedure: num  2 2 2 2 2 2 2 2 1 1 ...
##  $ price    : int  6 8 6 8 4 6 6 8 5 12 ...

Now we can use plyr and the transform function

require(plyr)
expensive <- ddply(df, .(county, procedure),
                   transform, ismax = price == max(price))


expensive
##    county procedure price ismax
## 1       A         1     9 FALSE
## 2       A         1     7 FALSE
## 3       A         1    12  TRUE
## 4       A         2     6 FALSE
## 5       A         2     6 FALSE
## 6       A         2     8  TRUE
## 7       B         1     5 FALSE
## 8       B         1    12  TRUE
## 9       B         2     6 FALSE
## 10      B         2     6 FALSE
## 11      B         2    12  TRUE
## 12      B         2    11 FALSE
## 13      C         1     9  TRUE
## 14      C         1     9  TRUE
## 15      C         2     8 FALSE
## 16      C         2     8 FALSE
## 17      C         2     4 FALSE
## 18      C         2     8 FALSE
## 19      C         2    12  TRUE
## 20      C         2    12  TRUE
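
Note that ddply returns the rows sorted by group, as in the output above. If the flag should instead be attached to df in its original row order, one possible alternative (not part of the plyr approach in this answer) is base R's ave(). The sketch below uses a small invented data frame.

```r
# Hypothetical toy data, invented for illustration
df <- data.frame(
    county    = c("A", "B", "A", "B", "A", "B"),
    procedure = c(1, 1, 1, 1, 2, 2),
    price     = c(9, 5, 12, 12, 6, 8)
)

# ave() returns each row's group maximum, aligned with the original rows,
# so the comparison gives a logical vector in df's own row order
df$ismax <- df$price == ave(df$price, df$county, df$procedure, FUN = max)
```

FUN has to be passed by name here, because ave() treats every unnamed argument after the first as a grouping variable.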
answered 2013-05-21T00:19:08.177