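I'm trying to write a simple wrapper that will `summarise()` arbitrary variables by arbitrary groups. I have made progress now that I have the correct library version loaded, but I am (again) confused about how to unquote arguments that hold multiple values.
I currently have the following function...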
table_summary <- function(df = .,
                          id = individual_id,
                          select = c(),
                          group = site,
                          ...){
    ## Quote all arguments (see http://dplyr.tidyverse.org/articles/programming.html)
    quo_id <- enquo(id)
    quo_select <- enquo(select)
    quo_group <- enquo(group)
    ## Subset the data
    df <- df %>%
        dplyr::select(!!quo_id, !!quo_select, !!quo_group) %>%
        unique()
    ## gather() data, just in case there is > 1 variable selected to be summarised
    df <- df %>%
        gather(key = variable, value = value, !!quo_select)
    ## Summarise selected variables by specified groups
    results <- df %>%
        group_by(!!quo_group, variable) %>%
        summarise(n = n(),
                  mean = mean(value, na.rm = TRUE))
    return(results)
}
This gets me most of the way there and works if I specify a single grouping variable...
> table_summary(df = mtcars, id = model, select = c(mpg), group = gear)
# A tibble: 3 x 4
# Groups: c(gear) [?]
   gear variable     n     mean
  <dbl>    <chr> <int>    <dbl>
1     3      mpg    15 16.10667
2     4      mpg    12 24.53333
3     5      mpg     5 21.38000
...but it fails at `group_by(!!quo_group, variable)` when I specify more than one grouping variable, e.g. `group = c(gear, hp)`...
> mtcars$model <- rownames(mtcars)
> table_summary(df = mtcars, id = model, select = c(mpg), group = c(gear, hp))
Error in mutate_impl(.data, dots) :
Column `c(gear, hp)` must be length 32 (the group size) or one, not 64
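The 64 in the error message looks consistent with `c(gear, hp)` being evaluated as one expression that concatenates the two columns, rather than being treated as two separate grouping variables. A minimal base R sketch of that reading (this is my assumption about what is happening, not something I have confirmed in the dplyr source):

```r
# Evaluate c(gear, hp) against the data, as a single unquoted expression would be
combined <- with(mtcars, c(gear, hp))

# The concatenated vector is twice as long as the data has rows
print(length(combined))
print(nrow(mtcars))
```

If that is right, a single `!!quo_group` can only ever hand `group_by()` one expression, however many variables appear inside `c()`.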
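I went back and re-read the Programming with dplyr documentation, where I read that you can capture multiple variables with `quos()` instead of `enquo()` and then unquote-splice them with `!!!`, so I tried...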
table_summary <- function(df = .,
                          id = individual_id,
                          select = c(),
                          group = c(),
                          digits = 3,
                          ...){
    ## Quote all arguments (see http://dplyr.tidyverse.org/articles/programming.html)
    quo_id <- enquo(id)
    quo_select <- enquo(select)
    quo_group <- quos(group) ## Use quos() rather than enquo()
    UQS(quo_group) %>% print() ## Check to see what quo_group holds
    ## Subset the data
    df <- df %>%
        dplyr::select(!!quo_id, !!quo_select, !!!quo_group) %>%
        unique()
    ## gather() data, just in case there is > 1 variable selected to be summarised
    df <- df %>%
        gather(key = variable, value = value, !!quo_select)
    ## Summarise selected variables by specified groups
    results <- df %>%
        group_by(!!!quo_group, variable) %>%
        summarise(n = n(),
                  mean = mean(value, na.rm = TRUE))
    return(results)
}
...which now fails the first time `!!!quo_group` is referenced, within `dplyr::select()`, regardless of how many variables are specified under `group = `...
> table_summary(df = mtcars, id = model, select = c(mpg), group = c(gear))
[[1]]
<quosure: frame>
~group
attr(,"class")
[1] "quosures"
Error in overscope_eval_next(overscope, expr) : object 'gear' not found
> traceback()
17: .Call(rlang_eval, f_rhs(quo), overscope)
16: overscope_eval_next(overscope, expr)
15: FUN(X[[i]], ...)
14: lapply(.x, .f, ...)
13: map(.x[matches], .f, ...)
12: map_if(ind_list, !is_helper, eval_tidy, data = names_list)
11: select_vars(names(.data), !(!(!quos(...))))
10: select.data.frame(., !(!quo_id), !(!quo_select), !(!(!quo_group)))
9: dplyr::select(., !(!quo_id), !(!quo_select), !(!(!quo_group)))
8: function_list[[i]](value)
7: freduce(value, `_function_list`)
6: `_fseq`(`_lhs`)
5: eval(quote(`_fseq`(`_lhs`)), env, env)
4: eval(quote(`_fseq`(`_lhs`)), env, env)
3: withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
2: df %>% dplyr::select(!(!quo_id), !(!quo_select), !(!(!quo_group))) %>%
unique()
1: table_summary(df = mtcars, id = model, select = c(mpg), group = c(gear))
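What seems strange, and what I think is the source of the problem, is that `!!!quo_group` (i.e. `UQS(quo_group)`) prints `~group` rather than a list of quosures for the variables I supplied, which is what adding a `print()` to the worked example from the documentation shows...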
> my_summarise <- function(df, ...) {
    group_by <- quos(...)
    UQS(group_by) %>% print()
    df %>%
        group_by(!!!group_by) %>%
        summarise(a = mean(a))
  }
> df <- tibble(
    g1 = c(1, 1, 2, 2, 2),
    g2 = c(1, 2, 1, 2, 1),
    a = sample(5),
    b = sample(5)
  )
> my_summarise(df, g1, g2)
[[1]]
<quosure: global>
~g1
[[2]]
<quosure: global>
~g2
attr(,"class")
[1] "quosures"
# A tibble: 4 x 3
# Groups: g1 [?]
     g1    g2     a
  <dbl> <dbl> <dbl>
1     1     1   1.0
2     1     2   5.0
3     2     1   2.5
4     2     2   4.0
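Comparing the two printed results, the difference may come down to what `quos()` actually quotes. The sketch below uses three hypothetical helpers (`capture_with_enquo`, `capture_with_quos` and `capture_dots`, named only for illustration) and suggests that `quos(...)` forwards whatever the caller typed, whereas `quos(x)` on a named argument quotes the symbol `x` itself, and `enquo(x)` is what looks through the argument to the caller's expression.

```r
library(rlang)

# Hypothetical helpers, for illustration only
capture_with_enquo <- function(x) enquo(x)     # captures the caller's expression
capture_with_quos  <- function(x) quos(x)      # quotes the literal symbol `x`
capture_dots       <- function(...) quos(...)  # forwards the caller's expressions

expr_enquo <- quo_get_expr(capture_with_enquo(gear))
expr_quos  <- quo_get_expr(capture_with_quos(gear)[[1]])
exprs_dots <- lapply(capture_dots(g1, g2), quo_get_expr)

print(expr_enquo)  # the symbol supplied by the caller
print(expr_quos)   # the argument's own name, not the caller's symbol
print(exprs_dots)  # one expression per argument passed through `...`
```

If this is the right reading, `quos(group)` inside my function would always produce a quosure of the symbol `group`, which would match the `~group` printed above.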
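I would like to supply the variables I want to group by explicitly, as a named argument to my function, rather than through `...`. Even so, I decided to test whether my function works when the grouping variables are supplied as `...`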
table_summary <- function(df = .,
                          id = individual_id,
                          select = c(),
                          group = c(),
                          digits = 3,
                          ...){
    ## Quote all arguments (see http://dplyr.tidyverse.org/articles/programming.html)
    quo_id <- enquo(id)
    quo_select <- enquo(select)
    ## quo_group <- quos(group)
    quo_group <- quos(...)
    UQS(quo_group) %>% print()
    ## Subset the data
    df <- df %>%
        dplyr::select(!!quo_id, !!quo_select, !!!quo_group) %>%
        unique()
    ## gather() data, just in case there is > 1 variable selected to be summarised
    df <- df %>%
        gather(key = variable, value = value, !!quo_select)
    ## Summarise selected variables by specified groups
    results <- df %>%
        group_by(!!!quo_group, variable) %>%
        summarise(n = n(),
                  mean = mean(value, na.rm = TRUE))
    return(results)
}
...but it doesn't: `quos()` again unquote-splices to `NULL`, so the variables are neither selected nor grouped by...
> table_summary(df = mtcars, id = model, select = c(mpg), gear, hp)
NULL
# A tibble: 1 x 3
  variable     n     mean
     <chr> <int>    <dbl>
1      mpg    32 20.09062
> table_summary(df = mtcars, id = model, select = c(mpg), gear)
NULL
# A tibble: 1 x 3
  variable     n     mean
     <chr> <int>    <dbl>
1      mpg    32 20.09062
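One possible explanation for the `NULL`, which I have not confirmed, is ordinary R argument matching: because `group` and `digits` come before `...` in the signature, the unnamed `gear` and `hp` may be matched positionally to those two arguments, leaving `...` empty. The sketch below uses a hypothetical stand-in, `show_matching`, with the same formals as my function (minus the odd `df = .` default) to inspect the matching without evaluating anything.

```r
# Hypothetical stand-in with the same argument order as table_summary()
show_matching <- function(df, id, select = c(), group = c(), digits = 3, ...) {
  mc <- match.call()  # the call with every argument matched to a formal name
  list(
    matched = names(as.list(mc))[-1],  # formals that received a value
    n_dots  = ...length()              # number of arguments left in `...`
  )
}

res <- show_matching(df = mtcars, id = model, select = c(mpg), gear, hp)
print(res)
```

Under that assumption, `quos(...)` has nothing to capture in the two calls above, which would explain why the output is not grouped at all.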
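I've been round this cycle several times now and have checked each way of using `enquo()` and `quos()`, but I can't see where I'm going wrong despite having read the Programming with dplyr documentation multiple times.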