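I'm trying to write a tidyverse/dplyr function that I eventually want to use with lapply (or map). (I had been working on it to answer this question, but ran into an interesting result/dead end. Please don't mark this as a duplicate: this question is an extension of, and departure from, the answers you see there.)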
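Is there either
1) a way to get a list of quoted variables to work inside a dplyr function (without using the deprecated SE_ functions), or
2) some way to feed a list of unquoted strings through lapply or map?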
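I have used the Programming in Dplyr vignette to construct what I believe is a function most in line with the current standard for working with NSE.
Sample data: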
sample_data <-
read.table(text = "REVENUEID AMOUNT YEAR REPORT_CODE PAYMENT_METHOD INBOUND_CHANNEL AMOUNT_CAT
1 rev-24985629 30 FY18 S Check Mail 25,50
2 rev-22812413 1 FY16 Q Other Canvassing 0.01,10
3 rev-23508794 100 FY17 Q Credit_card Web 100,250
4 rev-23506121 300 FY17 S Credit_card Mail 250,500
5 rev-23550444 100 FY17 S Credit_card Web 100,250
6 rev-21508672 25 FY14 J Check Mail 25,50
7 rev-24981769 500 FY18 S Credit_card Web 500,1e+03
8 rev-23503684 50 FY17 R Check Mail 50,75
9 rev-24982087 25 FY18 R Check Mail 25,50
10 rev-24979834 50 FY18 R Credit_card Web 50,75
", header = TRUE, stringsAsFactors = FALSE)
Report-generating function:
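library(dplyr)  # %>%, group_by(), summarize(), rename(), mutate(), enquo()
library(purrr)  # map_df(), used in the attempts further down

report <- function(report_cat){
  # capture the bare column name passed by the caller
  report_cat <- enquo(report_cat)
  sample_data %>%
    group_by(!!report_cat, YEAR) %>%
    summarize(num=n(),total=sum(AMOUNT)) %>%
    rename(REPORT_VALUE = !!report_cat) %>%
    mutate(REPORT_CATEGORY := as.character(quote(!!report_cat))[2])
}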
This works fine for generating a single report:
> report(REPORT_CODE)
# A tibble: 7 x 5
# Groups:   REPORT_VALUE [4]
  REPORT_VALUE YEAR    num total REPORT_CATEGORY
  <chr>        <chr> <int> <int> <chr>
1 J            FY14      1    25 REPORT_CODE
2 Q            FY16      1     1 REPORT_CODE
3 Q            FY17      1   100 REPORT_CODE
4 R            FY17      1    50 REPORT_CODE
5 R            FY18      2    75 REPORT_CODE
6 S            FY17      2   400 REPORT_CODE
7 S            FY18      2   530 REPORT_CODE
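Everything falls apart when I try to set up a list of all 4 reports to generate. (Admittedly, the code required in the last line of the function, which returns a string that is then used to populate the column, should have been clue enough that I had gone in the wrong direction.)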
#the other reports
cat.list <- c("REPORT_CODE","PAYMENT_METHOD","INBOUND_CHANNEL","AMOUNT_CAT")
# Applying and Mapping attempts
lapply(cat.list, report)
map_df(cat.list, report)
Which results in:
> lapply(cat.list, report)
Error in (function (x, strict = TRUE) : the argument has already been evaluated
> map_df(cat.list, report)
Error in (function (x, strict = TRUE) : the argument has already been evaluated
I have also tried converting the list of strings to names before handing it to lapply and map:
library(rlang)
cat.names <- lapply(cat.list, sym)
lapply(cat.names, report)
map_df(cat.names, report)
> lapply(cat.names, report)
Error in (function (x, strict = TRUE) : the argument has already been evaluated
> map_df(cat.names, report)
Error in (function (x, strict = TRUE) : the argument has already been evaluated
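One call-site pattern that may be worth comparing against the attempts above, sketched under assumptions rather than offered as a confirmed fix: since the function captures its argument with enquo(), each string can be turned into a symbol and unquoted with !! in the call itself, so that enquo() receives an expression instead of an already-evaluated value. Here toy and report_sketch are hypothetical stand-ins for sample_data and report, and the sketch assumes dplyr 1.0 or later (for the .groups argument) and an rlang version that provides as_label().

```r
library(dplyr)
library(rlang)

# Hypothetical toy data standing in for sample_data (not the real rows)
toy <- data.frame(
  REPORT_CODE    = c("S", "Q", "Q", "S"),
  PAYMENT_METHOD = c("Check", "Other", "Credit_card", "Credit_card"),
  YEAR           = c("FY18", "FY16", "FY17", "FY17"),
  AMOUNT         = c(30, 1, 100, 300),
  stringsAsFactors = FALSE
)

# Hypothetical variant of report(): it keeps the enquo() interface, but
# labels the category with as_label() (an assumption, not the original line)
report_sketch <- function(report_cat) {
  report_cat <- enquo(report_cat)
  toy %>%
    group_by(!!report_cat, YEAR) %>%
    summarize(num = n(), total = sum(AMOUNT), .groups = "drop") %>%
    rename(REPORT_VALUE = !!report_cat) %>%
    mutate(REPORT_CATEGORY = as_label(report_cat))
}

cats <- c("REPORT_CODE", "PAYMENT_METHOD")

# Convert each string to a symbol and unquote it at the call site,
# then stack the per-category reports into one data frame
result <- bind_rows(lapply(cats, function(x) report_sketch(!!sym(x))))
print(result)
```

Whether this behaves the same way with the rlang version that produced the errors above is not something the sketch establishes.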
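In any case, the reason I'm asking this question is that I think I have written the function to the currently documented standards, yet in the end I can see no way to use it with a member of the apply family, or even a member of the purrr::map family.
Short of rewriting the function to use names, as userR has done here https://stackoverflow.com/a/47316151/5088194, is there any way to get this function to work with apply or map?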
I am hoping to see this as a result:
# A tibble: 27 x 5
# Groups:   REPORT_VALUE [16]
   REPORT_VALUE YEAR    num total REPORT_CATEGORY
   <chr>        <chr> <int> <int> <chr>
 1 J            FY14      1    25 REPORT_CODE
 2 Q            FY16      1     1 REPORT_CODE
 3 Q            FY17      1   100 REPORT_CODE
 4 R            FY17      1    50 REPORT_CODE
 5 R            FY18      2    75 REPORT_CODE
 6 S            FY17      2   400 REPORT_CODE
 7 S            FY18      2   530 REPORT_CODE
 8 Check        FY14      1    25 PAYMENT_METHOD
 9 Check        FY17      1    50 PAYMENT_METHOD
10 Check        FY18      2    55 PAYMENT_METHOD
# ... with 17 more rows