2

我正在尝试通过管道传递到filter()命令来过滤数据库支持的 tibble,并且正在观察意外行为:

如果我使用 过滤filter(pos == variable),无论我为变量分配什么值,我都会得到相同的结果。但是,使用值进行过滤,例如filter(pos == 12345)有效 - 对于我过滤的每个不同值,结果都会发生变化。

这是惰性评估还是 tidyeval 的一个方面?filter()使用变量的 DB 支持的 tibble的正确方法是什么?

这是一个可重现的示例:

library(dplyr)

con <- DBI::dbConnect(RSQLite::SQLite(), path = ":dbname:")

ex_data <- tibble(
  pos = c(10510138, 10510507),
  ref = c("CATA", "TCA"),
  alt = c("C", "T")
)

copy_to(con, ex_data, "variants", temporary = FALSE)

toQueryDB <- tbl(con, "variants")

pos = 10510138
(result <- toQueryDB %>% filter(pos == pos)  %>% select(pos, ref, alt) %>% head(1))
# 10510138 CATA  C 

pos = 10510507
(result <- toQueryDB %>% filter(pos == pos)  %>% select(pos, ref, alt) %>% head(1))
# STILL 10510138 CATA  C !!!

(result <- toQueryDB %>% filter(pos == 10510138)  %>% select(pos, ref, alt) %>% head(1))
# 10510138 CATA  C

(result <- toQueryDB %>% filter(pos == 10510507)  %>% select(pos, ref, alt) %>% head(1))
# 10510507 TCA   T 

DBI::dbDisconnect(con)

我的会话信息:

> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.7.6

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.18     rstudioapi_0.7   bindr_0.1.1      magrittr_1.5     tidyselect_0.2.4 bit_1.1-14       R6_2.2.2        
 [8] rlang_0.2.2      fansi_0.3.0      blob_1.1.1       tools_3.5.1      utf8_1.1.4       cli_1.0.1        DBI_1.0.0       
[15] dbplyr_1.2.2     yaml_2.2.0       bit64_0.9-7      assertthat_0.2.0 digest_0.6.17    tibble_1.4.2     crayon_1.3.4    
[22] bindrcpp_0.2.2   purrr_0.2.5      memoise_1.1.0    glue_1.3.0       RSQLite_2.1.1    compiler_3.5.1   pillar_1.3.0    
[29] pkgconfig_2.0.2 
4

1 回答 1

3

当你有

filter(pos == pos)

dplyr 不会试图找出哪个pos是哪个,它假设它们都来自通过管道传输到函数中的数据。如果要从传入的数据之外注入变量的值,则需要使用 bang-bang rlang 运算符 ( !!)。你应该使用

filter(pos == !!pos)
于 2018-10-01T20:25:32.910 回答