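The following example from the documentation of tidyr::expand() works as expected on a local, in-memory data.frame.
The same example also works on a tbl_dbi object when called on a single column, but it fails with an SQL parser error when applied to more than one column: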
library(tidyverse, quietly = TRUE)
library(duckdb, quietly = TRUE)
fruits <- tibble(
  type = c("apple", "orange", "apple", "orange", "orange", "orange"),
  year = c(2010, 2010, 2012, 2010, 2010, 2012),
  size = factor(
    c("XS", "S", "M", "S", "S", "M"),
    levels = c("XS", "S", "M", "L")
  ),
  weights = rnorm(6, as.numeric(size) + 2)
)
fruits %>% tidyr::expand(type, size)
#> # A tibble: 8 × 2
#>   type   size
#>   <chr>  <fct>
#> 1 apple  XS
#> 2 apple  S
#> 3 apple  M
#> 4 apple  L
#> 5 orange XS
#> 6 orange S
#> 7 orange M
#> 8 orange L
## now create a remote tbl
con <- dbConnect(duckdb::duckdb())
dbWriteTable(con, "fruits", fruits)
fruits_db <- tbl(con, "fruits")
## same command works with one column:
fruits_db %>% tidyr::expand(type)
#> # Source: lazy query [?? x 1]
#> # Database: duckdb_connection
#>   type
#>   <chr>
#> 1 apple
#> 2 orange
## but fails with two:
fruits_db %>% tidyr::expand(type, size)
#> Error in .local(conn, statement, ...): duckdb_prepare_R: Failed to prepare query SELECT *
#> FROM (SELECT "type", "size"
#> FROM (SELECT DISTINCT "type"
#> FROM "fruits") "LHS"
#> LEFT JOIN (SELECT DISTINCT "size"
#> FROM "fruits") "RHS"
#> ) "q01"
#> LIMIT 11
#> Error: Parser Error: syntax error at or near ")"
#> LINE 7: ) "q01"
#> ^
Created on 2021-10-01 by the reprex package (v2.0.1)
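Why does this happen, and how can it be avoided? (It looks to me as if the SQL generated by the tidyr translation has a spurious "q01" appended at the end, but I don't know why that would be, or why it only occurs when expanding on two columns.)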
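In case it helps anyone hitting the same error: the generated statement above contains a LEFT JOIN with no ON clause, so one possible workaround is to build the combinations by hand with a join that does have an explicit key. The sketch below is only that, a sketch: it uses a smaller stand-in for `fruits`, and the constant column `join_key` is a hypothetical helper name I made up, not anything tidyr or dbplyr provides.

```r
library(dplyr, warn.conflicts = FALSE)
library(DBI)

# Small stand-in table (not the full `fruits` data from the question)
con <- dbConnect(duckdb::duckdb())
fruits <- tibble(
  type = c("apple", "orange", "apple"),
  size = c("XS", "S", "M")
)
dbWriteTable(con, "fruits", fruits)
fruits_db <- tbl(con, "fruits")

# `join_key` is a hypothetical constant column: joining on it pairs every
# distinct type with every distinct size, and gives the join an ON clause
types <- fruits_db %>% distinct(type) %>% mutate(join_key = 1L)
sizes <- fruits_db %>% distinct(size) %>% mutate(join_key = 1L)

combos <- inner_join(types, sizes, by = "join_key") %>%
  select(type, size) %>%
  arrange(type, size) %>%
  collect()

dbDisconnect(con, shutdown = TRUE)
```

Note that this only combines values that actually occur in the table. As far as I can tell, an unused factor level such as "L" in the local example would not show up, because the database only sees the distinct values that are stored.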
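I suspect this is a bug, but if so I'm not sure how best to pinpoint its source, i.e. whether the problem is caused by a bug in duckdb, dbplyr, tidyr, or some other component of the translation?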
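One way to narrow this down might be to look at the SQL without involving duckdb at all. The sketch below assumes that dbplyr's simulated backend (`lazy_frame()` with `simulate_dbi()`) goes through the same translation path as a real connection; the table contents are placeholder values. If the rendered SQL already shows a join without a condition here, that would point at the translation layer rather than at duckdb, though this is an inference and not something I have confirmed.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)

# A lazy table backed by a simulated connection: no database is contacted
fruits_lazy <- lazy_frame(
  type = c("apple", "orange"),
  size = c("XS", "S"),
  con = simulate_dbi()
)

# Build the same expand() call as in the question, but only render the SQL
query <- fruits_lazy %>% tidyr::expand(type, size)
generated_sql <- as.character(sql_render(query))
cat(generated_sql, "\n")
```

The exact text printed will depend on the installed dbplyr version, so comparing the output across versions may also show whether the behaviour has changed.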