r - read_sas cols_only: suppress "Evaluation error: Column 2 must be named"

Question

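I have a long list of very large SAS files that I want to import with read_sas. To improve speed and reduce memory usage, I only want to import the columns I'm interested in, using cols_only.

The problem is that I have a long list of possible column names, but not every column is present in every dataset. If I pass the full list to cols_only, I get this error: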

Evaluation error: Column 2 must be named.

Is there a way to suppress this error and have read_sas make a best effort to import whichever variables from my list are actually present?

score 4 · Accepted Answer

As @Andrew mentioned in their comment, with haven >= 2.2.0 you can use the new col_select argument for this. To select columns that may not exist, use the one_of() helper:

library(haven)
library(tidyselect)

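# Write a small example dataset to a temporary SAS file
f <- tempfile()
write_sas(mtcars, f)

# one_of() warns about names that are missing instead of raising an error
my_cols <- c("mpg", "i-don't-exist")
read_sas(f, col_select = one_of(my_cols))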
#> Warning: Unknown columns: `i-don't-exist`
#> # A tibble: 32 x 1
#>      mpg
#>    <dbl>
#>  1  21  
#>  2  21  
#>  3  22.8
#>  4  21.4
#>  5  18.7
#>  6  18.1
#>  7  14.3
#>  8  24.4
#>  9  22.8
#> 10  19.2
#> # ... with 22 more rows
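
For the original use case (many files, one long list of candidate columns), the same idea can be wrapped in a small helper. The following is only a minimal sketch under some assumptions: it uses any_of(), which is available in tidyselect >= 1.0.0 and skips missing names silently rather than warning; the names wanted and read_wanted and the two temporary files are hypothetical stand-ins for the real column list and datasets. In recent haven releases write_sas() may emit a deprecation warning; it is used here only to create test files.

```r
library(haven)
library(tidyselect)

# Hypothetical list of candidate columns; not every file contains all of them.
wanted <- c("mpg", "cyl", "i-don't-exist")

# Two throwaway files standing in for the real SAS datasets.
f1 <- tempfile(fileext = ".sas7bdat")
f2 <- tempfile(fileext = ".sas7bdat")
write_sas(mtcars, f1)                    # contains mpg and cyl
write_sas(mtcars[, c("mpg", "hp")], f2)  # contains mpg but not cyl

# Hypothetical helper: read only the wanted columns that exist in a file.
read_wanted <- function(path) {
  read_sas(path, col_select = any_of(wanted))
}

# One tibble per file, each holding whatever subset of `wanted` it has.
results <- lapply(c(f1, f2), read_wanted)
lapply(results, names)
```

If the warning about unknown columns is useful as a check, one_of() can be kept in place of any_of(); the rest of the sketch stays the same.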
