r - Is there a way to assign col_types by column name in read_excel (readxl) in R?

Question

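My application reads xls and xlsx files using the read_excel function from the readxl package.

When reading an xls or xlsx file, the order and exact number of columns are not known in advance. There are 15 predefined columns, of which 10 are mandatory and the remaining 5 are optional. So the file will always have a minimum of 10 columns and a maximum of 15.

I need to specify col_types for the 10 mandatory columns. The only way I can think of is to specify col_types by column name, since I know the file contains all 10 mandatory columns, but they can appear in any order.

I tried looking for a way to do this but did not find one.

Can anyone help me find a way to assign col_types by column name?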

score 1 · Accepted Answer

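I solved this with the following workaround, although it is not the best way to solve the problem. I read the Excel file twice, which will affect performance if the file contains a large amount of data.

First read: build the vector of column data types. Read the file to retrieve the column information (column names, number of columns, and their types), then build column_data_types, a vector that holds the data type of every column in the file.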

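library(readxl)

# First read: load the .xlsx file only to discover its column names
site_data_columns <- read_excel(paste(File$datapath, ".xlsx", sep = ""))

site_data_column_names <- colnames(site_data_columns)

# Create the vector before filling it; assigning by index to an
# undefined object raises "object 'column_data_types' not found"
column_data_types <- character(length(site_data_column_names))

for (i in seq_along(site_data_column_names)) {
    if (site_data_column_names[i] == "date") {
        # "date" is a column name
        column_data_types[i] <- "date"
    } else if (site_data_column_names[i] == "result") {
        # "result" is a column name
        column_data_types[i] <- "numeric"
    } else {
        # every other column is read as text
        column_data_types[i] <- "text"
    }
}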
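
As a possible alternative to the if/else chain, the name-to-type mapping could be kept in a named character vector and applied to all column names at once. This is only a minimal sketch in base R: `known_types`, `types_for_columns`, and the example column names are hypothetical, not part of readxl or of the original answer.

```r
# Hypothetical lookup table: column name -> value accepted by col_types
known_types <- c(date = "date", result = "numeric")

# Return one type per column name, in the same order as `column_names`.
# Names that are not in `lookup` fall back to `default`.
types_for_columns <- function(column_names, lookup = known_types, default = "text") {
  types <- unname(lookup[column_names])  # NA for names missing from lookup
  types[is.na(types)] <- default
  types
}

# Example column names in an arbitrary order (made up for illustration)
example_names <- c("site", "result", "date", "comment")
column_data_types <- types_for_columns(example_names)
print(column_data_types)
```

Because the result follows the order of the names passed in, it can be used as the col_types argument regardless of how the columns are arranged in the file, and adding one of the other mandatory columns only means adding one entry to the lookup vector.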

Second read: read the file contents. Read the Excel file again, passing the column_data_types vector of column data types as the col_types argument.

#reading .xlsx file
site_data <- read_excel(paste(File$datapath, ".xlsx", sep = ""), col_types = column_data_types)
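
The cost of reading the file twice might be reduced by reading only the header row on the first pass, using the n_max argument of read_excel. The sketch below assumes that n_max = 0 returns a zero-row result that still carries the column names. `read_excel_typed` and its `lookup` argument are hypothetical names, and the workbook is the example file bundled with readxl rather than the file from the question.

```r
library(readxl)

# Hypothetical helper: choose col_types by column name, then read the data.
read_excel_typed <- function(path, lookup, default = "text") {
  # First pass: n_max = 0 asks for no data rows, only the header
  column_names <- colnames(read_excel(path, n_max = 0))

  # Map names to types; columns missing from `lookup` use `default`
  column_types <- unname(lookup[column_names])
  column_types[is.na(column_types)] <- default

  # Second pass: full read with one type per column, in file order
  read_excel(path, col_types = column_types)
}

# Example workbook shipped with readxl (first sheet holds the iris data)
iris_data <- read_excel_typed(
  readxl_example("datasets.xlsx"),
  lookup = c(Sepal.Length = "numeric", Species = "text")
)
```

The file is still opened twice, but the first pass no longer parses the data rows, which is where most of the time goes for a large sheet.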
