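I am combining multiple Excel files, each containing one or more worksheets. The worksheets all have different columns. I am only interested in combining the worksheets that contain address information. For worksheets without address information, I need to note that in the resulting combined file. Where I run into trouble is when one worksheet has vegetables but no address and another has address information. I am using the code below to put them together. Once I have this working, I will standardize the columns and put everything in place.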

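library(readxl)  # excel_sheets(), read_excel()
library(dplyr)   # %>%, mutate(), select()
library(purrr)   # set_names(), map_df()

dir_path <- "C:/temp/ConsigneeList/stuff4/"               # target directory where the Excel files are located
file.list <- list.files(dir_path, pattern = "\\.xlsx?$")  # regex matching names that end in .xls or .xlsx, e.g. 'test1.xlsx', 'test2.xlsx'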

read_sheets <- function(dir_path, file){
  xls_file <- paste0(dir_path, file)
  xls_file %>%
    excel_sheets() %>%
    set_names() %>%
    map_df(read_excel, path = xls_file, .id = 'sheet_name') %>% 
    mutate(file_name = file) %>% 
    select(file_name, sheet_name, everything())
}

number_of_excel_files <- length(file.list)
mybiggerlist <- vector('list', number_of_excel_files)
for (i in seq_along(file.list)) {
  # read every sheet of the i-th workbook into one combined data frame
  mybiggerlist[[i]] <- read_sheets(dir_path, file.list[i])
}

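I get the error message: Error: Can't combine `Customer Quick REF$Order No` and `CH Belt$Order No`. I tried adding %>% mutate_all(as.character), because the columns should all be character anyway. Any ideas on how to fix this? Alternatively, is there a way to skip importing the problematic data and show in a single row that the worksheet had a problem? Thanks!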

1 Answer

Try something like this:

dir_path <- "C:/temp/ConsigneeList/stuff4/"               # target directory where the Excel files are located
file.list <- list.files(dir_path, pattern = "\\.xlsx?$")  # regex matching names that end in .xls or .xlsx, e.g. 'test1.xlsx', 'test2.xlsx'

read_sheets <- function(dir_path, file){
  xls_file <- paste0(dir_path, file)
  sheets <- xls_file %>%
    excel_sheets() %>%
    set_names() %>% # names each element after its sheet, so bind_rows(.id = ) below can label the rows
    map(read_excel, path = xls_file)

  # Now we have all the sheets in a list.
  # Time to figure out which ones to combine.
  # Use purrr::keep to only keep sheets that meet some condition.
  # I just put in a wild guess; edit the test so that only the sheets
  # you want are kept.
  sheets <- purrr::keep(sheets, ~ "Address" %in% names(.))

  bind_rows(sheets, .id = 'sheet_name') %>%
    mutate(file_name = file) %>%
    select(file_name, sheet_name, everything())
}
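
Filtering on the column name alone may not be enough. If two of the kept sheets store `Order No` with different types, `bind_rows()` can still fail with the same "Can't combine" error. The sketch below is one possible way to handle both parts of the question, not a tested drop-in. It converts every column to character before binding, and it adds one placeholder row for each sheet that has no address column. The helper name `combine_address_sheets`, the `note` column, and the sample data are all made up for illustration. The `"Address"` column name is still the same guess as above. The function takes the named list of sheets that `map(read_excel, ...)` produces. When reading from disk, passing `col_types = "text"` to `read_excel()` should have a similar effect to the character conversion.

```r
library(dplyr)
library(purrr)

# Hypothetical helper (not from the original answer).
# `sheets` is a named list of data frames, one per worksheet.
combine_address_sheets <- function(sheets, file, address_col = "Address") {
  # Convert every column to character so bind_rows() never meets mixed types.
  sheets <- map(sheets, ~ mutate(.x, across(everything(), as.character)))

  # TRUE for each sheet that has the (assumed) address column.
  has_address <- map_lgl(sheets, ~ address_col %in% names(.x))

  # Stack the sheets that do have an address column.
  combined <- bind_rows(sheets[has_address], .id = "sheet_name")

  # One placeholder row per sheet that was skipped.
  skipped <- tibble(
    sheet_name = as.character(names(sheets)[!has_address]),
    note       = rep("no address column", sum(!has_address))
  )

  bind_rows(combined, skipped) %>%
    mutate(file_name = file) %>%
    select(file_name, sheet_name, everything())
}

# Made-up sample data standing in for the sheets of one workbook.
sheets <- list(
  `Customer Quick REF` = tibble(`Order No` = c("A1", "A2"),
                                Address    = c("1 Main St", "2 Oak Ave")),
  `CH Belt`            = tibble(`Order No` = c(101, 102),
                                Address    = c("3 Elm Rd", "4 Pine Ln")),
  Veggies              = tibble(Vegetable  = c("carrot", "kale"))
)

result <- combine_address_sheets(sheets, "test1.xlsx")
print(result)
```

In this sample, the `Veggies` sheet contributes a single row whose `note` is filled in and whose other data columns are `NA`. That row is how the combined file records that the sheet was skipped.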
answered 2021-11-16T19:49:21.447