r - Naming a dataframe like the path

Question

I have a lot of CSV files that need to be standardized. I created a dictionary for this, and so far my function looks like this:

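library(readr)

inputpath <- "input"

# pattern is a regular expression, not a glob, so match the ".gz" suffix explicitly
files <- list.files(path = inputpath, pattern = "\\.gz$", full.names = TRUE)

standardizefunctiontofiles <- lapply(files, function(x) {
    DF <- read_delim(x, delim = "|", na = "")
    names(DF) <- dictionary$final_name[match(names(DF), dictionary$old_name)]
    DF  # return the renamed data frame rather than only the vector of names
})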

The issue is that once I read the CSV files into data frames, they lose their path, so I can't write each of them back out as a CSV whose name matches the input. What I would normally do is:

output_name <- str_replace(x, "input", "output")
write_delim(DF, output_name, delim = "|")

I was thinking that a way of solving this would be to make this step:

DF <- read_delim(x, delim = "|",  na="")

so that the DF is named after the path, but I haven't found a way to do that.

Any ideas on how to solve this, so that I can apply the function and write each data frame out as a standardized CSV?

score 0 · Accepted Answer

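I don't fully understand the question. As far as I can tell, you want to overwrite the CSV files you are reading with new CSV files that contain the modified (and corrected) data frames.

I think you have three options.

Option 1) When reading the data, store the CSV as a data frame and the path as a string, together in a list.

That would look like this: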

file_list <- list()

for (i in seq_along(files)) {
  file_list[[i]] <- list(df = read_delim(files[[i]], delim = "|",  na = ""),
                         path = files[[i]])
}

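Then, when you write out the corrected data frames, you can use the path stored in the second element of each inner list in file_list. Note that to get the path as a string, you need something like file_list[[1]][["path"]].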
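
As a rough sketch of that write step, the loop below walks over file_list and writes each data frame under an output folder, reusing the file name from the stored path. The helper name write_file_list and the out_dir argument are hypothetical and not part of the original question. It assumes the data frames in the list have already been corrected.

library(readr)

# Hypothetical helper: write every stored data frame to out_dir,
# reusing the file name kept in the "path" element of each entry.
write_file_list <- function(file_list, out_dir = "output") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  for (entry in file_list) {
    # basename() drops the input folder, file.path() prepends the output folder
    out_path <- file.path(out_dir, basename(entry[["path"]]))
    write_delim(entry[["df"]], out_path, delim = "|", na = "")
  }
  invisible(file_list)
}

Calling write_file_list(file_list) after the loop above should leave one file per input in the output folder, with the same file names as the inputs.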

Option 2) Use assign

for (i in seq_along(files)) {
   assign(files[[i]], read_delim(files[[i]], delim = "|",  na = ""))
}
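
One thing to keep in mind with this option: a path such as "input/example.gz" is not a syntactic R name, so the object has to be fetched with get() or with backticks. The small illustration below uses a made-up path and a toy data frame instead of a real file.

# Made-up path and toy data frame, for illustration only
path <- "input/example.gz"
assign(path, data.frame(a = 1:3))   # same pattern as in the loop above

df_by_get  <- get(path)             # look the object up by its string name
df_by_tick <- `input/example.gz`    # backticks are needed for a non-syntactic name
identical(df_by_get, df_by_tick)

Because the object name is the path itself, something like write_delim(get(path), path, delim = "|") could later write the data frame back to the same location.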

Option 3) Use do.call and the fact that <- is a function!

for (i in seq_along(files)) {
   do.call("<-", list(files[[i]], read_delim(files[[i]], delim = "|",  na = "")))
}

I hope this is useful!

NB) None of these snippets is implemented as efficiently as possible. They only introduce the idea.
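
If the data frames do not need to stay in memory, a more direct variant might read, rename, and write each file in one pass, so the path never gets lost in the first place. This is only a sketch: the function name standardize_file and the out_dir argument are hypothetical, the dictionary columns are assumed to be character vectors, and keeping the original name for columns that are missing from the dictionary is an assumption rather than something the question specifies.

library(readr)

# Hypothetical helper: standardize one file and write it under out_dir
standardize_file <- function(path, dictionary, out_dir = "output") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  df <- read_delim(path, delim = "|", na = "")
  idx <- match(names(df), dictionary$old_name)
  # Assumption: columns without a dictionary entry keep their original name
  names(df)[!is.na(idx)] <- dictionary$final_name[idx[!is.na(idx)]]
  out_path <- file.path(out_dir, basename(path))
  write_delim(df, out_path, delim = "|", na = "")
  out_path  # return where the standardized file was written
}

# Possible usage, with the folder names from the question:
# files <- list.files("input", pattern = "\\.gz$", full.names = TRUE)
# written <- vapply(files, standardize_file, character(1), dictionary = dictionary)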
