r - Reading a .csv file with an unknown path - R

Question

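I know this is probably a very stupid question, but I have spent hours on it.

I want to read a .csv file whose full path I don't have (*/*data.csv). I know the following finds matching files in the current directory, but I don't know how to adapt it: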

Marks <- read.csv(dir(path = '.', full.names=T, pattern='^data.*\\.csv'))

I also tried this, but it didn't work:

Marks <- read.csv(file = "*/*/data.csv", sep = ",", header=FALSE))

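I can't hard-code a specific path, because this will be used on different machines with different paths, but I am sure about the subfolders of the main directory, since they are the result of a bash script.

I intend to call it from Unix, where the working directory is defined.

My data structure is: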

lecture01/test/data.csv
lecture02/test/data.csv
lecture03/test/data.csv

score 2 · Accepted Answer

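Your comments (though not currently your question itself) indicate that you want to run your code in a working directory that contains some subdirectories (lecture01, lecture02, etc.), each of which contains a subdirectory "marks", which in turn contains a data.csv file. If so, and your goal is to read the csv from within each subdirectory, then you have a few options, depending on the remaining details.

Case 1: Specify the top-level directory names directly, if you know them all and they are potentially idiosyncratic: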

dirs <- c("lecture01", "lecture02", "some_other_dir")
paths <- file.path(dirs, "marks/data.csv")

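Case 2: Construct the top-level directory names, e.g. if they all start with "lecture" followed by a two-digit number, and you are able (or specifically wish) to specify a numeric range, e.g. 01 through 15: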

dirs <- sprintf("lecture%02d", 1:15)
paths <- file.path(dirs, "marks/data.csv")
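Since constructed names are only guesses, some of them may not exist on a given machine. A minimal sketch of filtering them, assuming the "marks/data.csv" layout described above and that a missing lecture should simply be skipped rather than treated as an error:

```r
# Hypothetical range; adjust it to the lectures you expect.
dirs <- sprintf("lecture%02d", 1:15)
paths <- file.path(dirs, "marks", "data.csv")

# Keep only the directories whose data.csv is really there.
found <- file.exists(paths)
dirs <- dirs[found]
paths <- paths[found]
```

If nothing matches, both vectors end up empty, and a later lapply over paths just returns an empty list.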

Case 3: Determine the top-level directory names by matching a pattern, e.g. if you want to read data from within every directory starting with the string "lecture":

matched.names <- list.files(".", pattern="^lecture")
dirs <- matched.names[file.info(matched.names)$isdir]
paths <- file.path(dirs, "marks/data.csv")

Once you have a vector of the paths, I'd probably use lapply to read the data into a list for further processing, naming each one with the base directory name:

csv.data <- lapply(paths, read.csv)
names(csv.data) <- dirs
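As a rough end-to-end illustration of the list approach, here is a sketch that first builds a throwaway copy of the assumed layout under tempdir() so it can run anywhere. The demo directory name, the column names (student, mark) and the final stacking step are my own assumptions for illustration, not something taken from the question:

```r
# Build a small copy of the assumed layout in a temporary directory.
root <- file.path(tempdir(), "csv_list_demo")
dirs <- c("lecture01", "lecture02")
for (d in dirs) {
  dir.create(file.path(root, d, "marks"), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(student = c("a", "b"), mark = c(1, 2)),
            file.path(root, d, "marks", "data.csv"), row.names = FALSE)
}

paths <- file.path(root, dirs, "marks", "data.csv")

# Read every file into a list named by its lecture directory.
csv.data <- setNames(lapply(paths, read.csv), dirs)

# Optionally stack everything, recording the source directory in a new column.
all.marks <- do.call(rbind, Map(function(df, d) cbind(lecture = d, df),
                                csv.data, dirs))
```

Each data frame is then available by name, e.g. csv.data$lecture01, and all.marks holds the rows from every file in one data frame.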

Alternatively, if whatever processing you do on each individual CSV is done just for its side effects, such as modifying the data and writing out a new version, and especially if you don't ever want all of them to be in memory at the same time, then use a loop.
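A sketch of the loop variant, again using a throwaway layout under tempdir(). The processing step (doubling a mark column) and the output file name data_processed.csv are placeholders I made up for illustration:

```r
# Build a small copy of the assumed layout in a temporary directory.
root <- file.path(tempdir(), "csv_loop_demo")
dirs <- c("lecture01", "lecture02")
for (d in dirs) {
  dir.create(file.path(root, d, "marks"), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(student = c("a", "b"), mark = c(1, 2)),
            file.path(root, d, "marks", "data.csv"), row.names = FALSE)
}

paths <- file.path(root, dirs, "marks", "data.csv")

# Handle one file at a time so only one data frame is in memory at once.
for (p in paths) {
  marks <- read.csv(p)
  marks$mark <- marks$mark * 2                        # placeholder processing
  out <- file.path(dirname(p), "data_processed.csv")  # hypothetical output name
  write.csv(marks, out, row.names = FALSE)
}
```

Nothing is accumulated across iterations here; each result is written next to its input and then overwritten in memory by the next file.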

If this answer misses the mark, or even if it doesn't, it would be great if you could clarify the question accordingly.

score 0 · Accepted Answer

I don't have code, but I would do a recursive glob from the root directory and a preg_match to find the .csv files (using glob braces).
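preg_match is a PHP function, but the same idea can probably be sketched in base R: Sys.glob expands wildcards such as */*/data.csv, and list.files with recursive = TRUE searches to any depth using a regular expression. The demo layout below (the lectureNN/test/data.csv structure from the question, recreated under tempdir()) and the mark column are assumptions made only so the example runs on its own:

```r
# Recreate the layout from the question in a temporary directory.
root <- file.path(tempdir(), "csv_glob_demo")
for (d in c("lecture01", "lecture02")) {
  dir.create(file.path(root, d, "test"), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(mark = 1),
            file.path(root, d, "test", "data.csv"), row.names = FALSE)
}

# Wildcard expansion: one "*" per unknown directory level.
globbed <- Sys.glob(file.path(root, "*", "*", "data.csv"))

# Recursive search to any depth, matching the file name with a regex.
searched <- list.files(root, pattern = "^data\\.csv$",
                       recursive = TRUE, full.names = TRUE)

# Read the matches, naming each by its top-level lecture directory.
Marks <- lapply(globbed, read.csv)
names(Marks) <- basename(dirname(dirname(globbed)))
```

When run from the working directory itself, Sys.glob("*/*/data.csv") would be the direct counterpart of the pattern that read.csv could not expand on its own.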
