9

I have a simple question about the list.files() function. I have a folder containing a list of files named like this:

DF2.txt
DF3.txt
DF4.txt
DF5.txt
...

When I run the following command,

files <- list.files(pattern = ".txt")

the vector returns the values in the following order:

"DF10.txt"
"DF11.txt"
"DF12.txt"
...
"DF2.txt"
"DF20.txt"
"DF21.txt"
...
"DF3.txt"
"DF30.txt"
"DF31.txt"
...

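and so on. I would like the files listed in increasing numerical order, as they appear in the folder. Why does R change the order of the files in the folder, and how can I rearrange the output of list.files() to match the original order?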

4 Answers

28

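As far as the computer is concerned, the files are sorted correctly. However, you can get the ordering you want with mixedsort from the "gtools" package: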

> myFiles <- paste("file", 1:20, ".txt", sep = "")
> sort(myFiles)
 [1] "file10.txt" "file11.txt" "file12.txt" "file13.txt" "file14.txt" "file15.txt"
 [7] "file16.txt" "file17.txt" "file18.txt" "file19.txt" "file1.txt"  "file20.txt"
[13] "file2.txt"  "file3.txt"  "file4.txt"  "file5.txt"  "file6.txt"  "file7.txt" 
[19] "file8.txt"  "file9.txt" 
> library(gtools)
> mixedsort(sort(myFiles))
 [1] "file1.txt"  "file2.txt"  "file3.txt"  "file4.txt"  "file5.txt"  "file6.txt" 
 [7] "file7.txt"  "file8.txt"  "file9.txt"  "file10.txt" "file11.txt" "file12.txt"
[13] "file13.txt" "file14.txt" "file15.txt" "file16.txt" "file17.txt" "file18.txt"
[19] "file19.txt" "file20.txt"
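
To see why the default order is "correct" from the computer's point of view, here is a minimal sketch with made-up names (not the asker's real files). Strings are compared one character at a time from the left, so a name starting with "DF1" sorts before one starting with "DF2" regardless of what follows:

```r
# Character comparison goes left to right, one character at a time,
# so "10" sorts before "2" because "1" comes before "2".
"10" < "2"   # TRUE: compared as strings
10 < 2       # FALSE: compared as numbers

# Made-up file names, for illustration only.
x <- c("DF2.txt", "DF10.txt", "DF3.txt")
alphabetical <- sort(x)
alphabetical  # "DF10.txt" "DF2.txt" "DF3.txt"
```

The exact placement of names such as "file1.txt" relative to "file10.txt" can also vary with the locale's collation rules, which is why sort() output can differ between machines.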

Applied to your example, mixedsort can be used like this:

files <- list.files(pattern = ".txt")
library(gtools)
files <- mixedsort(files)
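
A side note on the pattern itself: the pattern argument of list.files() is a regular expression, not a shell glob, so ".txt" means "any character followed by txt" anywhere in the name. The sketch below uses a throwaway temporary directory and made-up file names (both are my assumptions, not part of the question) to show the difference an anchored pattern can make:

```r
# Create a throwaway directory with made-up file names for illustration.
demo_dir <- tempfile("listfiles_demo")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("DF2.txt", "DF10.txt", "notes_txt.csv")))

# ".txt" is a regex: any character followed by "txt", anywhere in the name.
loose <- list.files(demo_dir, pattern = ".txt")

# "\\.txt$" requires a literal dot and anchors the match at the end of the name.
strict <- list.files(demo_dir, pattern = "\\.txt$")

loose   # also contains "notes_txt.csv"
strict  # only the two files ending in .txt
```

If the folder only ever contains .txt files, the looser pattern gives the same result, so this matters mainly for mixed folders.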

User functions are fun

Since small utility functions are easy to write, you could also wrap this in a little function of your own:

ListFiles <- function(pattern = ".txt") {
  # mixedsort() comes from gtools; library() stops with an error if it is missing
  library(gtools)
  myFiles <- list.files(pattern = pattern)
  mixedsort(myFiles)
}

Then compare:

list.files(pattern = ".txt")
ListFiles(pattern = ".txt")
answered 2013-04-11T09:15:12.463
9

The numbers are being sorted alphabetically. For a base R approach, you can do the following:

dat = sort(paste("DF", 1:100, ".txt", sep = ""))
numbers = as.numeric(regmatches(dat, regexpr("[0-9]+", dat)))
dat[order(numbers)]
  [1] "DF1.txt"   "DF2.txt"   "DF3.txt"   "DF4.txt"   "DF5.txt"   "DF6.txt"  
  [7] "DF7.txt"   "DF8.txt"   "DF9.txt"   "DF10.txt"  "DF11.txt"  "DF12.txt" 
 [13] "DF13.txt"  "DF14.txt"  "DF15.txt"  "DF16.txt"  "DF17.txt"  "DF18.txt" 
 [19] "DF19.txt"  "DF20.txt"  "DF21.txt"  "DF22.txt"  "DF23.txt"  "DF24.txt" 
 [25] "DF25.txt"  "DF26.txt"  "DF27.txt"  "DF28.txt"  "DF29.txt"  "DF30.txt" 
 [31] "DF31.txt"  "DF32.txt"  "DF33.txt"  "DF34.txt"  "DF35.txt"  "DF36.txt" 
 [37] "DF37.txt"  "DF38.txt"  "DF39.txt"  "DF40.txt"  "DF41.txt"  "DF42.txt" 
 [43] "DF43.txt"  "DF44.txt"  "DF45.txt"  "DF46.txt"  "DF47.txt"  "DF48.txt" 
 [49] "DF49.txt"  "DF50.txt"  "DF51.txt"  "DF52.txt"  "DF53.txt"  "DF54.txt" 
 [55] "DF55.txt"  "DF56.txt"  "DF57.txt"  "DF58.txt"  "DF59.txt"  "DF60.txt" 
 [61] "DF61.txt"  "DF62.txt"  "DF63.txt"  "DF64.txt"  "DF65.txt"  "DF66.txt" 
 [67] "DF67.txt"  "DF68.txt"  "DF69.txt"  "DF70.txt"  "DF71.txt"  "DF72.txt" 
 [73] "DF73.txt"  "DF74.txt"  "DF75.txt"  "DF76.txt"  "DF77.txt"  "DF78.txt" 
 [79] "DF79.txt"  "DF80.txt"  "DF81.txt"  "DF82.txt"  "DF83.txt"  "DF84.txt" 
 [85] "DF85.txt"  "DF86.txt"  "DF87.txt"  "DF88.txt"  "DF89.txt"  "DF90.txt" 
 [91] "DF91.txt"  "DF92.txt"  "DF93.txt"  "DF94.txt"  "DF95.txt"  "DF96.txt" 
 [97] "DF97.txt"  "DF98.txt"  "DF99.txt"  "DF100.txt"
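
If you need this more than once, the same idea can be wrapped in a small helper. This is only a sketch: the name sort_by_first_number is hypothetical, and it assumes that ordering by the first run of digits in each name is what you want. Unlike the one-liner above, it also copes with names that contain no digits by placing them last:

```r
# Hypothetical helper: order names by the first run of digits in each one.
sort_by_first_number <- function(x) {
  m <- regexpr("[0-9]+", x)        # position of the first digit run, -1 if none
  has_num <- m != -1
  num <- rep(NA_real_, length(x))
  # regmatches() returns only the matched elements, so assign into those slots.
  num[has_num] <- as.numeric(regmatches(x, m))
  # order() puts NA (names without digits) last; ties are broken by name.
  x[order(num, x)]
}

sorted <- sort_by_first_number(c("DF10.txt", "DF2.txt", "readme.txt", "DF1.txt"))
sorted  # "DF1.txt" "DF2.txt" "DF10.txt" "readme.txt"
```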
answered 2013-04-11T09:26:47.313
5

Or, if you want to stay within non-heathen bounds, you can use a plain regular expression.

> x <- paste("file", 1:20, ".txt", sep = "")
> sort(x)
 [1] "file1.txt"  "file10.txt" "file11.txt" "file12.txt" "file13.txt" "file14.txt" "file15.txt" "file16.txt" "file17.txt"
[10] "file18.txt" "file19.txt" "file2.txt"  "file20.txt" "file3.txt"  "file4.txt"  "file5.txt"  "file6.txt"  "file7.txt" 
[19] "file8.txt"  "file9.txt" 
> num.sort <- as.numeric(gsub("[^0-9]+", "", x))  # strip everything except the digits
> x[order(num.sort)]  # order() gives positions in x; sort() would give the values
 [1] "file1.txt"  "file2.txt"  "file3.txt"  "file4.txt"  "file5.txt"  "file6.txt"  "file7.txt"  "file8.txt"  "file9.txt" 
[10] "file10.txt" "file11.txt" "file12.txt" "file13.txt" "file14.txt" "file15.txt" "file16.txt" "file17.txt" "file18.txt"
[19] "file19.txt" "file20.txt"
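
One caveat about the indexing step: in general the index has to come from order(), not sort(). With the vector above the two happen to coincide, because x is already in numeric order and the numbers run from 1 to 20. A small sketch with a deliberately shuffled, made-up vector shows the difference:

```r
x <- c("file10.txt", "file2.txt", "file1.txt")  # deliberately not in numeric order
num <- as.numeric(gsub("[^0-9]+", "", x))       # 10, 2, 1

sort(num)   # 1 2 10  -> the sorted values themselves
order(num)  # 3 2 1   -> positions in x, smallest number first

by_order <- x[order(num)]  # "file1.txt" "file2.txt" "file10.txt"
by_sort  <- x[sort(num)]   # "file10.txt" "file2.txt" NA (picks positions 1, 2, 10)
```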
answered 2013-04-11T09:24:44.523
2

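This is a matter of natural sorting versus alphabetical sorting. For me, the package named naturalsort works best for displaying file names in a human-readable way.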

> #install.packages("naturalsort")
> library("naturalsort")
> x <-  paste(c("2file1","2file2","10file1.2","10file0.2","20file1","100",""))
> naturalsort(x)
[1] ""          "2file1"    "2file2"    "10file0.2" "10file1.2" "20file1"   "100"   
answered 2018-02-27T10:03:41.617