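I have an entrez command that I am running through a loop in R. It appears to work fine for a while, but I eventually get an error that I am having trouble figuring out.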
Error in system(command = paste0(<string modified in loop>, :
cannot popen '<paste'd string>', probable reason 'Too many open files'
The loop below starts failing at iteration 1020:
GeneraAddresses <- vector(mode = "list",
                          length = length(PossibleGenera))
for (m1 in seq_along(GeneraAddresses)) {
  GeneraAddresses[[m1]] <- try(system(command = paste0("esearch -db assembly -query ",
                                                       "'",
                                                       PossibleGenera[m1],
                                                       "[organism] AND \"complete genome\"[filter]",
                                                       " AND \"latest genbank\"[filter]",
                                                       " AND \"genbank has annotation\"[Properties]",
                                                       "'",
                                                       " | ",
                                                       "efetch -format docsum",
                                                       " | ",
                                                       "xtract -pattern DocumentSummary -block FtpPath",
                                                       ' -match "@type:genbank"',
                                                       " -element FtpPath"),
                                      timeout = 300L,
                                      intern = TRUE))
  print(showConnections(all = TRUE))
  print(m1)
  closeAllConnections()
}
You do not need the actual vector of genera that I am pulling from to reproduce this:
rep("streptomyces", 2000) -> PossibleGenera
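should do just fine. I have not been able to find a way to make the error appear earlier (to make diagnosis easier), and asking R to close all open connections (included in the code above) does not seem to help. I know I could resort to breaking my vector into smaller pieces and running it that way, but that feels a bit like giving up.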
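For diagnosis, here is a minimal sketch of how the number of files held open by the R process could be tracked on each iteration. It assumes the `lsof` utility is on the PATH, and `count_open_files` is a hypothetical helper name that is not part of my original loop. The count is rough, since `lsof` also lists entries such as the working directory and mapped libraries.

count_open_files <- function(pid = Sys.getpid()) {
  # Ask lsof for every file entry owned by this process.
  # Assumption: lsof is installed; if it is missing, the call yields no lines.
  out <- suppressWarnings(
    system2("lsof",
            args = c("-p", as.character(pid)),
            stdout = TRUE,
            stderr = FALSE)
  )
  # The first line of lsof output is a column header, so drop it from the count.
  max(length(out) - 1L, 0L)
}

# Example: report the current count once.
print(count_open_files())

Calling this inside the loop in place of print(showConnections(all = TRUE)) would show whether the count climbs toward the per-process limit that `ulimit -n` reports in the shell.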
Running on macOS:
> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] Heron_0.0.0.9023 igraph_1.2.4.2 ape_5.3 stringr_1.4.0 DECIPHER_2.14.0 RSQLite_2.2.0 Biostrings_2.54.0
[8] XVector_0.26.0 IRanges_2.20.2 S4Vectors_0.24.3 BiocGenerics_0.32.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.3 magrittr_1.5 zlibbioc_1.32.0 bit_1.1-15.1 lattice_0.20-38 rlang_0.4.3 blob_1.2.1 tools_3.6.2 grid_3.6.2
[10] nlme_3.1-143 DBI_1.1.0 bit64_0.9-7 digest_0.6.23 vctrs_0.2.2 memoise_1.1.0 stringi_1.4.5 compiler_3.6.2 pkgconfig_2.0.3