When trying to parse more than 7000 txt files in R using the readtext library (and the accompanying quanteda library), I get the following warning.
Warning message: In (function (..., deparse.level = 1) : number of columns of result is not a multiple of vector length (arg 2030)
How can I determine which txt file is causing the warning?
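Using the verbose option does not show where the warning occurs. For your information, when I try to parse two files I get the following output (btw, the warning does not appear if I parse only 1 document at a time).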
Reading texts from /Users/OS/surfdrive/Competenties/Data-step-1/BinnenlandsBestuur/1982/9-12/Office Lens 20170308-102311.jpg.txt
Reading texts from /Users/OS/surfdrive/Competenties/Data-step-1/BinnenlandsBestuur/1983/Office Lens 20170308-103518.jpg.txt
, using glob pattern ... reading (txt) file: Office Lens 20170308-102311.jpg.txt
, using glob pattern ... reading (txt) file: Office Lens 20170308-103518.jpg.txt
... read 2 documents.
Warning messages:
1: In (function (..., deparse.level = 1) : number of columns of result is not a multiple of vector length (arg 2)
2: In if (verbosity == 2 & nchar(msg) > 70) pad <- paste0("\n", pad) : the condition has length > 1 and only the first element will be used
Session info
R version 3.4.0 (2017-04-21)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.5
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] C/C/C/C/C/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] tm.plugin.webmining_1.3 XML_3.98-1.7 readtext_0.50 RoogleVision_0.0.1.1
[5] outliers_0.14 stringdist_0.9.4.4 ltm_1.0-0 polycor_0.7-9
[9] msm_1.6.4 MASS_7.3-47 psych_1.7.5 WriteXLS_4.0.0
[13] plyr_1.8.4 metafor_2.0-0 Matrix_1.2-9 metaSEM_0.9.14
[17] OpenMx_2.7.12 xlsx_0.5.7 xlsxjars_0.6.1 rJava_0.9-8
[21] readxl_1.0.0 quanteda_0.9.9-65 koRpus.lang.nl_0.01-3 koRpus_0.11-1
[25] sylly_0.1-1 jsonlite_1.5 httr_1.2.1
loaded via a namespace (and not attached):
[1] sylly.ru_0.1-1 splines_3.4.0 ellipse_0.3-8 RcppParallel_4.3.20 shiny_1.0.3
[6] sylly.it_0.1-1 expm_0.999-2 sylly.es_0.1-1 cellranger_1.1.0 slam_0.1-40
[11] yaml_2.1.14 backports_1.1.0 lattice_0.20-35 digest_0.6.12 googleAuthR_0.5.1
[16] colorspace_1.3-2 htmltools_0.3.6 httpuv_1.3.3 tm_0.7-1 devtools_1.13.2
[21] xtable_1.8-2 mvtnorm_1.0-6 scales_0.4.1 tibble_1.3.3 openssl_0.9.6
[26] ggplot2_2.2.1 withr_1.0.2 lazyeval_0.2.0 NLP_0.1-10 mnormt_1.5-5
[31] RJSONIO_1.3-0 survival_2.41-3 magrittr_1.5 mime_0.5 memoise_1.1.0
[36] evaluate_0.10 boilerpipeR_1.3 nlme_3.1-131 foreign_0.8-67 rsconnect_0.8
[41] tools_3.4.0 data.table_1.10.4 stringr_1.2.0 munsell_0.4.3 compiler_3.4.0
[46] rlang_0.1.1 grid_3.4.0 RCurl_1.95-4.8 bitops_1.0-6 rmarkdown_1.5
[51] gtable_0.2.0 curl_2.6 R6_2.2.2 sylly.en_0.1-1 knitr_1.16
[56] fastmatch_1.1-0 sylly.fr_0.1-1 rprojroot_1.2 stringi_1.1.5 parallel_3.4.0
[61] sylly.de_0.1-1 Rcpp_0.12.11
Thank you, Peter
PS. If this information is not sufficient, I will post a reproducible example on the GitHub page.