我想从IMF将 .xls 文件从 URL 直接读取到 R 中,但到目前为止所有尝试都失败了。奇怪的是,我可以手动下载文件download.file()
,也可以在 Microsoft Outlook 或文本编辑器中毫无问题地打开它。但是,即使那样我也无法将数据读入 R。
我总是尝试同时使用https
和http
。
myUrl <- "https://www.imf.org/external/pubs/ft/weo/2019/02/weodata/WEOOct2019all.xls"
myUrl2 <- "http://www.imf.org/external/pubs/ft/weo/2019/02/weodata/WEOOct2019all.xls"
1. 经典方法——失败。
imf <- read.table(file=myUrl, sep="\t", header=TRUE)
# Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
# line 51 did not have 55 elements
imf <- read.table(file=url(myUrl), sep="\t", header=TRUE)
# Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
# line 51 did not have 55 elements
2. 几个包——失败。
imf <- readxl::read_xls(myUrl)
# Error: `path` does not exist: ‘https://www.imf.org/external/pubs/ft/weo/2019/02/weodata/WEOOct2019all.xls’
imf <- readxl::read_xls(myUrl2)
# Error: `path` does not exist: ‘http://www.imf.org/external/pubs/ft/weo/2019/02/weodata/WEOOct2019all.xls’
imf <- gdata::read.xls(myUrl)
# Error in xls2sep(xls, sheet, verbose = verbose, ..., method = method, :
# Intermediate file 'C:\Users\jay\AppData\Local\Temp\RtmpUtW45x\file16f873be18e0.csv' missing!
# In addition: Warning message:
# In system(cmd, intern = !verbose) :
# running command '"C:\STRAWB~1\perl\bin\perl.exe"
# "C:/Program Files/R/R-3.6.1rc/library/gdata/perl/xls2csv.pl"
# "https://www.imf.org/external/pubs/ft/weo/2019/02/weodata/WEOOct2019all.xls"
# "C:\Users\jay\AppData\Local\Temp\RtmpUtW45x\file16f873be18e0.csv" "1"' had status 2
# Error in file.exists(tfn) : invalid 'file' argument
imf <- gdata::read.xls(myUrl2) # <---------------------------------------------- THIS DOWNLOADS SOMETHING AT LEAST!
# trying URL 'http://www.imf.org/external/pubs/ft/weo/2019/02/weodata/WEOOct2019all.xls'
# Content type 'application/vnd.ms-excel' length unknown
# downloaded 8.9 MB
#
# Error in xls2sep(xls, sheet, verbose = verbose, ..., method = method, :
# Intermediate file 'C:\Users\jay\AppData\Local\Temp\RtmpUtW45x\file16f87ded406b.csv' missing!
# In addition: Warning message:
# In system(cmd, intern = !verbose) :
# running command '"C:\STRAWB~1\perl\bin\perl.exe"
# "C:/Program Files/R/R-3.6.1rc/library/gdata/perl/xls2csv.pl"
# "C:\Users\jay\AppData\Local\Temp\RtmpUtW45x\file16f87f532cb3.xls"
# "C:\Users\jay\AppData\Local\Temp\RtmpUtW45x\file16f87ded406b.csv" "1"' had status 255
# Error in file.exists(tfn) : invalid 'file' argument
3. Tempfile 方法——失败。
temp <- tempfile()
download.file(myUrl, temp) # THIS WORKS...
## BUT...
imf <- gdata::read.xls(temp)
# Error in xls2sep(xls, sheet, verbose = verbose, ..., method = method, :
# Intermediate file 'C:\Users\jay\AppData\Local\Temp\RtmpUtW45x\file16f870f55e04.csv' missing!
# In addition: Warning message:
# In system(cmd, intern = !verbose) :
# running command '"C:\STRAWB~1\perl\bin\perl.exe"
# "C:/Program Files/R/R-3.6.1rc/library/gdata/perl/xls2csv.pl"
# "C:\Users\jay\AppData\Local\Temp\RtmpUtW45x\file16f8746a46db"
# "C:\Users\jay\AppData\Local\Temp\RtmpUtW45x\file16f870f55e04.csv" "1"' had status 255
# Error in file.exists(tfn) : invalid 'file' argument
# even not...
tmp1 <- readLines(temp)
# Warning message:
# In readLines(temp) :
# incomplete final line found on
# 'C:\Users\jay\AppData\Local\Temp\Rtmp00GPlq\file2334435c2905'
str(tmp1)
# chr [1:8733] "WEO Country Code\tISO\tWEO Subject Code\tCountry\tSubject
# Descriptor\tSubject Notes\tUnits\tScale\tCountry/Seri"| __truncated__ ...
4. SDMX
我也试过SDMX IMF的offer,也没有成功。可能这将是一种更复杂的方法,但我从未使用过 SDMX。
link <- "https://www.imf.org/external/pubs/ft/weo/2019/02/weodata/WEOOct2019_SDMXData.zip"
temp <- tempfile()
download.file(link, temp, quiet=TRUE)
imf <- rsdmx::readSDMX(temp)
# Error in function (type, msg, asError = TRUE) :
# Could not resolve host: C
# imf <- rsdmx::readSDMX(unzip(temp)) # runs forever and crashes R
unlink(temp)
现在...有人知道发生了什么,以及如何将数据加载到 R 中吗?