Why can't I read the downloaded file with readLines? How should I read it?

url="http://www.hkex.com.hk/chi/market/sec_tradinfo/stockcode/eisdeqty_c.htm"
txt=download.file(url,destfile="stock")
> file1=readLines("stock",encoding="big5")
Warning messages:
1: In readLines("stock", encoding = "big5") :
invalid input found on input connection 'stock'
2: In readLines("stock", encoding = "big5") :
incomplete final line found on 'stock'
> file1=readLines("stock",encoding="gbk")
Warning messages:
1: In readLines("stock", encoding = "gbk") :
invalid input found on input connection 'stock'
2: In readLines("stock", encoding = "gbk") :
incomplete final line found on 'stock'
> file1=readLines("stock",encoding="gb2132")
Warning messages:
1: In readLines("stock", encoding = "gb2132") :
invalid input found on input connection 'stock'
2: In readLines("stock", encoding = "gb2132") :
incomplete final line found on 'stock'
> file1=readLines("stock",encoding="gb18030")
Warning messages:
1: In readLines("stock", encoding = "gb18030") :
 invalid input found on input connection 'stock'
2: In readLines("stock", encoding = "gb18030") :
incomplete final line found on 'stock'

The resulting object contains only part of the file; much of the content is missing. Why?

1 Answer

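The file contains 18 lines, and my R read all 18 of them. I suspect you are overlooking the difference between a text file and an HTML file. To extract an HTML table, you will need something that parses HTML rather than reading plain lines.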
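As a rough illustration of that difference, here is a minimal sketch of pulling a table out of HTML markup. It assumes the XML package is installed, which the question does not mention. The inline `html` string and its rows are made up so the example runs without network access; they are not the contents of the HKEX page.

```r
# Assumption: the XML package is available (install.packages("XML")).
library(XML)

# A tiny stand-in for the downloaded page. The rows are hypothetical
# placeholders, not real data from the site.
html <- paste(
  "<html><body><table>",
  "<tr><th>code</th><th>name</th></tr>",
  "<tr><td>00001</td><td>Example A</td></tr>",
  "<tr><td>00002</td><td>Example B</td></tr>",
  "</table></body></html>",
  sep = "\n"
)

# Parse the markup into a document tree instead of treating it as lines of text.
doc <- htmlParse(html, asText = TRUE)

# readHTMLTable() returns a list with one entry per <table> element found.
tables <- readHTMLTable(doc, stringsAsFactors = FALSE)
stocks <- tables[[1]]
print(stocks)
```

For the file from the question, the same idea would presumably be `htmlParse("stock", encoding = ...)` followed by `readHTMLTable()`. The correct encoding value depends on what the page actually declares, which is not shown here, and the table of interest may not be the first one in the list.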

Answered 2012-09-06T06:36:38.247