
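I am trying to read a binary file into R, but the file stores its data row by row in binary form. In other words, the values belonging to one column are not stored together; each record is written out in full before the next one begins. This is what one record looks like: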

Bytes 1-4:            int        ID
Byte 5:               char       response character
Bytes 6-9:            int        Resp Dollars
Byte 10:              char       Type char

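Can anyone help me figure out how to read this file into R?

Here is the code I have tried so far. I have tried a few things, with limited success. Unfortunately I can't post any of the data on a public site, sorry. I am relatively new to R, so I need some help with how to improve the code.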

> binfile = file("File Location", "rb")
> IDvals = readBin(binfile, integer(), size=4, endian = "little")
> Responsevals = readBin(binfile, character (), size = 5)
> ResponseDollarsvals = readBin (binfile, integer (), size = 9, endian= "little")
Error in readBin(binfile, integer(), size = 9, endian = "little") : 
  size 9 is unknown on this machine
> Typevals = readBin (binfile, character (), size=4)
> binfile1= cbind(IDvals, Responsevals, ResponseDollarsvals, Typevals)
> dimnames(binfile1)[[2]]
[1] "IDvals"            "Responsevals"        "ResponseDollarsvals" "Typevals"  

> colnames(binfile1)=binfile
Error in `colnames<-`(`*tmp*`, value = 4L) : 
  length of 'dimnames' [2] not equal to array extent

1 Answer


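You can open the file in raw binary mode ('rb') and then issue readBin or readChar calls to read each record field by field. Append each value to its column as you go.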

my.file <- file('path', 'rb')

id <- integer(0)
response <- character(0)
...

Loop over this block:

id = c(id, readBin(my.file, integer(), size = 4, endian = 'little'))
response = c(response, readChar(my.file, 1))
...
readChar(my.file, nchars = 1) # Skips a UNIX newline, if records end with one.  Use nchars = 2 for Windows newlines.

Then create your data frame.
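
The steps above can be put together as a minimal sketch. It assumes fixed 10-byte little-endian records with no line terminator between them, as in the layout from the question. The sample file, its values, and the variable names are hypothetical and exist only so the example runs on its own. Instead of growing the vectors with c(), it preallocates them from the file size, which is a small departure from the loop shown above.

```r
# Write a small hypothetical sample file: three 10-byte records.
path <- tempfile(fileext = ".bin")
ids     <- c(101L, 102L, 103L)
resp    <- c("Y", "N", "Y")
dollars <- c(250L, 0L, 75L)
types   <- c("A", "B", "A")

out <- file(path, "wb")
for (i in seq_along(ids)) {
  writeBin(ids[i], out, size = 4, endian = "little")      # bytes 1-4
  writeChar(resp[i], out, nchars = 1, eos = NULL)         # byte 5
  writeBin(dollars[i], out, size = 4, endian = "little")  # bytes 6-9
  writeChar(types[i], out, nchars = 1, eos = NULL)        # byte 10
}
close(out)

# Assumption: every record is exactly 10 bytes, with no newline after it.
record_size <- 10L
n <- file.size(path) %/% record_size

# Preallocate one vector per column.
id           <- integer(n)
response     <- character(n)
resp_dollars <- integer(n)
type_chr     <- character(n)

# Read the file back one record at a time, field by field.
con <- file(path, "rb")
for (i in seq_len(n)) {
  id[i]           <- readBin(con, integer(), n = 1, size = 4, endian = "little")
  response[i]     <- readChar(con, nchars = 1, useBytes = TRUE)
  resp_dollars[i] <- readBin(con, integer(), n = 1, size = 4, endian = "little")
  type_chr[i]     <- readChar(con, nchars = 1, useBytes = TRUE)
}
close(con)

records <- data.frame(id, response, resp_dollars, type = type_chr,
                      stringsAsFactors = FALSE)
print(records)
```

Note that size in readBin is the width of a single value in bytes (4 for these integers), not the offset of the field within the record, which is why size = 9 fails in the question.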

See here: http://www.ats.ucla.edu/stat/r/faq/read_binary.htm
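
For larger files, a loop-free variant may be worth considering. This is a different technique from the one described above, offered only as a sketch under the same assumptions (fixed 10-byte little-endian records, no line terminators, hypothetical sample data and names): read the whole file as raw bytes, reshape it into a matrix with one column per record, and decode each field from its rows.

```r
# Write a small hypothetical sample file: three 10-byte records.
path <- tempfile(fileext = ".bin")
ids     <- c(101L, 102L, 103L)
resp    <- c("Y", "N", "Y")
dollars <- c(250L, 0L, 75L)
types   <- c("A", "B", "A")

out <- file(path, "wb")
for (i in seq_along(ids)) {
  writeBin(ids[i], out, size = 4, endian = "little")
  writeChar(resp[i], out, nchars = 1, eos = NULL)
  writeBin(dollars[i], out, size = 4, endian = "little")
  writeChar(types[i], out, nchars = 1, eos = NULL)
}
close(out)

# Assumption: every record is exactly 10 bytes, with no newline after it.
record_size <- 10L
bytes <- readBin(path, what = "raw", n = file.size(path))
stopifnot(length(bytes) %% record_size == 0)

# One column per record, one row per byte position within the record.
m <- matrix(bytes, nrow = record_size)
n <- ncol(m)

# as.vector() flattens column by column, so each record's 4 bytes stay together.
id <- readBin(as.vector(m[1:4, , drop = FALSE]), integer(),
              n = n, size = 4, endian = "little")
response <- rawToChar(m[5, ], multiple = TRUE)
resp_dollars <- readBin(as.vector(m[6:9, , drop = FALSE]), integer(),
                        n = n, size = 4, endian = "little")
type_chr <- rawToChar(m[10, ], multiple = TRUE)

records_vec <- data.frame(id, response, resp_dollars, type = type_chr,
                          stringsAsFactors = FALSE)
print(records_vec)
```

This holds the whole file in memory at once, so it trades memory for speed compared with reading record by record.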

Answered 2012-11-13T02:21:34.483