
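In R, 64-bit integers (Int64) cannot be round-tripped through JSON accurately, because the existing JSON libraries either coerce the value to a numeric (double) or expect numbers in scientific notation.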

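Does anyone know a way to serialize whole Int64 numbers to JSON and deserialize them from JSON without losing precision, or does this require modifying a library (perhaps RJSONIO)?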

The full story, including the libraries I have tried so far and the hacky workarounds I am using in the meantime:

> library(gmp)
> library(bit64)
> library(rjson)
> library(RJSONIO)
> 
> options.bak <- getOption("digits")
> options(digits = 22)
> 
> #This is our value! 
> int64.text <- "5812766036735097952"
> #This whole number loses precision when stored as a numeric.
> as.bigz(int64.text) - as.numeric(int64.text)
Big Integer ('bigz') :
[1] 96
> 
> #PROBLEM 1: Deserialization from JSON
> 
> #rjson parses this number as a numeric, and demonstrates the same loss.
> json.text <- "{\"record.id\":5812766036735097952}"
> rjson.parsed <- rjson::fromJSON(json.text)$record.id
> str(rjson.parsed)
 num 5.81e+18
> as.bigz(int64.text) - as.bigz(rjson.parsed)
Big Integer ('bigz') :
[1] 96
> #so does RJSONIO, a library that allows you to specify floating point precision.
> rjsonio.parsed <- RJSONIO::fromJSON(json.text, digits = 50)["record.id"]
> as.bigz(int64.text) - as.bigz(rjsonio.parsed)
Big Integer ('bigz') :
[1] 96
> 
> #For now, I have worked around this by rewriting the JSON with a regex. Here's a snippet, although
> #   I'm really processing a much larger JSON string. 
> modified.json.text <- gsub("record.id\\\":([0-9]+)", "record.id\\\":\"\\1\"", json.text)
> #Call rjson explicitly: the unqualified fromJSON is RJSONIO's, which returns an atomic vector here.
> id.text  <- rjson::fromJSON(modified.json.text)$record.id
> id.bigz <- as.bigz(id.text)
> id.bigz - as.bigz(int64.text)
Big Integer ('bigz') :
[1] 0
> id.bigz
Big Integer ('bigz') :
[1] 5812766036735097952
> 
> #However, hacking the JSON isn't really a good solution, and it relies on there being convenient tags
> # nearby for the regex match to work. Being able to parse into a precise data structure in the 
> # first place would be best. Sorry R, there are numbers larger than 2^32
> 
> ###PROBLEM 2: Serialization 
> #Neither rjson nor RJSONIO supports bigz objects:
> rjson::toJSON(as.bigz(int64.text))
Error in rjson::toJSON(as.bigz(int64.text)) : 
  unable to convert R type 24 to JSON
> RJSONIO::toJSON(as.bigz(int64.text), digits = 50)
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Error during wrapup: evaluation nested too deeply: infinite recursion / options(expressions=)?
> #integer64 will serialize, but as the raw bits of its double storage in scientific notation, not the integer value:
> toJSON(as.integer64(int64.text))
[1] "[ 4.0156e+80 ]"
> RJSONIO::toJSON(as.integer64(int64.text, digits = 50))
[1] "[ 4.0156e+80 ]"
> 
> #So again, another JSON hack is in order:
> encoded.json.out <- toJSON(c(record.id = paste0("INT64", int64.text)))
> #The pattern allows for the space that toJSON emits after the colon.
> modified.json.out <- gsub("record.id\\\": *\"INT64([0-9]+)\"", "record.id\\\": \\1", encoded.json.out)
> modified.json.out
[1] "{\n \"record.id\": 5812766036735097952 \n}"
> options(digits = options.bak)
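
A slightly more general form of the deserialization workaround, as a rough sketch only: rather than matching a specific tag, quote every bare integer literal of 16 or more digits that appears as an object value, so the parser hands it back as a string that can then be passed to as.bigz or as.integer64. The helper name quote_big_ints and the 16-digit threshold are assumptions made for illustration, not part of rjson or RJSONIO. It is still a text hack: it does not cover numbers inside arrays, and it would also rewrite a matching sequence that happens to sit inside a string value.

```r
# Hypothetical helper (not provided by rjson or RJSONIO): wrap long bare
# integer literals in quotes so a JSON parser returns them as strings
# instead of coercing them to double.
quote_big_ints <- function(json) {
  # (:\s*)          the key/value separator, kept as-is
  # (-?[0-9]{16,})  an integer literal of 16 or more digits (assumed threshold)
  # (?=\s*[,}\]])   lookahead: the literal must end the value, which
  #                 excludes decimals and exponent forms
  gsub("(:\\s*)(-?[0-9]{16,})(?=\\s*[,}\\]])", "\\1\"\\2\"", json, perl = TRUE)
}

json.text <- "{\"record.id\":5812766036735097952,\"count\":3}"
quoted <- quote_big_ints(json.text)
quoted
```

Short integers such as count are left untouched, so only the values that would lose precision come back as strings.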