Ability to JSON serialize and deserialize int64 wi

Posted 2019-07-28 23:05

Question:

In R, Int64 whole numbers cannot be accurately serialized to or deserialized from JSON, because the existing JSON libraries either coerce the value into a numeric (a double, which cannot hold every whole number above 2^53) or write the number in scientific notation.
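
To make the precision loss concrete, here is a minimal base R sketch. It assumes R's numeric type is an IEEE 754 double with a 53-bit significand, which is the case on common platforms. The variable names are only for illustration.

```r
# Above 2^53 a double can no longer represent every whole number,
# so adding 1 is silently lost.
limit <- 2^53
limit + 1 == limit            # TRUE on IEEE 754 platforms

# Round-trip the Int64 value through a numeric and print all of its digits.
int64.text <- "5812766036735097952"
round.trip <- sprintf("%.0f", as.numeric(int64.text))

# The text that comes back is not the text that went in.
identical(round.trip, int64.text)   # FALSE
```

Any JSON parser that hands the value back as a numeric has the same problem, whatever print precision is requested afterwards.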

Does anyone know of a way to serialize and deserialize whole Int64 numbers to/from JSON without losing precision, or is a library modification (probably to RJSONIO) required?

The full story, including libraries I have tried so far, and the gacky workarounds necessary for the interim:

> library(gmp)
> library(bit64)
> library(rjson)
> library(RJSONIO)
> 
> options.bak <- getOption("digits")
> options(digits = 22)
> 
> #This is our value! 
> int64.text <- "5812766036735097952"
> #This whole number loses precision when stored as a numeric.
> as.bigz(int64.text) - as.numeric(int64.text)
Big Integer ('bigz') :
[1] 96
> 
> #PROBLEM 1: Deserialization from JSON
> 
> #rjson parses this number as a numeric, and demonstrates the same loss.
> json.text <- "{\"record.id\":5812766036735097952}"
> rjson.parsed <- rjson::fromJSON(json.text)$record.id
> str(rjson.parsed)
 num 5.81e+18
> as.bigz(int64.text) - as.bigz(rjson.parsed)
Big Integer ('bigz') :
[1] 96
> #so does RJSONIO, a library that allows you to specify floating point precision.
> rjsonio.parsed <- RJSONIO::fromJSON(json.text, digits = 50)["record.id"]
> as.bigz(int64.text) - as.bigz(rjsonio.parsed)
Big Integer ('bigz') :
[1] 96
> 
> #For now, I have solved this by hacking the JSON with some regex magic. Here's a snippet, although
> #   I'm really processing a much larger JSON string. 
> modified.json.text <- gsub("record.id\\\":([0-9]+)", "record.id\\\":\"\\1\"", json.text)
> #Call rjson explicitly: RJSONIO masks fromJSON and returns an atomic vector, on which $ fails.
> id.text  <- rjson::fromJSON(modified.json.text)$record.id
> id.bigz <- as.bigz(id.text)
> id.bigz - as.bigz(int64.text)
Big Integer ('bigz') :
[1] 0
> id.bigz
Big Integer ('bigz') :
[1] 5812766036735097952
> 
> #However, hacking the JSON isn't really a good solution, and relies upon there being convenient tags
> # nearby for the regex match to work. Being able to serialize to a precise data structure in the 
> # first place is best. Sorry R, there are numbers larger than 2^32
> 
> ###Problem 2: Serialization 
> #Neither rjson nor RJSONIO supports bigz objects:
> rjson::toJSON(as.bigz(int64.text))
Error in rjson::toJSON(as.bigz(int64.text)) : 
  unable to convert R type 24 to JSON
> RJSONIO::toJSON(as.bigz(int64.text), digits = 50)
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Error during wrapup: evaluation nested too deeply: infinite recursion / options(expressions=)?
> #integer64 will serialize, but as a wrong value in scientific notation (bit64 keeps the 64 bits inside a double, and that double is what gets printed):
> toJSON(as.integer64(int64.text))
[1] "[ 4.0156e+80 ]"
> RJSONIO::toJSON(as.integer64(int64.text, digits = 50))
[1] "[ 4.0156e+80 ]"
> 
> #So again, another JSON hack is in order:
> encoded.json.out <- toJSON(c(record.id = paste0("INT64", int64.text)))
> #RJSONIO writes a space after the colon, so the pattern has to allow whitespace there.
> modified.json.out <- gsub("record.id\\\":\\s*\"INT64([0-9]+)\"", "record.id\\\":\\1", encoded.json.out)
> modified.json.out
[1] "{\n \"record.id\":5812766036735097952 \n}"
> options(digits = options.bak)
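
For anyone stuck with the same workaround, here is a rough base R sketch of both hacks without the dependence on a particular key name. The helpers `quote.big.ints` and `unquote.tagged.ints` are hypothetical names made up for this example, not functions from rjson, RJSONIO, gmp or bit64. The 16-digit threshold is also an assumption, picked because doubles stop being exact just past 2^53. It is still a regex over raw JSON: it only handles object members (not array elements) and will also rewrite a matching pattern that happens to sit inside a string value.

```r
# Hypothetical helper: before parsing, wrap every bare integer literal of
# min.digits or more digits that follows a colon in quotes, so the JSON
# parser returns a string instead of a lossy double.
quote.big.ints <- function(json, min.digits = 16L) {
  pattern <- sprintf("(:\\s*)(-?[0-9]{%d,})(\\s*[,}\\]])", min.digits)
  gsub(pattern, "\\1\"\\2\"\\3", json, perl = TRUE)
}

# Hypothetical helper: after serializing, turn tagged strings such as
# "INT645812766036735097952" back into bare JSON numbers.
unquote.tagged.ints <- function(json, tag = "INT64") {
  pattern <- sprintf("\"%s(-?[0-9]+)\"", tag)
  gsub(pattern, "\\1", json, perl = TRUE)
}

# Reading: the id becomes a string, small numbers are left alone.
json.in <- "{\"record.id\":5812766036735097952, \"count\": 42}"
quoted.in <- quote.big.ints(json.in)

# Writing: start from what a serializer would emit for a tagged string.
json.out <- "{\n \"record.id\": \"INT645812766036735097952\" \n}"
untagged.out <- unquote.tagged.ints(json.out)
```

After `quote.big.ints`, the parsed id is a character string that can go straight into `as.bigz()` or `as.integer64()`. On the way out, the id has to be pasted together with the tag as text before serializing, as in the transcript above.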