Struggling to import NZ companies extract into R

Posted 2019-08-23 14:37

The NZ companies register offers a JSON file containing all publicly available business info. This file comes in at a whopping 40 GB, but there is also a smaller JSON file (~250 MB) containing data on unincorporated entities (sole traders etc.). As a warm-up exercise I thought I'd have a go at importing it into R to get an idea of size, scalability and computational requirements.

I'm having a lot of trouble importing the smaller JSON file into R. I've tried jsonlite, RJSONIO and rjson, but it appears that the file is written in an 'unorthodox' JSON format, hence the standard 'fromJSON' commands are falling over. A portion of the file (2 entities), which I've been trying to import into R, is in test.json:

library(jsonlite)
json <- fromJSON("test.json", flatten=TRUE)

Error in parse_con(txt, bigint_as_char) : 
   parse error: invalid object key (must be a string)
      zbn": [{          "entity": [{            {               "australianBusinessNumbe
                 (right here) ------^

NB: JSONLint doesn't seem to think the file is a valid JSON file

My thought is that I may need to use stream_in() or readLines(), but I am not very proficient with these functions. Any help or insight greatly appreciated. Cheers
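
One possible way to approach the readLines() idea is sketched below. It rests on assumptions: the error message points at a doubled opening brace ("[{ {"), so the sketch guesses that each entity is wrapped in an extra, unnamed pair of braces that is closed again later. The sample text, the field name entityName and the key "record" are all made up for illustration; the real file may be malformed in other ways. Note also that stream_in() expects newline-delimited JSON (one complete record per line), so on its own it would probably hit the same parse error.

```r
library(jsonlite)

# Hypothetical stand-in for test.json. The doubled "{ {" mirrors the spot
# flagged in the error message; the field names and values are invented.
bad <- '{"nzbn": [{"entity": [{ {"entityName": "A"} }, { {"entityName": "B"} }]}]}'
path <- tempfile(fileext = ".json")
writeLines(bad, path)

# Read the file as plain text, without attempting to parse it.
txt <- paste(readLines(path, warn = FALSE), collapse = "\n")

# validate() returns FALSE for malformed input; attr(, "err") holds the
# parser's message.
is_valid_before <- validate(txt)

# Assumption: "{ {" never occurs inside a string value. Give each anonymous
# inner object a key ("record" is a made-up name), so that
# "{ { ... } }" becomes '{"record": { ... } }'.
fixed <- gsub("\\{\\s*\\{", '{"record": {', txt, perl = TRUE)
is_valid_after <- validate(fixed)

# Parse to nested lists first; try flatten = TRUE once the structure is clear.
parsed <- fromJSON(fixed, simplifyVector = FALSE)
second_name <- parsed$nzbn[[1]]$entity[[2]]$record$entityName
```

This reads the whole file into memory, which may be workable for the ~250 MB extract but is unlikely to scale to the 40 GB one without some form of chunked processing.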
