I have a .txt file with this structure:
section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"},...etc...}]
section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"},...etc...}]
...
section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"},...etc...}]
I am trying to read it in R with these commands:
library(jsonlite)
data <- fromJSON("myfile.txt")
But I get this error:
Error in feed_push_parser(readBin(con, raw(), n), reset = TRUE) :
lexical error: invalid char in json text.
section2#[{"p": "0.99
(right here) ------^
How can I read the file, even if that means splitting it by section?
Remove the prefix and bind the flattened JSON arrays together into a data frame:
raw_dat <- readLines(textConnection('section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]
section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]
section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'))
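library(stringi)
library(purrr)    # also provides %>%; map_df() needs dplyr installed
library(jsonlite)
# Strip the leading "sectionN#" label, then parse each line and row-bind the results
stri_replace_first_regex(raw_dat, "^section[[:digit:]]+#", "") %>%
  map_df(fromJSON)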
##          p tag
## 1 0.999834  MA
## 2        1  MO
## 3   0.9995  NC
## 4        1  FL
## 5   0.9995  NC
## 6        1  FL
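If the section label matters, it can be kept as a column instead of being thrown away. This is a minimal sketch using base R plus jsonlite. It assumes every line has the form label#[JSON array] and that the label itself contains no "#". The names section, json, result and by_section are made up for illustration.

```r
library(jsonlite)

# Sample lines standing in for readLines("myfile.txt")
raw_dat <- c(
  'section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]',
  'section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]',
  'section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'
)

# Split each line at the first "#": the label before it, the JSON after it
hash_pos <- regexpr("#", raw_dat, fixed = TRUE)
section  <- substr(raw_dat, 1, hash_pos - 1)
json     <- substr(raw_dat, hash_pos + 1, nchar(raw_dat))

# Parse each line's JSON array and tag its rows with that line's label
parsed <- lapply(seq_along(raw_dat), function(i) {
  data.frame(section = section[i], fromJSON(json[i]), stringsAsFactors = FALSE)
})
result <- do.call(rbind, parsed)

# "p" arrives as a string in the JSON, so convert it explicitly
result$p <- as.numeric(result$p)

# One data frame per section, if the sections should be handled separately
by_section <- split(result, result$section)
```

Here by_section[["section2"]] would hold only the rows that came from lines labelled section2.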
Remove the section# prefix from each line. Each line is then a valid JSON array, so the file as a whole can be treated as a 2D array with a JSON object at each index. In zero-based notation, foo[0][0] is the first object of the first line and foo[m][n] is the last object of the last line, where m is the number of lines (section entries) minus 1 and n is the number of objects in that line minus 1.
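In R, this 2D layout could be sketched as a list of lists. R indexes from 1 and uses [[ ]] for list elements, so foo[0][0] becomes foo[[1]][[1]]. The regular expression below assumes the prefix is everything up to the first "#". The name foo just mirrors the notation above.

```r
library(jsonlite)

# Sample lines standing in for readLines("myfile.txt")
raw_dat <- c(
  'section1#[{"p": "0.999834", "tag": "MA"},{"p": "1", "tag": "MO"}]',
  'section1#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]',
  'section2#[{"p": "0.9995", "tag": "NC"},{"p": "1", "tag": "FL"}]'
)

# Drop everything up to and including the first "#"
json_lines <- sub("^[^#]*#", "", raw_dat)

# simplifyVector = FALSE keeps each line as a plain list of objects
# rather than collapsing it into a data frame
foo <- lapply(json_lines, fromJSON, simplifyVector = FALSE)

first_obj <- foo[[1]][[1]]                  # first object of the first line
last_line <- foo[[length(foo)]]
last_obj  <- last_line[[length(last_line)]] # last object of the last line
```

With the sample lines, first_obj$tag is the tag of the first object on the first line and last_obj$tag is the tag of the final object on the final line.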