I have a long JSON file like this: http://pastebin.com/gzhHEYGy
I would like to load it into a pandas DataFrame in order to play with it, so, following the documentation, I do the following:
import pandas as pd

df = pd.read_json('/user/file.json')
print df
I got this traceback:
  File "/Users/user/PycharmProjects/PAN-pruebas/json_2_dataframe.py", line 6, in <module>
    df = pd.read_json('/Users/user/Downloads/54db3923f033e1dd6a82222aa2604ab9.json')
  File "/usr/local/lib/python2.7/site-packages/pandas/io/json.py", line 198, in read_json
    date_unit).parse()
  File "/usr/local/lib/python2.7/site-packages/pandas/io/json.py", line 266, in parse
    self._parse_no_numpy()
  File "/usr/local/lib/python2.7/site-packages/pandas/io/json.py", line 483, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 203, in __init__
    mgr = self._init_dict(data, index, columns, dtype=dtype)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 327, in _init_dict
    dtype=dtype)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 4620, in _arrays_to_mgr
    index = extract_index(arrays)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/frame.py", line 4668, in extract_index
    raise ValueError('arrays must all be same length')
ValueError: arrays must all be same length
Then from a previous question I found that I need to do something like this:
d = dict( A = np.array([1,2]), B = np.array([1,2,3,4]) )
But I don't understand how to obtain the contents as NumPy arrays. How can I preserve the length of the arrays in a big file like this? Thanks in advance.
pd.read_json doesn't work here because the JSON file is not in the format it expects. Since we can easily load JSON as a dict, let's try it this way:
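A minimal sketch of that idea, assuming the file holds a JSON object whose values are lists of different lengths. That is an assumption, since the layout of the linked file is not shown here. The helper name ragged_dict_to_frame and the small inline sample are hypothetical stand-ins for the real data:

```python
import json

import pandas as pd


def ragged_dict_to_frame(data):
    """Build a DataFrame from a dict whose values are lists of unequal length.

    Hypothetical helper: wrapping each list in a Series lets pandas align the
    columns on the index and pad the shorter ones with NaN.
    """
    return pd.DataFrame({key: pd.Series(values) for key, values in data.items()})


# Stand-in for the parsed file; with a real file you would use
# json.load() on an open file handle instead. The layout is assumed.
raw = '{"A": [1, 2], "B": [1, 2, 3, 4]}'
data = json.loads(raw)

df = ragged_dict_to_frame(data)
print(df)
```

Because each column is built as its own Series, the shorter columns are filled with NaN instead of raising "arrays must all be same length". If the real file is nested more deeply, the dict would need to be flattened first.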
output: