How to load JSON into a pandas DataFrame?

Posted 2019-04-06 06:20

Question:

I am using a REST API to fetch JSON data as follows:

import pandas as pd
import requests

request = 'myrequest'          # placeholder for the actual request URL
data = requests.get(request)
json = data.json()             # parse the response body into Python objects
df = pd.DataFrame(json)

The resulting DataFrame looks like this:

                                               items
0  {u'access': u'all', u'count': 501, u'time': 2014}
1  {u'access': u'all', u'count': 381, u'time': 2015}

How can I transform this single column (that looks like a dictionary) into proper columns in Pandas?

EDIT

The raw JSON data looks like this:

{
  "items": [
    {
      "access": "all",
      "count": 200,
      "time": 2015
    },
    {
      "access": "all",
      "count": 14,
      "time": 2015
    }
  ]
}

Thanks!

Answer 1:

pd.read_json(json_str)

See the pandas documentation for read_json.

EDIT:

For a list of JSON strings, you can also just do:

import json
import pandas as pd

df = pd.DataFrame.from_records(map(json.loads, json_lst))
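
As a rough illustration of the from_records approach above, here is a minimal sketch. The contents of json_lst are made-up sample strings shaped like the items in the question, and the intermediate name records is introduced only for this example. A list comprehension is used in place of map so the parsed rows are a plain list.

```python
import json

import pandas as pd

# Hypothetical sample input: each element is one JSON object stored as a string.
json_lst = [
    '{"access": "all", "count": 200, "time": 2015}',
    '{"access": "all", "count": 14, "time": 2015}',
]

# Parse every string into a dict, then build one DataFrame row per dict.
records = [json.loads(s) for s in json_lst]
df = pd.DataFrame.from_records(records)

print(df)
```

Each key of the parsed dicts becomes a column, so the result should have the columns access, count and time.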


Answer 2:

It seems to me that JSON gets imported as a nested structure containing any combination of dicts and lists, while pandas expects a single dict whose values are iterables (one per column). You therefore have to do a little bit of conversion if the two do not match.

Assuming I interpret the structure of your JSON correctly (and I might not, since you are only printing the end product, not the JSON structure), it looks like it is a list of dictionaries. If that is the case, here is the solution:

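# Start each column with the value from the first record
data = {k: [v] for k, v in json[0].items()}
# Append the values from the remaining records
for jso in json[1:]:
    for k, v in jso.items():
        data[k].append(v)

df = pd.DataFrame(data)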

Edit:

Now that the raw values are provided: to get my code working, you just need to add the following line in front of it:

json = json["items"]

I think this should work, but it depends on how requests processes JSON. Give me a printout of the json object if it doesn't work.
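
To make the conversion concrete, here is a small self-contained sketch that uses only the standard library. It assumes the response body has the shape shown in the question's edit. The names raw, payload, records and columns are chosen for this example and are not from the original code, which calls the parsed object json.

```python
import json

# Assumed sample body, shaped like the raw JSON shown in the question.
raw = """
{
  "items": [
    {"access": "all", "count": 200, "time": 2015},
    {"access": "all", "count": 14, "time": 2015}
  ]
}
"""

payload = json.loads(raw)     # stands in for the parsed response
records = payload["items"]    # list of dicts, one dict per row

# Turn the list of row dicts into a dict of column lists.
columns = {key: [value] for key, value in records[0].items()}
for record in records[1:]:
    for key, value in record.items():
        columns[key].append(value)

print(columns)
# {'access': ['all', 'all'], 'count': [200, 14], 'time': [2015, 2015]}
```

Passing columns to pd.DataFrame then gives one column per key. pandas also accepts a list of dicts directly, so pd.DataFrame(records) may be a shorter route to the same table.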