Writing heirarchical JSON data to Excel xls from P

2019-04-15 07:39发布

问题:

I want to write some data from python to xlsx. I currently have it stored as JSON, but it doesn't matter what it is going out of Python. Here's what the JSON for a single article would look like:

{ 
   'Word Count': 50
   'Key Words': { 
                  ['Blah blah blah', 'Foo', ... ] }
   'Frequency': {
                  [9, 12, ... ] }
   'Proper Nouns': { 
                  ['UN', 'USA', ... ] }
   'Location': 'Mordor'
}

I checked out the XlsxWriter module but can't figure out how to translate hierarchical data that is not necessarily the same size (note the number of proper nouns between the two data "objects").

What I want the data to look like:

Any pointers?

回答1:

As your structures can be arbitrarily nested, I would suggest using recursion to achieve this:

from collections import OrderedDict
import xlsxwriter
import json

def json_to_excel(ws, data, row=0, col=0):
    if isinstance(data, list):
        row -= 1
        for value in data:
            row = json_to_excel(ws, value, row+1, col)
    elif isinstance(data, dict):
        max_row = row
        start_row = row
        for key, value in data.iteritems():
            row = start_row
            ws.write(row, col, key)
            row = json_to_excel(ws, value, row+1, col)
            max_row = max(max_row, row)
            col += 1
        row = max_row
    else:
        ws.write(row, col, data)

    return row

text = """
[
    {
        "Source ID": 123,
        "WordCount": 50,
        "Key Words": ["Blah blah blah", "Foo"],
        "Frequency": [9, 12, 1, 2, 3],
        "Proper Nouns": ["UN", "USA"],
        "Location": "Mordor"
    },
    {
        "Source ID": 124,
        "WordCount": 50,
        "Key Words": ["Blah blah blah", "Foo"],
        "Frequency": [9, 12, 1, 2, 3],
        "Proper Nouns": ["UN", "USA"],
        "Location": "Mordor"
    }
]
"""

data = json.loads(text, object_pairs_hook=OrderedDict)
wb = xlsxwriter.Workbook("output.xlsx")
ws = wb.add_worksheet()
json_to_excel(ws, data)
wb.close()  

This would give you an output file looking like: