I want to write some data from python to xlsx. I currently have it stored as JSON, but it doesn't matter what it is going out of Python. Here's what the JSON for a single article would look like:
{
'Word Count': 50
'Key Words': {
['Blah blah blah', 'Foo', ... ] }
'Frequency': {
[9, 12, ... ] }
'Proper Nouns': {
['UN', 'USA', ... ] }
'Location': 'Mordor'
}
I checked out the XlsxWriter module but can't figure out how to translate hierarchical data that is not necessarily the same size (note the number of proper nouns between the two data "objects").
What I want the data to look like:
Any pointers?
As your structures can be arbitrarily nested, I would suggest using recursion to achieve this:
from collections import OrderedDict
import xlsxwriter
import json
def json_to_excel(ws, data, row=0, col=0):
if isinstance(data, list):
row -= 1
for value in data:
row = json_to_excel(ws, value, row+1, col)
elif isinstance(data, dict):
max_row = row
start_row = row
for key, value in data.iteritems():
row = start_row
ws.write(row, col, key)
row = json_to_excel(ws, value, row+1, col)
max_row = max(max_row, row)
col += 1
row = max_row
else:
ws.write(row, col, data)
return row
text = """
[
{
"Source ID": 123,
"WordCount": 50,
"Key Words": ["Blah blah blah", "Foo"],
"Frequency": [9, 12, 1, 2, 3],
"Proper Nouns": ["UN", "USA"],
"Location": "Mordor"
},
{
"Source ID": 124,
"WordCount": 50,
"Key Words": ["Blah blah blah", "Foo"],
"Frequency": [9, 12, 1, 2, 3],
"Proper Nouns": ["UN", "USA"],
"Location": "Mordor"
}
]
"""
data = json.loads(text, object_pairs_hook=OrderedDict)
wb = xlsxwriter.Workbook("output.xlsx")
ws = wb.add_worksheet()
json_to_excel(ws, data)
wb.close()
This would give you an output file looking like: