JSON: save one dict per line

Posted 2020-07-06 04:44

How do I save a list of Python dictionaries to a file so that each dict is saved on one line? I know I can use json.dump to save the list of dictionaries, but I can only save the list in compact form (the full list on one line) or indented, where every key of every dictionary gets its own line.

EDIT:

I want my final json file to look like this:

[{key1:value,key2:value},
{key1:value,key2:value},
...
{key1:value,key2:value}]

Tags: python json
4 answers
欢心
Reply #2 · 2020-07-06 05:30

The file.json example in your question is not valid JSON. Assuming it is only meant to convey the layout, you could try extending json.JSONEncoder, but if your dictionaries don't contain nested structures, a quick and dirty approach is to construct the file manually, i.e.

import json

your_data = [  # let's define some test data
    {"key1.0": "value", "key2.0": "value"},
    {"key1.1": "value", "key2.1": "value"},
    {"key1.2": "value", "key2.2": "value"},
    {"key1.3": "value", "key2.3": "value"},
]

with open("file.json", "w") as f:  # open our file for writing
    f.write("[")  # begin a JSON array
    for index, element in enumerate(your_data):  # loop through your elements one by one
        if index:  # every element except the first is preceded by a separator,
            f.write(",\n")  # so there is no trailing comma to seek back and remove
        json.dump(element, f)  # JSON encode each element and write it to the file
    f.write("]")  # end the JSON array

Which will produce:

[{"key1.0": "value", "key2.0": "value"},
{"key1.1": "value", "key2.1": "value"},
{"key1.2": "value", "key2.2": "value"},
{"key1.3": "value", "key2.3": "value"}]
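
The json.JSONEncoder route mentioned above could look roughly like the sketch below. OneItemPerLineEncoder is a hypothetical name, not something from the json module, and the sketch assumes that only the top-level list should be split across lines; nested lists stay compact. It overrides iterencode, which both json.dump and json.dumps go through.

```python
import json


class OneItemPerLineEncoder(json.JSONEncoder):
    """Hypothetical encoder: puts each top-level list element on its own line."""

    def iterencode(self, obj, _one_shot=False):
        if not isinstance(obj, list):
            # Anything that is not a list is encoded the default way.
            yield from super().iterencode(obj, _one_shot)
            return
        yield "["
        for index, item in enumerate(obj):
            if index:
                yield ",\n"  # the separator goes between elements only
            # Each element is encoded compactly by the stock encoder.
            yield from super().iterencode(item, _one_shot)
        yield "]"


sample = [{"key1": "value", "key2": [1, 2]}, {"key1": "value"}]
text = json.dumps(sample, cls=OneItemPerLineEncoder)
print(text)
```

Passing cls=OneItemPerLineEncoder to json.dump should write the same text to a file, since json.dump consumes the chunks that iterencode yields.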
疯言疯语
Reply #3 · 2020-07-06 05:40

I agree with the other answer: the best you can do is to json.dump each dict individually and write the commas and newlines manually. Here is how I would do that:

import json

data = [
    {"key01":"value","key02":"value"},
    {"key11":"value","key12":"value"},
    {"key21":"value","key22":"value"}
]

with open('file.json', 'w') as fp:
    fp.write(
        '[' +
        ',\n'.join(json.dumps(i) for i in data) +
        ']\n')

Result (on Python 3.7+, where dicts keep their insertion order):

[{"key01": "value", "key02": "value"},
{"key11": "value", "key12": "value"},
{"key21": "value", "key22": "value"}]
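
To convince yourself that a file written this way is still ordinary JSON, a small round-trip check along these lines may help. dump_one_per_line is a hypothetical helper name wrapping the same join approach, and the temporary file path is only for illustration.

```python
import json
import os
import tempfile


def dump_one_per_line(items, fp):
    """Hypothetical helper: write `items` as a JSON array, one element per line."""
    fp.write("[" + ",\n".join(json.dumps(item) for item in items) + "]\n")


records = [{"key01": "value"}, {"key11": "value"}, {"key21": "value"}]

# Write to a temporary file, then read it back both as text and as JSON.
path = os.path.join(tempfile.mkdtemp(), "file.json")
with open(path, "w") as fp:
    dump_one_per_line(records, fp)

with open(path) as fp:
    lines = fp.read().splitlines()
with open(path) as fp:
    loaded = json.load(fp)

print(len(lines) == len(records))  # one line per dict
print(loaded == records)           # parses back to the original data
```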
Explosion°爆炸
Reply #4 · 2020-07-06 05:41

For fun I adapted my answer to another somewhat related question to make it do what you want. Note that currently it only changes the formatting of a dict if it's in a list.

import _ctypes
import json
import re

class OneDictPerLine(object):
    def __init__(self, value):
        self.value = value
    def __repr__(self):
        if not isinstance(self.value, list):
            return json.dumps(self.value)
        else:  # Put each list element on its own line, sorting the keys of any dicts.
            reps = (json.dumps(v, sort_keys=True) for v in self.value)
            return '[' + ',\n'.join(reps) + ']'


def di(obj_id):
    """ Reverse of id() function. """
    # from https://stackoverflow.com/a/15012814/355230
    return _ctypes.PyObj_FromPtr(obj_id)


class MyEncoder(json.JSONEncoder):
    FORMAT_SPEC = "@@{}@@"
    regex = re.compile(FORMAT_SPEC.format(r"(\d+)"))

    def default(self, obj):
        return (self.FORMAT_SPEC.format(id(obj)) if isinstance(obj, OneDictPerLine)
                else super(MyEncoder, self).default(obj))

    def encode(self, obj):
        format_spec = self.FORMAT_SPEC  # Local var to expedite access.
        json_repr = super(MyEncoder, self).encode(obj)  # Default JSON repr.

        # Replace any marked-up object ids in the JSON repr with the value
        # returned from the repr() of the corresponding Python object.
        for match in self.regex.finditer(json_repr):
            obj_id = int(match.group(1))  # Avoid shadowing the id() builtin.
            # Replace marked-up id with actual Python object repr().
            json_repr = json_repr.replace(
                       '"{}"'.format(format_spec.format(obj_id)), repr(di(obj_id)))

        return json_repr

Sample usage:

data = [
    {"key01":"value","key02":"value"},
    {"key11":"value","key12":"value"},
    {"key21":"value","key22":"value"},
    {'key{:02d}'.format(k): "value" for k in range(100)}
]

print(json.dumps(OneDictPerLine(data), cls=MyEncoder))

Output (last line abbreviated):

[{"key01": "value", "key02": "value"},
{"key11": "value", "key12": "value"},
{"key21": "value", "key22": "value"},
{"key00": "value", "key01": "value", "key02": "value", ..., "key99": "value"}]
SAY GOODBYE
Reply #5 · 2020-07-06 05:45

This may not generate exactly what the OP wanted, but to pretty-print JSON in general, you can pass an indent argument:

with json_path.open("w") as f:  # json_path is assumed to be a pathlib.Path
    json.dump(data, f, indent=2)

Example output (excerpt):

  "k07-689z-01": {
    "image_path": "v_1280/k07-689z-01.tif",
    "gt": "Instead he said: Will that be better? She nodded: And she might add: And don't forget to",
    "components": [
      "k07-689z-01a.tif",
      "k07-689z-01b.tif"
    ]
  },

This converts a one-line dictionary into one where each key and sub-element has its own line. You can also pass the separators argument to change the item and key separators, see https://docs.python.org/3.7/library/json.html#basic-usage.
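
As a small illustration of the two arguments, here is a sketch with made-up sample data (the dictionary below is not from the question):

```python
import json

data = {"a": [1, 2], "b": "x"}

# indent=2 puts every key and every list element on its own line.
pretty = json.dumps(data, indent=2)

# separators is a pair (item_separator, key_separator); dropping the spaces
# after "," and ":" gives the most compact form.
compact = json.dumps(data, separators=(",", ":"))

print(pretty)
print(compact)  # {"a":[1,2],"b":"x"}
```

Note that neither argument on its own gives the one-dict-per-line layout from the question; they only control indentation and the separator strings.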