ValueError: Extra Data error when importing json f

Posted 2019-07-18 07:14

I'm trying to build a Python script that imports JSON files into MongoDB. For larger JSON files, this part of my script keeps ending up in the except ValueError branch. I think it has something to do with parsing the JSON file line by line, because very small JSON files seem to work.

import json

import pymongo
from pymongo import MongoClient


def read(jsonFiles):
    # `args` (db, collection, verbose, max) is defined elsewhere in the script, not shown here
    client = MongoClient('mongodb://localhost:27017/')
    db = client[args.db]

    counter = 0
    for jsonFile in jsonFiles:
        with open(jsonFile, 'r') as f:
            for line in f:
                # load valid lines (should probably use rstrip)
                if len(line) < 10:
                    continue
                try:
                    db[args.collection].insert(json.loads(line))
                    counter += 1
                except pymongo.errors.DuplicateKeyError as dke:
                    if args.verbose:
                        print "Duplicate Key Error: ", dke
                except ValueError as e:
                    if args.verbose:
                        print "Value Error: ", e

                # friendly log message
                if 0 == counter % 100 and 0 != counter and args.verbose:
                    print "loaded line:", counter
                if counter >= args.max:
                    break

I'm getting the following error message:

Value Error:  Extra data: line 1 column 10 - line 2 column 1 (char 9 - 20)
Value Error:  Extra data: line 1 column 8 - line 2 column 1 (char 7 - 18)

2 Answers
看我几分像从前
Reply #2 · 2019-07-18 07:21

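Figured it out. Splitting the file into lines was the mistake: each JSON document spans several lines, so a single line is not valid JSON on its own. Reading the whole file and parsing it once works. Here's what the final code looks like.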

counter = 0
for jsonFile in jsonFiles:
    with open(jsonFile) as f:
        data = f.read()
        jsondata = json.loads(data)
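        try:
            db[args.collection].insert(jsondata)
            counter += 1
        except pymongo.errors.DuplicateKeyError as dke:
            # same handler as in the question
            if args.verbose:
                print "Duplicate Key Error: ", dke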
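
For anyone who wants to try the idea without a MongoDB instance, here is a rough sketch using only the standard library. The helper name load_documents and the fallback to one document per line are my own assumptions for illustration, not part of the script above.

```python
import json


def load_documents(path):
    """Return the JSON documents stored in the file at `path` as a list.

    Hypothetical helper: parses the whole file as one JSON document first,
    and only falls back to one-document-per-line parsing if that fails.
    """
    with open(path) as f:
        text = f.read()
    try:
        data = json.loads(text)
    except ValueError:
        # Whole-file parse failed; assume one JSON document per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # A top-level array is treated as a list of documents.
    return data if isinstance(data, list) else [data]
```

Each item of the returned list could then be inserted in a loop like the one above. Note that recent PyMongo releases no longer have insert; insert_one or insert_many would be the calls to use there.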
爷、活的狠高调
Reply #3 · 2019-07-18 07:28

Look at this example:

import json

s = """{ "data": { "one":1 } },{ "1": { "two":2 } }"""
json.loads(s)  # loads() parses a string; load() expects a file object

It will produce the "Extra data" error like in your json file:

ValueError: Extra data: line 1 column 24 - line 1 column 45 (char 23 - 44)

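This is because the string is not a single valid JSON document. It contains two independent objects ("dict"s in Python), separated by a comma. The parser reads the first object successfully and then complains about everything after it. Perhaps this helps you find the error in your JSON file.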
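
If the input really does contain several objects back to back, one possible way to read them is json.JSONDecoder.raw_decode, which parses one value and returns the position where it stopped. This is only a sketch: the function name iter_json_values is made up, and skipping stray commas between objects is my assumption about what such a file looks like.

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found in `text`, one after another.

    Hypothetical helper: whitespace and commas between values are skipped.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading separators, so do it by hand.
        while pos < end and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= end:
            break
        # Returns the parsed value and the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value


s = """{ "data": { "one":1 } },{ "1": { "two":2 } }"""
docs = list(iter_json_values(s))
```

Here docs ends up holding the two objects as separate dicts instead of raising "Extra data".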

You can find more information in this post.
