Importing json from file into mongodb using mongoimport

Posted 2019-01-21 21:05

Question:

I have my json_file.json like this:

[
{
    "project": "project_1",
    "coord1": 2,
    "coord2": 10,
    "status": "yes",
    "priority": 7
},
{
    "project": "project_2",
    "coord1": 2,
    "coord2": 10,
    "status": "yes",
    "priority": 7
},
{
    "project": "project_3",
    "coord1": 2,
    "coord2": 10,
    "status": "yes",
    "priority": 7
}
]

When I run the following command to import this into mongodb:

mongoimport --db my_db --collection my_collection --file json_file.json 

I get the following error:

Failed: error unmarshaling bytes on document #0: JSON decoder out of sync - data changing underfoot?

If I add the --jsonArray flag to the command, the import succeeds, but it reports:

imported 3 documents

instead of importing one document in the JSON format shown in the original file.

How can I import json into mongodb with the original format in the file shown above?

Answer 1:

Perhaps the following post from the mLab blog could help you gain insight into how arrays work in MongoDB:

https://blog.mlab.com/2013/04/thinking-about-arrays-in-mongodb/

I would frame your import differently, and either:

a) import the three different objects separately into the collection as you say, using the --jsonArray flag; or

b) encapsulate the complete array within a single object, for example in this way:

{
"mydata": 
    [
    {
          "project": "project_1",
          ...
          "priority": 7
    }
    ]
}
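
Option b) can be scripted rather than edited by hand. The following is a minimal sketch, assuming the input file holds a top-level JSON array; the helper names wrap_array and wrap_file are hypothetical, and the key "mydata" is taken from the example above.

```python
import json


def wrap_array(docs, key="mydata"):
    """Return a single document that holds the whole array under `key`."""
    if not isinstance(docs, list):
        raise ValueError("expected a top-level JSON array")
    return {key: docs}


def wrap_file(src_path, dst_path, key="mydata"):
    """Read a JSON array from src_path and write the wrapped object to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        docs = json.load(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        # Written as one compact object, i.e. a single document for mongoimport.
        json.dump(wrap_array(docs, key), dst)
```

After something like wrap_file("json_file.json", "wrapped.json") (the output filename is an assumption), running mongoimport on wrapped.json without --jsonArray should import one document whose "mydata" field contains the three projects.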

HTH.



Answer 2:

The mongoimport tool has an option:

--jsonArray    treat input source as a JSON array

Alternatively, it is possible to import from a file containing the same data format as the result of the db.collection.find() command. Here is an example from the university.mongodb.com courseware, showing some content from grades.json:

{ "_id" : { "$oid" : "50906d7fa3c412bb040eb577" }, "student_id" : 0, "type" : "exam", "score" : 54.6535436362647 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb578" }, "student_id" : 0, "type" : "quiz", "score" : 31.95004496742112 }
{ "_id" : { "$oid" : "50906d7fa3c412bb040eb579" }, "student_id" : 0, "type" : "homework", "score" : 14.8504576811645 }

As you can see, no array is used, and there are no comma delimiters between documents either.

I recently discovered that this complies with the JSON Lines text format, like the one used by the apache.spark.sql.DataFrameReader.json() method.
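
To make the difference concrete, here is a small sketch that turns a list of documents into that one-document-per-line layout. The function name to_json_lines is hypothetical, and the sample documents are shortened versions of the ones in the question.

```python
import json


def to_json_lines(docs):
    """Serialize each document on its own line, with no enclosing array or commas."""
    return "".join(json.dumps(doc) + "\n" for doc in docs)


docs = [
    {"project": "project_1", "priority": 7},
    {"project": "project_2", "priority": 7},
]
print(to_json_lines(docs), end="")
```

A file written this way should be importable without the --jsonArray flag, with each line becoming its own document.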



Answer 3:

I faced the opposite problem today; my conclusion would be:

If you wish to insert an array of JSON objects at once, where each array entry is treated as a separate database entry, you have two syntax options:

  1. An array of objects with valid comma positions; the --jsonArray flag is mandatory

    [
      {obj1},
      {obj2},
      {obj3}
    ]
    
  2. A file that is not valid JSON as a whole (i.e. with no , between JSON object instances), imported without the --jsonArray flag

    {obj1}
    {obj2}
    {obj3}
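
The choice between these two layouts can be made automatically. This is only a sketch: it assumes that a file starting with [ is an array and anything else is one object after another, and the helper names detect_layout and build_mongoimport_args are hypothetical. It only builds the argument list and does not run mongoimport.

```python
def detect_layout(text):
    """Return 'array' if the first non-blank character is '[', else 'lines'."""
    return "array" if text.lstrip().startswith("[") else "lines"


def build_mongoimport_args(path, text, db="my_db", collection="my_collection"):
    """Build the mongoimport argument list, adding --jsonArray only for arrays."""
    args = ["mongoimport", "--db", db, "--collection", collection, "--file", path]
    if detect_layout(text) == "array":
        args.append("--jsonArray")
    return args
```

The resulting list could then be passed to something like subprocess.run, assuming mongoimport is on the PATH.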
    

If you wish to insert only an array (i.e. an array as a top-level citizen of your database), I think it's not possible and not valid, because MongoDB by definition supports documents as top-level objects, which are mapped to JSON objects afterwards. In other words, you must wrap your array in a JSON object, as ALAN WARD pointed out.



Tags: mongodb