MongoDB: mongoimport loses connection when importing large files

Published 2019-01-13 18:02

Question:

I am having trouble importing a JSON file into a local MongoDB instance. The JSON was generated using mongoexport and looks like this (no arrays, no deep nesting):

{"_created":{"$date":"2015-10-20T12:46:25.000Z"},"_etag":"7fab35685eea8d8097656092961d3a9cfe46ffbc","_id":{"$oid":"562637a14e0c9836e0821a5e"},"_updated":{"$date":"2015-10-20T12:46:25.000Z"},"body":"base64 encoded string","sender":"mail@mail.com","type":"answer"}
{"_created":{"$date":"2015-10-20T12:46:25.000Z"},"_etag":"7fab35685eea8d8097656092961d3a9cfe46ffbc","_id":{"$oid":"562637a14e0c9836e0821a5e"},"_updated":{"$date":"2015-10-20T12:46:25.000Z"},"body":"base64 encoded string","sender":"mail@mail.com","type":"answer"}

If I import a 9MB file with ~300 rows, there is no problem:

[stekhn latest]$ mongoimport -d mietscraping -c mails mails-small.json 
2015-11-02T10:03:11.353+0100    connected to: localhost
2015-11-02T10:03:11.372+0100    imported 240 documents

But if I try to import a 32MB file with ~1300 rows, the import fails:

[stekhn latest]$ mongoimport -d mietscraping -c mails mails.json 
2015-11-02T10:05:25.228+0100    connected to: localhost
2015-11-02T10:05:25.735+0100    error inserting documents: lost connection to server
2015-11-02T10:05:25.735+0100    Failed: lost connection to server
2015-11-02T10:05:25.735+0100    imported 0 documents

Here is the log:

2015-11-02T11:53:04.146+0100 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:45237 #21 (6 connections now open)
2015-11-02T11:53:04.532+0100 I -        [conn21] Assertion: 10334:BSONObj size: 23592351 (0x167FD9F) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "mails"
2015-11-02T11:53:04.536+0100 I NETWORK  [conn21] AssertionException handling request, closing client connection: 10334 BSONObj size: 23592351 (0x167FD9F) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "mails"

I've heard about the 16MB limit for BSON documents before, but since no row in my JSON file is bigger than 16MB, this shouldn't be a problem, right? When I do the exact same (32MB) import on my local computer, everything works fine.
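
(For what it's worth, the largest single line in the file can be checked with a quick awk sketch; length counts characters, which equals bytes for plain ASCII content:)

awk '{ if (length($0) > max) max = length($0) } END { print max }' mails.json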

Any ideas what could cause this weird behaviour?

Answer 1:

I guess the problem is performance-related; either way, here is how you can solve it:

You can use the mongoimport option -j. If 4 doesn't work, try incrementing it, e.g. 4, 8, 16, depending on the number of cores in your CPU:

mongoimport --help

-j, --numInsertionWorkers= number of insert operations to run concurrently (defaults to 1)


mongoimport -d mietscraping -c mails -j 4 < mails.json


Or you can split the file and import each part, as sketched below.
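
Since the export is newline-delimited (one document per line), a plain split by line count works. A minimal sketch, assuming GNU coreutils split and the database/collection names from above:

# split into chunks of 500 documents (one JSON document per line)
split -l 500 mails.json mails-part-

# import each chunk into the same collection
for f in mails-part-*; do
  mongoimport -d mietscraping -c mails "$f"
done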

I hope this helps you.


Looking a little more, this is a bug in some versions: https://jira.mongodb.org/browse/TOOLS-939. Here is another solution: you can change the batchSize, which defaults to 10000. mongoimport sends each batch of documents to the server as a single insert command, and in the affected versions that command could grow beyond the 16MB BSON limit, which matches the 23592351-byte (~22.5MB) BSONObj in your log. Reduce the value and test:

mongoimport -d mietscraping -c mails < mails.json --batchSize 1



Tags: mongodb