We're running into a situation where some documents that exist in our MongoDB collection can't be queried without triggering an assertion:
db.collection.find({"_id":ObjectId("50d393be70a580280b117ea5")})
Wed Jan 2 12:30:44 Assertion: 10320:BSONElement: bad type 65
0x6073f1 0x5d1aa9 0x4b0d98 0x5c17a6 0x6b3f35 0x6b6a2c 0x69be0a 0x6aa13f 0x668e46 0x668ec2 0x66a2ce 0x5cbcc4 0x4a4a14 0x4a67e6 0x7f2223434c4d 0x49f669
mongo(_ZN5mongo15printStackTraceERSo+0x21) [0x6073f1]
mongo(_ZN5mongo11msgassertedEiPKc+0x99) [0x5d1aa9]
mongo(_ZNK5mongo11BSONElement4sizeEv+0x1d8) [0x4b0d98]
mongo(_ZN5mongo16resolveBSONFieldEP9JSContextP8JSObjectljPS3_+0x146) [0x5c17a6]
mongo(js_LookupPropertyWithFlags+0x3f5) [0x6b3f35]
mongo(js_GetProperty+0x7c) [0x6b6a2c]
mongo(js_Interpret+0x10ea) [0x69be0a]
mongo(js_Execute+0x36f) [0x6aa13f]
mongo(JS_EvaluateUCScriptForPrincipals+0x66) [0x668e46]
mongo(JS_EvaluateUCScript+0x22) [0x668ec2]
mongo(JS_EvaluateScript+0x6e) [0x66a2ce]
mongo(_ZN5mongo7SMScope4execERKNS_10StringDataERKSsbbbi+0x144) [0x5cbcc4]
mongo(_Z5_mainiPPc+0x26c4) [0x4a4a14]
mongo(main+0x26) [0x4a67e6]
/lib/libc.so.6(__libc_start_main+0xfd) [0x7f2223434c4d]
mongo(__gxx_personality_v0+0x2a1) [0x49f669]
I can run mongodump on that particular record, but when I then convert the BSON to JSON with bsondump I get:
$ bsondump server1_collection.bson > server1_collection.json
terminate called after throwing an instance of 'std::length_error'
what(): basic_string::_S_create
Aborted
It looks like there might be an issue with UTF-8 characters and/or base64-encoded strings causing invalid BSON to be stored: https://jira.mongodb.org/browse/SERVER-7769
It sounds like, going forward, I can start mongod with --objcheck to ensure that no invalid data can be inserted (though I'm unsure of the performance penalty).
What I'm not sure about is how to easily scan through my existing data and remove these invalid records.
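One idea for at least locating the bad records, offered as a rough sketch rather than a tested procedure: since mongodump still produces a .bson file, a small script could walk that file document by document and flag any document whose element types or lengths don't match the BSON layout (for example the "bad type 65" above). The function names below (`scan_dump`, `InvalidBSON`, and the `_`-prefixed helpers) are hypothetical, the script uses only the Python standard library, and it checks structure only; it does not validate UTF-8 and is not a reimplementation of what --objcheck does.

```python
import struct

# Fixed-size BSON value types mapped to their payload size in bytes.
FIXED = {0x01: 8, 0x06: 0, 0x07: 12, 0x08: 1, 0x09: 8, 0x0A: 0,
         0x10: 4, 0x11: 8, 0x12: 8, 0x13: 16, 0x7F: 0, 0xFF: 0}


class InvalidBSON(ValueError):
    """Raised when a document does not follow the BSON layout."""


def _int32(buf, pos, end):
    if pos + 4 > end:
        raise InvalidBSON("truncated int32 at offset %d" % pos)
    return struct.unpack_from("<i", buf, pos)[0]


def _cstring(buf, pos, end):
    # Return the offset just past the NUL terminator.
    nul = buf.find(b"\x00", pos, end)
    if nul < 0:
        raise InvalidBSON("unterminated cstring at offset %d" % pos)
    return nul + 1


def _string(buf, pos, end):
    # Length-prefixed string; the length includes the trailing NUL.
    n = _int32(buf, pos, end)
    stop = pos + 4 + n
    if n < 1 or stop > end or buf[stop - 1] != 0:
        raise InvalidBSON("bad string length %d at offset %d" % (n, pos))
    return stop


def _document(buf, pos, end):
    size = _int32(buf, pos, end)
    stop = pos + size
    if size < 5 or stop > end or buf[stop - 1] != 0:
        raise InvalidBSON("bad document length %d at offset %d" % (size, pos))
    pos += 4
    while pos < stop - 1:
        etype = buf[pos]
        pos = _cstring(buf, pos + 1, stop - 1)      # element name
        pos = _value(buf, etype, pos, stop - 1)     # element value
    return stop


def _value(buf, etype, pos, end):
    if etype in FIXED:
        stop = pos + FIXED[etype]
    elif etype in (0x02, 0x0D, 0x0E):               # string, code, symbol
        stop = _string(buf, pos, end)
    elif etype in (0x03, 0x04):                     # embedded doc, array
        stop = _document(buf, pos, end)
    elif etype == 0x05:                             # binary: len, subtype, data
        n = _int32(buf, pos, end)
        if n < 0:
            raise InvalidBSON("bad binary length %d at offset %d" % (n, pos))
        stop = pos + 4 + 1 + n
    elif etype == 0x0B:                             # regex: two cstrings
        stop = _cstring(buf, _cstring(buf, pos, end), end)
    elif etype == 0x0C:                             # DBPointer
        stop = _string(buf, pos, end) + 12
    elif etype == 0x0F:                             # code with scope
        total = _int32(buf, pos, end)
        stop = _document(buf, _string(buf, pos + 4, end), end)
        if stop != pos + total:
            raise InvalidBSON("bad code_w_scope length at offset %d" % pos)
    else:
        raise InvalidBSON("bad type %d at offset %d" % (etype, pos))
    if stop > end:
        raise InvalidBSON("value overruns document at offset %d" % pos)
    return stop


def scan_dump(data):
    """Return (index, byte offset, message) for each invalid document
    in the raw bytes of a mongodump .bson file."""
    bad, pos, index = [], 0, 0
    while pos < len(data):
        size = struct.unpack_from("<i", data, pos)[0] if pos + 4 <= len(data) else 0
        if size < 5 or pos + size > len(data):
            # Without a usable length prefix the next document can't be located.
            bad.append((index, pos, "unusable length prefix; stopping"))
            break
        try:
            _document(data, pos, pos + size)
        except InvalidBSON as exc:
            bad.append((index, pos, str(exc)))
        pos += size
        index += 1
    return bad
```

It could be run against the dump with something like `scan_dump(open("server1_collection.bson", "rb").read())`. It only reports where the damage is in the dump file; mapping each flagged document back to its _id and deleting it from the live collection would still be a separate step.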