-->

MongoDB indexes on subdocuments not being used bas

2019-05-18 08:27发布

问题:

I'm using Meteor with MongoDB with a collection of documents like this:

{a: 'a1',
 b: 'b1',
 c: {
     d: 'd1',
     e: 'e1'
    }
}

I initially created an index like this: collection._ensureIndex({'c.d': 1});

And ran queries like this: collection.find({c: {d: 'd1'}). These ran extremely slowly and when I debugged with explain(), I realized they were not using the index.

OTOH, if I ran a query like this: collection.find({'c.d': 'd1'}), then Mongo would use the index.

I've now changed the index to index the entire subdocument i.e. collection._ensureIndex({c: 1}), and the first query now hits the index.

My question is, is this a bug or a feature? My impression was that in JSON, the two notations are equivalent, and quite frankly, I'd expect a database to be smart enough to figure out that the two query terms are referring to the same field and use the appropriate index.

My concern with the workaround of indexing the entire subdocument is that eventually, that subdocument may contain more fields which I don't need indexed, and it seems suboptimal to waste time and RAM holding index terms I don't need.

If this isn't a bug, is there a way to get Mongo to recognize the nested object syntax and use the index properly?

回答1:

I think I figured it out after checking the mongo documents. Basically, there is a difference in semantics between those two. Essentially, in the first form of the query {c: {d: d1}}, mongo assumes that you're specifying the entire subdocument. So if you have a subdocument {c: {d: d1, e: e1}}, it won't match.

OTOH, the second form of the query {'c.d': d1} implies you're only specifying a match on one field within the subdocument. This would match even if the subdocument has other fields or entire sub-subdocuments.

This difference extends to the index. _ensureIndex({c: 1}) and _ensureIndex({'c.d': 1}) are two different indexes, and even though the first one indexes the entire subdocument, it won't be used if you query on an individual field with the c.d notation.