sparse indexes and null values in mongo

2020-01-31 04:34发布

问题:

I'm not sure I understand sparse indexes correctly.

I have a sparse unique index on fbId

{
    "ns" : "mydb.users",
    "key" : {
        "fbId" : 1
    },
    "name" : "fbId_1",
    "unique" : true,
    "sparse" : true,
    "background" : false,
    "v" : 0
}

And I was expecting that would allow me to insert records with null as the fbId, but that throws a duplicate key exception. It only allows me to insert if the fbId property is removed completely.

Isn't a sparse index supposed to deal with that?

回答1:

Sparse indexes do not contain documents that miss indexed field. However, if field exists and has value of null, it will still be indexed. So, if absense of the field and its equality to null look the same for your application and you want to maintain uniqueness of fbId, just don't insert it until you have a value for it.

You need sparse indexes when you have a large number of documents, but only a small portion of them contains some field, and you want to be able to quickly find documents by that field. Creating a normal index would be too expensive, you would just waste precious RAM on indexing documents you're not interested in.



回答2:

To ensure maximum performance of the indexes, we may want to omit from indexing those documents NOT containing the field on which you are performing an index. To do this MongoDB has the sparse property that works as follows:

db.addresses.ensureIndex( { "secondAddress": 1 }, { sparse: true } );

This index will omit all the documents not containing the secondAddress field and when performing a query, those document will never be scanned.

Let me share this article about basic indexes and some of their properties:

Geospatial, Text, Hash indexes and unique and sparse properties: http://mongodbspain.com/en/2014/02/03/mongodb-indexes-part-2-geospatial-2d-2dsphere/



回答3:


{a:1, b:5, c:2}
{a:8, b:15, c:7}
{a:4, b:7}
{a:3, b:10}

Let's assume that we wish to create an index on the above documents. Creating index on a & b will not be a problem. But what if we need to create an index on c. The unique constraint will not work for c keys because null value is duplicated for 2 documents. The solution in this case is to use sparse option. This option tells the database to not include the documents which misses the key. The command in concern is db.collectionName.createIndex({thing:1}, {unique:true, sparse:true}). The sparse index lets us use less space as well.

Notice that even if we have a sparse index, the database performs all documents scan especially when doing sort. This can be seen in the winning plan section of explain's result.