Is there a way to force mongodb to store certain i

Posted 2020-06-16 10:08

Question:

I have a collection with a relatively big index (smaller than the available RAM). Judging by the performance of find on this collection and the amount of free RAM reported by htop, it seems that mongo is not keeping the full index in RAM. Is there a way to force mongo to keep this particular index in RAM?

Example query:

> db.barrels.find({"tags":{"$all": ["avi"]}}).explain()
{
        "cursor" : "BtreeCursor tags_1",
        "nscanned" : 300393,
        "nscannedObjects" : 300393,
        "n" : 300393,
        "millis" : 55299,
        "indexBounds" : {
                "tags" : [
                        [
                                "avi",
                                "avi"
                        ]
                ]
        }
}

Not all objects are tagged with the "avi" tag:

> db.barrels.find().explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 823299,
        "nscannedObjects" : 823299,
        "n" : 823299,
        "millis" : 46270,
        "indexBounds" : {

        }
}

Without "$all":

> db.barrels.find({"tags": ["avi"]}).explain()
{
        "cursor" : "BtreeCursor tags_1 multi",
        "nscanned" : 300393,
        "nscannedObjects" : 300393,
        "n" : 0,
        "millis" : 43440,
        "indexBounds" : {
                "tags" : [
                        [
                                "avi",
                                "avi"
                        ],
                        [
                                [
                                        "avi"
                                ],
                                [
                                        "avi"
                                ]
                        ]
                ]
        }
}

The same happens when I search for two or more tags (it scans every item as if there were no index):

> db.barrels.find({"tags":{"$all": ["avi","mp3"]}}).explain()
{
        "cursor" : "BtreeCursor tags_1",
        "nscanned" : 300393,
        "nscannedObjects" : 300393,
        "n" : 6427,
        "millis" : 53774,
        "indexBounds" : {
                "tags" : [
                        [
                                "avi",
                                "avi"
                        ]
                ]
        }
}

Answer 1:

No. MongoDB allows the system to manage what is stored in RAM.

With that said, you should be able to keep the index in RAM by periodically running queries against it (check out query hinting), so that its pages are touched often enough to stay resident.

Useful References:

  • Checking Server Memory Usage

  • Indexing Advice and FAQ

Additionally, Kristina Chodorow provides this excellent answer regarding the relationship between MongoDB indexes and RAM.


UPDATE:

After the update providing the .explain() output, I see the following:

  • The query is hitting the index.
  • nscanned is the number of items (docs or index entries) examined.
  • nscannedObjects is the number of docs scanned.
  • n is the number of docs that match the specified criteria.
  • Your dataset is 300393 entries, which is both the total number of items in the index and the number of matching results.

I may be reading this wrong, but what I'm reading is that all of the items in your collection are valid results. Without knowing your data, it would seem that every item contains the tag "avi". The other thing this means is that the index is almost useless; indexes provide the most value when they narrow the result set as much as possible.

From MongoDB's "Indexing Advice and FAQ" page:

Understanding explain's output. There are three main fields to look for when examining the explain command's output:

  • cursor: the value for cursor can be either BasicCursor or BtreeCursor. The second of these indicates that the given query is using an index.
  • nscanned: the number of documents scanned.
  • n: the number of documents returned by the query. You want the value of n to be close to the value of nscanned. What you want to avoid is doing a collection scan, that is, where every document in the collection is accessed. This is the case when nscanned is equal to the number of documents in the collection.
  • millis: the number of milliseconds required to complete the query. This value is useful for comparing indexing strategies, indexed vs. non-indexed queries, etc.
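
The fields listed above can be turned into a couple of ratios that make an explain() result easier to judge at a glance. The following is only a rough sketch: summarizeExplain is a hypothetical helper, not part of MongoDB, and it assumes the legacy explain format shown in the question. The numbers are copied from the question's output, with 823299 taken as the collection size from the unfiltered find().

```javascript
// Summarize a legacy-format explain() document.
// summarizeExplain is a hypothetical helper written for this illustration.
// totalDocs is the collection size, e.g. the value of db.barrels.count().
function summarizeExplain(explain, totalDocs) {
  return {
    // "BtreeCursor ..." means an index was used; "BasicCursor" means none.
    usesIndex: explain.cursor.indexOf("BtreeCursor") === 0,
    // Fraction of scanned entries that were returned (1 = nothing wasted).
    hitRatio: explain.nscanned === 0 ? 1 : explain.n / explain.nscanned,
    // Fraction of the collection that had to be examined.
    scannedFraction: explain.nscanned / totalDocs
  };
}

// Figures copied from the question's explain() output.
var oneTag = summarizeExplain(
  { cursor: "BtreeCursor tags_1", nscanned: 300393, n: 300393 }, 823299);
var twoTags = summarizeExplain(
  { cursor: "BtreeCursor tags_1", nscanned: 300393, n: 6427 }, 823299);

console.log(oneTag.usesIndex);                   // true
console.log(oneTag.hitRatio);                    // 1
console.log(oneTag.scannedFraction.toFixed(2));  // 0.36
console.log(twoTags.hitRatio.toFixed(3));        // 0.021
```

Read this way, the single-tag query wastes nothing but still touches roughly a third of the collection, while the two-tag query scans the same 300393 entries to return only about 2% of them.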


Answer 2:

Is there a way to force mongo to store this particular index in the ram?

Sure, you can walk the index with an index-only query. That will force MongoDB to load every block of the index. But it has to be "index-only", otherwise you will also load all of the associated documents.
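
A minimal sketch of such an index walk in the mongo shell might look like the following. walkIndex is a hypothetical helper and the field name in the usage comment is made up. One caveat to be aware of: an index on an array field such as tags is multikey, and MongoDB cannot cover a query on the array field from a multikey index, so this approach only stays index-only for indexes on scalar fields. It is worth confirming with explain() that no documents are being fetched.

```javascript
// Walk an index with a covered ("index-only") query.
// walkIndex is a hypothetical helper; `coll` is a shell collection object
// and `field` is the name of a single indexed, non-array field.
function walkIndex(coll, field) {
  var keyPattern = {};
  keyPattern[field] = 1;

  // Return only the indexed field and suppress _id, so the server can
  // answer from the index without loading the documents themselves.
  var projection = { _id: 0 };
  projection[field] = 1;

  return coll
    .find({}, projection)
    .hint(keyPattern)  // force the chosen index
    .itcount();        // iterate the whole cursor, discarding the results
}

// Usage in the shell, with a made-up scalar field name:
//   walkIndex(db.barrels, "owner")
```

The return value is just the number of entries iterated; the point of the call is the side effect of touching every block of the index.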

The only benefit this will provide is to make some potential future queries faster if those parts of the index are required.

However, if there are parts of the index that are not being accessed by the queries already running, why change this?