Is cursor.skip() on indexed keys always faster?

2019-08-11 13:48发布

问题:

I have 2 databases: slow and fast; each of which was fed with 4096 entries. The age key is a unique random integer that was generated with this script:

var arr = []
while(arr.length < 4096){
  var randmnum=Math.ceil(Math.random()*1000000)
  var found=false;
  for(var i=0;i<arr.length;i++){
    if(arr[i]==randmnum){found=true;break}
  }
  if(!found)arr[arr.length]=randmnum;
}

var i=0;
for (i=0 ; i< arr.length; ++i) {
    db.fast.insert({name:"john doe", email:"once@upon.a.time.com", age:arr[i]});
    db.slow.insert({name:"john doe", email:"once@upon.a.time.com", age:arr[i]});
}

Then in mongo shell:

> db.createCollection("slow")
> db.createCollection("fast")
> load("theAboveScript.js")
> db.fast.createIndex({"age":1})

If I test find and sort the amount of work that has been done on the fast db is lower than the slow as expected. Moreover the execution time is about 2 times better with the indexed/fast db.

But when I do the following:

> pageNumber=18;nPerPage=20; db.slow.find().skip(pageNumber > 0 ? ((pageNumber-1)*nPerPage) : 0).limit(nPerPage).explain("executionStats")
> pageNumber=18;nPerPage=20; db.fast.find().skip(pageNumber > 0 ? ((pageNumber-1)*nPerPage) : 0).limit(nPerPage).explain("executionStats")

The amount of work is exactly the same and the execution time is similar, even the fast db is a bit slower.

Why cursor.skip() doesn't get any boost if it works on an indexed keys? I would expect the pagination to return the paged data ordered without sorting it explicitly.

回答1:

Neither of your queries are doing a filter on age, so there is no reason to use the index.

If you add a condition on age, there will be a difference (even if minimal with so few documents)

> pageNumber=18;nPerPage=20; db.slow.find({age:{$gt:200}}).
      skip(pageNumber > 0 ? ((pageNumber-1)*nPerPage) : 0).limit(nPerPage).
      explain("executionStats")

# "executionTimeMillis" : 14,
# "inputStage" : {
#     "stage" : "COLLSCAN",

> pageNumber=18;nPerPage=20; db.fast.find({age:{$gt:200}}).
      skip(pageNumber > 0 ? ((pageNumber-1)*nPerPage) : 0).limit(nPerPage).
      explain("executionStats"

# "executionTimeMillis" : 0,
# "inputStage" : {
#     "stage" : "IXSCAN",


回答2:

As displayed you have index on AGE field, so search by indexed field is faster.

The paging issue comes from fact that, you query is not covered which means it need to fetch full document - so index do not come into play in this case.

Please see this for reference