I have a collection of 130k documents with this structure:
{
"tags": ["restaurant", "john doe"]
}
There are 40k documents with the "restaurant" tag but only 2 with "john doe", so the following two queries perform very differently depending on the order of the tags:
// 0.100 seconds (40,000 objects scanned)
{"tags": {$all: [/^restaurant/, /^john doe/]}}
// 0.004 seconds (2 objects scanned)
{"tags": {$all: [/^john doe/, /^restaurant/]}}
Is there a way to optimize the query without sorting the tags on the client? The only approach I can think of right now is putting the less frequent tags at the start of the search array.
I found a feature request for this in the MongoDB team's JIRA:
I implemented a statistics system that puts the tags with higher cardinality (the ones matching more documents) at the end of the array.
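For reference, here is a minimal sketch of the client-side ordering described above. It assumes, based on the timings shown earlier, that the first element of `$all` decides how many objects are scanned. The `tagCounts` map, `escapeRegExp`, and `buildAllQuery` are hypothetical names for illustration, and the counts would have to be maintained by the application itself:

```javascript
// Hypothetical per-tag document counts, kept up to date by the application
// (for example, refreshed periodically from the collection).
const tagCounts = new Map([
  ["restaurant", 40000],
  ["john doe", 2],
]);

// Escape regex metacharacters so each tag is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Look up a tag's count; unknown tags are treated as rare (assumption).
function countOf(counts, tag) {
  return counts.has(tag) ? counts.get(tag) : 0;
}

// Build a $all query with the least frequent tag first.
function buildAllQuery(tags, counts) {
  const ordered = [...tags].sort(
    (a, b) => countOf(counts, a) - countOf(counts, b)
  );
  return {
    tags: { $all: ordered.map((tag) => new RegExp("^" + escapeRegExp(tag))) },
  };
}

// "john doe" ends up first regardless of the order the caller passes in.
const query = buildAllQuery(["restaurant", "john doe"], tagCounts);
```

The resulting `query` object can be passed to `find()` as usual. Sorting a copy of the array leaves the caller's tag list untouched.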