How to use distinct, sort, and limit together in MongoDB

Posted 2019-04-13 10:40

I have a document structure {'text': 'here is text', 'count' : 13, 'somefield': value}

The collection has a few thousand records, and the value of the text key may be repeated many times. I want to find each distinct text with its highest count value, and the whole document should be returned. I am already able to sort the documents in descending order.

distinct only returns the unique values as a list.

I want to use all three functions together and get the full documents back. I am still learning and have not covered MapReduce yet.

1 answer
甜甜的少女心
Reply #2 · 2019-04-13 11:36

Can you please clarify exactly what you would like to do? Do you want to return documents with unique "text" values with the highest "count" value?

For example, given the collection:

> db.text.find({}, {_id:0})
{ "text" : "here is text", "count" : 13, "somefield" : "value" }
{ "text" : "here is text", "count" : 12, "somefield" : "value" }
{ "text" : "here is text", "count" : 10, "somefield" : "value" }
{ "text" : "other text", "count" : 4, "somefield" : "value" }
{ "text" : "other text", "count" : 3, "somefield" : "value" }
{ "text" : "other text", "count" : 2, "somefield" : "value" }
>
(I have omitted _id values for brevity)

Would you like to return only the documents that contain unique text with the highest 'count' value?

{ "text" : "here is text", "count" : 13, "somefield" : "value" }

and

{ "text" : "other text", "count" : 4, "somefield" : "value" }

One way to do this is with the $group and $max functions in the new aggregation framework. The documentation on $group may be found here: http://docs.mongodb.org/manual/aggregation/

> db.text.aggregate({$group : {_id:"$text", "maxCount":{$max:"$count"}}})
{
    "result" : [
        {
            "_id" : "other text",
            "maxCount" : 4
        },
        {
            "_id" : "here is text",
            "maxCount" : 13
        }
    ],
    "ok" : 1
}

As you can see, the above does not return the original documents. If the original documents are desired, a query may then be done to find documents matching the unique text and count values.
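The follow-up query mentioned above could be built from the aggregation output roughly as follows. This is only a sketch: buildFollowUpQuery is a hypothetical helper name, and it assumes the grouped results have the _id / maxCount shape shown in the output above.

```javascript
// Hypothetical helper: turn the $group output into a single $or query
// that matches each (text, count) pair.
function buildFollowUpQuery(groups) {
  return {
    $or: groups.map(function (g) {
      return { text: g._id, count: g.maxCount };
    })
  };
}

// Same shape as the "result" array printed above.
var groups = [
  { _id: "other text", maxCount: 4 },
  { _id: "here is text", maxCount: 13 }
];

var query = buildFollowUpQuery(groups);
// In the shell this would then be passed to find(), e.g. db.text.find(query)
```

Note that if several documents share both the same text and the same highest count, a query like this returns all of them, not just one per text value.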

As an alternative, you can first run the 'distinct' command to return an array of all the distinct values, and then run a query for each value with sort and limit to return the document with the highest value of 'count'. The sort() and limit() methods are explained in the "Cursor Methods" section of the "Advanced Queries" documentation: http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-CursorMethods

> var values = db.runCommand({distinct:"text", key:"text"}).values
> values
[ "here is text", "other text" ]
> values.forEach(function(v){ db.text.find({"text":v}).sort({count:-1}).limit(1).forEach(printjson); })
{
    "_id" : ObjectId("4f7b50b2df77a5e0fd4ccbf1"),
    "text" : "here is text",
    "count" : 13,
    "somefield" : "value"
}
{
    "_id" : ObjectId("4f7b50b2df77a5e0fd4ccbf4"),
    "text" : "other text",
    "count" : 4,
    "somefield" : "value"
}
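
On newer servers, the per-value queries can be avoided by sorting first and keeping the first document of each group in a single aggregation. The sketch below assumes MongoDB 2.6 or later, where the $$ROOT variable is available; the output field name doc is an illustrative choice and does not come from the commands above.

```javascript
// Sort by count descending, then keep the first (highest-count)
// full document seen for each distinct text value.
var pipeline = [
  { $sort: { count: -1 } },
  { $group: { _id: "$text", doc: { $first: "$$ROOT" } } }
];

// In the shell: db.text.aggregate(pipeline)
```

Each result would then carry the distinct text in _id and the complete original document under doc.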

It is unclear if this is exactly what you are trying to do, but I hope that it will at least give you some ideas to get started. If I have misunderstood, please explain in more detail the exact operation that you would like to perform, and hopefully I or another member of the Community will be able to help you out. Thanks.
