Rails & Mongoid unique results

2019-01-24 20:50发布

问题:

Consider the following example of mongo collection:

{"_id" : ObjectId("4f304818884672067f000001"), "hash" : {"call_id" : "1234"}, "something" : "AAA"}
{"_id" : ObjectId("4f304818884672067f000002"), "hash" : {"call_id" : "1234"}, "something" : "BBB"}
{"_id" : ObjectId("4f304818884672067f000003"), "hash" : {"call_id" : "1234"}, "something" : "CCC"}
{"_id" : ObjectId("4f304818884672067f000004"), "hash" : {"call_id" : "5555"}, "something" : "DDD"}
{"_id" : ObjectId("4f304818884672067f000005"), "hash" : {"call_id" : "5555"}, "something" : "CCC"}

I would like to query this collection and get only the first entry for each "call_id", in other words i'm trying to get unique results based on "call_id". I tried to use .distinct method:

@result = Myobject.all.distinct('hash.call_id')

but the resulting array will contain only the unique call_id fields:

["1234", "5555"]

and I need all the other fields too. Is it possible to make a query like this one?:

@result = Myobject.where('hash.call_id' => Myobject.all.distinct('hash.call_id'))

Thanks

回答1:

You cannot simply return the document(or subset) by using the distinct. As per the documentation it only returns the distinct array of values based on the given key. But you can achieve this by using map-reduce

var _map = function () {
    emit(this.hash.call_id, {doc:this});
}

var _reduce = function (key, values) {
    var ret = {doc:[]};
    var doc = {};
    values.forEach(function (value) {
    if (!doc[value.doc.hash.call_id]) {
           ret.doc.push(value.doc);
           doc[value.doc.hash.call_id] = true; //make the doc seen, so it will be picked only once
       }
    });
    return ret;
}

The above code is self explanatory, on map function i am grouping it by key hash.call_id and returning the whole doc so it can be processed by reduce funcition.

On reduce function, just loop through the grouped result set and pick only one item from the grouped set (among the multiple duplicate key values - distinct simulation).

Finally create some test data

> db.disTest.insert({hash:{call_id:"1234"},something:"AAA"})
> db.disTest.insert({hash:{call_id:"1234"},something:"BBB"})
> db.disTest.insert({hash:{call_id:"1234"},something:"CCC"})
> db.disTest.insert({hash:{call_id:"5555"},something:"DDD"})
> db.disTest.insert({hash:{call_id:"5555"},something:"EEE"})
> db.disTest.find()
{ "_id" : ObjectId("4f30a27c4d203c27d8f4c584"), "hash" : { "call_id" : "1234" }, "something" : "AAA" }
{ "_id" : ObjectId("4f30a2844d203c27d8f4c585"), "hash" : { "call_id" : "1234" }, "something" : "BBB" }
{ "_id" : ObjectId("4f30a2894d203c27d8f4c586"), "hash" : { "call_id" : "1234" }, "something" : "CCC" }
{ "_id" : ObjectId("4f30a2944d203c27d8f4c587"), "hash" : { "call_id" : "5555" }, "something" : "DDD" }
{ "_id" : ObjectId("4f30a2994d203c27d8f4c588"), "hash" : { "call_id" : "5555" }, "something" : "EEE" }

and running this map reduce

> db.disTest.mapReduce(_map,_reduce, {out: { inline : 1}})
{
    "results" : [
        {
            "_id" : "1234",
            "value" : {
                "doc" : [
                    {
                        "_id" : ObjectId("4f30a27c4d203c27d8f4c584"),
                        "hash" : {
                            "call_id" : "1234"
                        },
                        "something" : "AAA"
                    }
                ]
            }
        },
        {
            "_id" : "5555",
            "value" : {
                "doc" : [
                    {
                        "_id" : ObjectId("4f30a2944d203c27d8f4c587"),
                        "hash" : {
                            "call_id" : "5555"
                        },
                        "something" : "DDD"
                    }
                ]
            }
        }
    ],
    "timeMillis" : 2,
    "counts" : {
        "input" : 5,
        "emit" : 5,
        "reduce" : 2,
        "output" : 2
    },
    "ok" : 1,
}

You get the first document of the distinct set. You can do the same in mongoid by first stringify the map/reduce functions and call mapreduce like this

  MyObject.collection.mapreduce(_map,_reduce,{:out => {:inline => 1},:raw=>true })

Hope it helps