Mongoid Group By or MongoDb group by in rails

Posted 2019-01-18 16:40

Question:

I have a MongoDB collection that holds statistical data with the following fields:

  • course_id
  • status, which is a string: played or completed
  • timestamp information from Mongoid's Timestamps feature

My class is as follows:

class Statistic
  include Mongoid::Document
  include Mongoid::Timestamps
  include Mongoid::Paranoia

  field :course_id, type: Integer
  field :status, type: String # currently this is either played or completed
end

I want to get a daily count of the total number of plays for a course. For example, 8/1/12 had 2 plays, 8/2/12 had 6 plays, and so on. I would therefore be using the created_at timestamp field together with course_id and status. The issue is that I don't see a group-by method in Mongoid. I believe MongoDB has one now, but I'm unsure how that would be done in Rails 3.

I could run through the collection using each and hack together a map or hash in Rails, incrementing as I go, but what if the course has 1 million views? Retrieving and iterating over a million records could be messy. Is there a clean way to do this?

Answer 1:

As mentioned in the comments, you can use map/reduce for this purpose. You could define the following method in your model (see http://mongoid.org/en/mongoid/docs/querying.html#map_reduce):

def self.today
  map = %Q{
    function() {
      emit(this.course_id, {count: 1})
    }
  }

  reduce = %Q{
    function(key, values) {
      var result = {count: 0};
      values.forEach(function(value) {
        result.count += value.count;
      });
      return result;
    }
  }

  self.where(:created_at.gt => Date.today, status: "played").
    map_reduce(map, reduce).out(inline: true)
end

which would produce a result like the following:

[{"_id"=>1.0, "value"=>{"count"=>2.0}}, {"_id"=>2.0, "value"=>{"count"=>1.0}}] 

where _id is the course_id and count is the number of plays.
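
If a plain hash of course_id to play count is easier to work with, the inline result can be reshaped in Ruby. This is a minimal sketch that assumes rows in exactly the shape shown above; the helper name plays_by_course is hypothetical and not part of Mongoid.

```ruby
# Hypothetical helper: turns inline map/reduce rows into { course_id => plays }.
# The numbers arrive as floats (JavaScript doubles), so both are cast to integers.
def plays_by_course(rows)
  rows.each_with_object({}) do |row, counts|
    counts[row["_id"].to_i] = row["value"]["count"].to_i
  end
end

# Sample rows copied from the result above.
rows = [
  { "_id" => 1.0, "value" => { "count" => 2.0 } },
  { "_id" => 2.0, "value" => { "count" => 1.0 } }
]

counts = plays_by_course(rows)
puts counts[1] # plays for course 1
puts counts[2] # plays for course 2
```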

There is also a dedicated group method in MongoDB, but I am not sure how to get to the bare MongoDB collection in Mongoid 3. I have not had a chance to dive into the code that much yet.

You may wonder why I emit the document {count: 1}, since I could have emitted an empty document (or anything else) and simply added 1 to result.count for every value. The reason is that reduce is not called when only one emit has been made for a particular key (in my example, a course_id that has been played only once), so it is better to emit documents in the same format as the reduce result.
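
The effect can be imitated in plain Ruby. The sketch below is not MongoDB's implementation; simulate_map_reduce is a hypothetical stand-in that only reproduces the one rule described above: reduce is skipped for keys with a single emitted value.

```ruby
# Hypothetical in-memory imitation of map/reduce: emitted pairs are grouped by
# key, and the reduce block runs only for keys that received several values.
def simulate_map_reduce(emitted)
  emitted.group_by(&:first).map do |key, pairs|
    values = pairs.map(&:last)
    result = values.size > 1 ? yield(key, values) : values.first
    [key, result]
  end.to_h
end

# Course 1 was played twice, course 2 only once.
emitted = [[1, { count: 1 }], [1, { count: 1 }], [2, { count: 1 }]]

# Emitting {count: 1} and summing the counts gives every key the same shape.
consistent = simulate_map_reduce(emitted) do |_key, values|
  { count: values.sum { |v| v[:count] } }
end

# Emitting empty documents and counting them in reduce leaves the single-emit
# key holding the raw emitted document, which has no count at all.
empty_docs = emitted.map { |key, _doc| [key, {}] }
inconsistent = simulate_map_reduce(empty_docs) do |_key, values|
  { count: values.size }
end

puts consistent.inspect
puts inconsistent.inspect
```

In the second run, course 2 ends up with an empty document instead of a count. MongoDB may also feed the output of one reduce call back into another, which is a further reason to keep the emitted and reduced documents in the same shape.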



Answer 2:

Using Mongoid

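# Note: this groups on the exact created_at value, so documents share a
# group only when their timestamps are identical.
stages = [{
  "$group" => {
    "_id" => { "date_column_name" => "$created_at" },
    # The accumulator must sit inside the $group document.
    "plays_count" => { "$sum" => 1 }
  }
}]
@array_of_objects = ModelName.collection.aggregate(stages, {:allow_disk_use => true})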

OR

# Groups by calendar day: year, month and day of month taken from created_at.
stages = [{
  "$group" => {
    "_id" => {
      "year" => { "$year" => "$created_at" },
      "month" => { "$month" => "$created_at" },
      "day" => { "$dayOfMonth" => "$created_at" }
    },
    # The accumulator must sit inside the $group document.
    "plays_count" => { "$sum" => 1 }
  }
}]
@array_of_objects = ModelName.collection.aggregate(stages, {:allow_disk_use => true})
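
To get the daily play count for a single course, as the question asks, a $match stage can be placed in front of the day-based $group. The following is a sketch rather than a tested production query: daily_plays_pipeline is a hypothetical helper, and the field names and the "played" status value are taken from the Statistic model in the question.

```ruby
# Hypothetical helper: builds an aggregation pipeline that counts plays per
# calendar day for one course. Field names follow the Statistic model above.
def daily_plays_pipeline(course_id, status: "played")
  [
    # Keep only the matching events of the requested course.
    { "$match" => { "course_id" => course_id, "status" => status } },
    # Bucket by the calendar date (UTC) of created_at and count each bucket.
    { "$group" => {
      "_id" => {
        "year" => { "$year" => "$created_at" },
        "month" => { "$month" => "$created_at" },
        "day" => { "$dayOfMonth" => "$created_at" }
      },
      "plays_count" => { "$sum" => 1 }
    } },
    # Return the days in chronological order.
    { "$sort" => { "_id.year" => 1, "_id.month" => 1, "_id.day" => 1 } }
  ]
end

# Assumed usage inside the Rails app (needs a live MongoDB connection):
#   Statistic.collection.aggregate(daily_plays_pipeline(42)).to_a
```

Because this goes straight to the collection, Mongoid scopes are not applied. With Mongoid::Paranoia in the model, it may be necessary to add "deleted_at" => nil to the $match stage so that soft-deleted documents are excluded.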

See the links below for more on grouping with Mongoid:

https://taimoorchangaizpucitian.wordpress.com/2016/01/08/mongoid-group-by-query/
https://docs.mongodb.org/v3.0/reference/operator/aggregation/group/