Mongoid random document

Posted 2019-03-15 18:18

Question:

Let's say I have a collection of users. Is there a way to use Mongoid to find n random users in the collection without returning the same user twice? For now, let's say the user collection looks like this:

class User
  include Mongoid::Document
  field :name
end

Simple huh?

Thanks

Answer 1:

The best solution is going to depend on the expected size of the collection.

For tiny collections, just load all of them and call .to_a.shuffle.slice!(0, n) (or simply .to_a.sample(n)).

For small sizes of n, you can get away with something like this:

result = (0...User.count).to_a.sample(n).map { |i| User.skip(i).first }

For large sizes of n, I would recommend storing a random value in a dedicated "random" field on each document and sorting by it. See here for details: http://cookbook.mongodb.org/patterns/random-attribute/ or https://github.com/mongodb/cookbook/blob/master/content/patterns/random-attribute.txt
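
To make the random-field idea concrete, here is a minimal plain-Ruby sketch that uses an in-memory array as a stand-in for the collection. The `Doc` struct, the `random` field name, and the `sample_by_random_field` helper are all assumptions for illustration, not part of Mongoid or the cookbook. It takes documents at or above a random threshold in ascending order and wraps around if too few remain.

```ruby
# In-memory stand-in for a collection whose documents carry a precomputed
# `random` value (hypothetical field name, assigned once at insert time).
Doc = Struct.new(:name, :random)

# Pick up to n distinct documents: those at or above a random threshold come
# first (ascending by `random`), then wrap around to the ones below it.
def sample_by_random_field(docs, n, rng: Random.new)
  threshold   = rng.rand
  sorted      = docs.sort_by(&:random)
  at_or_above = sorted.select { |d| d.random >= threshold }
  below       = sorted.select { |d| d.random < threshold }
  (at_or_above + below).first(n)
end

docs   = %w[ann bob cat dan eve].map { |name| Doc.new(name, rand) }
picked = sample_by_random_field(docs, 3)

# Assumed Mongoid equivalent of the first select (not verified against a live DB):
#   User.where(random: { '$gte' => threshold }).order_by(random: :asc).limit(n)
```

Because each document appears once in the sorted list, the result never contains the same document twice. On a real collection the field would presumably need an index for this to pay off.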



Answer 2:

If you just want one document, and don't want to define a new criteria method, you could just do this:

random_model = Model.skip(rand(Model.count)).first

If you want to find a random model based on some criteria:

criteria = Model.scoped_whatever.where(conditions) # query example
random_model = criteria.skip(rand(criteria.count)).first
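
One edge case worth guarding: when nothing matches, the count is 0, and Ruby's `rand(0)` returns a Float between 0 and 1 rather than an integer offset. A small sketch of a guard follows; `random_offset` is a hypothetical helper name, not a Mongoid method.

```ruby
# Return a random integer offset in 0...count, or nil when the set is empty.
def random_offset(count, rng: Random.new)
  return nil unless count.positive?
  rng.rand(count)
end

# Assumed usage with the criteria from the answer above:
#   offset = random_offset(criteria.count)
#   random_model = offset && criteria.skip(offset).first
```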


Answer 3:

MongoDB 3.2 comes to the rescue with the $sample aggregation stage.

EDIT: The most recent version of Mongoid has implemented $sample, so you can call YourCollection.all.sample(5)

Previous versions of mongoid

Mongoid doesn't support sample until Mongoid 6, so you have to run this aggregate query with the Mongo driver:

samples = User.collection.aggregate([ { '$sample': { size: 3 } } ])
# call samples.to_a if you want to get the objects in memory

What you can do with that

I believe the functionality should make its way to Mongoid soon, but in the meantime:

module Utility
  module_function
  def sample(model, count)
    ids = model.collection.aggregate([ 
      { '$sample': { size: count } }, # Sample from the collection
      { '$project': { _id: 1} }       # Keep only ID fields
    ]).to_a.map(&:values).flatten     # Some Ruby magic

    model.find(ids)
  end
end

Utility.sample(User, 50)
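
If you also need to filter before sampling, one option is to build the pipeline as plain Ruby data and put a $match stage first. This is only a sketch: `sample_pipeline` and its `filter` parameter are hypothetical names, and the driver call in the comment is assumed usage rather than something tested here.

```ruby
# Build the aggregation pipeline as plain data so it can be inspected
# without a database connection.
def sample_pipeline(count, filter = nil)
  stages = []
  stages << { '$match' => filter } if filter       # narrow first, then sample
  stages << { '$sample' => { 'size' => count } }   # random sample of `count` docs
  stages << { '$project' => { '_id' => 1 } }       # keep only the _id field
  stages
end

pipeline = sample_pipeline(50, { 'registered' => true })

# Assumed usage, following the Utility module above:
#   ids = User.collection.aggregate(pipeline).map { |doc| doc['_id'] }.uniq
#   User.find(ids)
```

The `.uniq` is there because the MongoDB documentation for $sample warns that the same document may appear more than once in the output, which matters if you must never return a user twice.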


Answer 4:

If you really want simplicity you could use this instead:

class Mongoid::Criteria

  def random(n = 1)
    indexes = (0...count).to_a.sample(n) # up to n distinct random offsets

    if n == 1
      return self.skip(indexes.first).first
    else
      return indexes.map{ |index| self.skip(index).first }
    end
  end

end

module Mongoid
  module Finders

    def random(n = 1)
      criteria.random(n)
    end

  end
end

You just have to call User.random(5) and you'll get 5 random users. It'll also work with filtering, so if you want only registered users you can do User.where(:registered => true).random(5).

This is slow for large collections, because it randomizes every offset and then runs one query per document. For those I recommend an alternative: pick a random sub-range of the count (e.g. 25,000 to 30,000) and randomize only within that range.
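
A rough sketch of that sub-range idea, in plain Ruby so it runs without a database. `window_offsets` and its `window` parameter are hypothetical names, and 5,000 is just the width from the example above.

```ruby
# Choose up to n distinct offsets from one randomly placed window of at most
# `window` consecutive positions inside 0...count.
def window_offsets(count, n, window: 5_000, rng: Random.new)
  return [] if count <= 0 || n <= 0
  span  = [window, count].min
  start = rng.rand(count - span + 1)   # first offset of the window
  (start...(start + span)).to_a.sample(n, random: rng)
end

offsets = window_offsets(30_000, 5)

# Assumed usage, as in the random method above:
#   offsets.map { |i| User.skip(i).first }
```

The trade-off is that all results come from the same window, so the sample is cheaper to build but less evenly spread over the collection.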



Answer 5:

You can do this by picking a random offset and then taking the next n documents from it:

  1. Get the total count (assume 10) and the requested n (assume 5).
  2. Check whether n is less than the total count.
  3. If not, set the offset to 0 and go to step 6.
  4. If so, subtract n from the total count, which gives 5.
  5. Pick a random number from 0 to 5 (assume 2) and use it as the offset.
  6. Query with that offset and with n (5) as the limit.
  7. You now get users 3 to 7.

Note that this returns n consecutive users starting at a random position, not n independently chosen ones.

code

>> cnt = User.count
=> 10
>> n = 5
=> 5
>> offset = 0
=> 0
>> if n < cnt
>>   offset = rand(cnt - n + 1)  # 0..(cnt - n), so the last window is reachable
>> end
=> 2
>> User.skip(offset).limit(n)

and you can put this in a method

def get_random_users(n)
  offset = 0
  cnt = User.count
  # rand(cnt - n + 1) yields 0..(cnt - n), so the final window can be chosen too
  offset = rand(cnt - n + 1) if n < cnt
  User.skip(offset).limit(n)
end

and call it like

rand_users = get_random_users(5)

hope this helps



Answer 6:

Since I want to keep a criteria, I do:

scope :random, ->{
  random_field_for_ordering = fields.keys.sample
  random_direction_to_order = %w(asc desc).sample
  order_by([[random_field_for_ordering, random_direction_to_order]])
}


Answer 7:

Just encountered such a problem. Tried

Model.all.sample

and it works for me



Answer 8:

The approach from @moox is really interesting, but I doubt that monkeypatching the whole of Mongoid is a good idea here. So my approach is to write a concern, Randomizable, that can be included in each model where you use this feature. This goes in app/models/concerns/randomizable.rb:

module Randomizable
  extend ActiveSupport::Concern

  module ClassMethods
    def random(n = 1)
      indexes = (0...count).sort_by { rand }.slice(0, n)

      return skip(indexes.first).first if n == 1
      indexes.map { |index| skip(index).first }
    end
  end
end

Then your User model would look like this:

class User
  include Mongoid::Document
  include Randomizable

  field :name
end

And the tests....

require 'spec_helper'

class RandomizableCollection
  include Mongoid::Document
  include Randomizable

  field :name
end

describe RandomizableCollection do
  before do
    RandomizableCollection.create name: 'Hans Bratwurst'
    RandomizableCollection.create name: 'Werner Salami'
    RandomizableCollection.create name: 'Susi Wienerli'
  end

  it 'returns a random document' do
    srand(2)

    expect(RandomizableCollection.random(1).name).to eq 'Werner Salami'
  end

  it 'returns an array of random documents' do
    srand(1)

    expect(RandomizableCollection.random(2).map &:name).to eq ['Susi Wienerli', 'Hans Bratwurst']
  end
end


Answer 9:

I think it is better to focus on randomizing the returned result set so I tried:

Model.all.to_a.shuffle

Hope this helps.