pluck vs distinct in mongoid db. which is faster?

2019-07-09 01:09发布

问题:

It seems that mongodb has two equivalent methods:

#pluck and #distinct which both return only given fields from a collection.

so both

User.pluck(:name)
User.distinct(:name)

would return array of all names from User collection in db

> ['john', 'maria', 'tony', 'filip']

I don't mind duplicates. Which method is faster?

回答1:

Let's run a benchmark!

require 'benchmark'


1_200.times { FactoryGirl.create(:user) }

Benchmark.bmbm(7) do |bm|
  bm.report('pluck') do
    User.pluck(:email)
  end

  bm.report('pluck.uniq') do
    User.pluck(:email).uniq
  end

  bm.report('only.pluck') do
    User.only(:email).pluck(:email)
  end

  bm.report('only.pluck.uniq') do
    User.only(:email).pluck(:email).uniq
  end

  bm.report('distinct') do
    User.distinct(:email)
  end

  bm.report('only.distnct') do
    User.only(:email).distinct(:email)
  end
end

which outputs:

Rehearsal ------------------------------------------------
pluck          0.010000   0.000000   0.010000 (  0.009913)
pluck.uniq     0.010000   0.000000   0.010000 (  0.012156)
only.pluck     0.000000   0.000000   0.000000 (  0.008731)
distinct       0.000000   0.000000   0.000000 (  0.004830)
only.distnct   0.000000   0.000000   0.000000 (  0.005048)
--------------------------------------- total: 0.020000sec

                   user     system      total        real

pluck          0.000000   0.000000   0.000000 (  0.007904)
pluck.uniq     0.000000   0.000000   0.000000 (  0.008440)
only.pluck     0.000000   0.000000   0.000000 (  0.008243)
distinct       0.000000   0.000000   0.000000 (  0.004604)
only.distnct   0.000000   0.000000   0.000000 (  0.004510)

it clearly shows that using #distinct is almost two times faster than #pluck