Using the following benchmark:
def create_genome
  "gattaca" * 100
end

def count_frequency_using_chars(sequence)
  100000.times do
    sequence.chars.group_by{|x| x}.map{|letter, array| [letter, array.count]}
  end
end

def count_frequency_using_count(sequence)
  100000.times do
    ["a", "c", "g", "t"].map{|letter| sequence.count(letter)}
  end
end

sequence = create_genome

count_frequency_using_chars(sequence)
count_frequency_using_count(sequence)
I found that, in C-based Ruby for both 1.8 and 1.9.2, using String#count(letter) is approximately 50 times faster than grouping and counting the characters with Enumerable#group_by and Array#count. I was slightly surprised at this, because the String#count approach reads through the string four times each iteration, whereas the latter reads through it only once.

I tried running the code under ruby-prof and perftools.rb, and both of them merely indicated that String#chars took 90% of the time, with no breakdown of where that 90% was spent.

If I had to guess why there's a difference, I'd say that creating 70 million single-character strings would be expensive, but how would I be able to know? (Update: String#chars wasn't the culprit - see the benchmark for mainly_execute_a_trivial_block.)
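One way to answer the "how would I be able to know?" question for allocations is to count objects directly. A minimal sketch, assuming MRI 1.9+ where ObjectSpace.count_objects is available (this wasn't part of the original question):

# Count how many string objects one pass of the chars-based approach creates.
GC.disable                                     # keep collections from hiding allocations
before = ObjectSpace.count_objects[:T_STRING]
("gattaca" * 100).chars.to_a                   # materializes 700 single-character strings
after = ObjectSpace.count_objects[:T_STRING]
GC.enable
puts "T_STRING objects allocated: #{after - before}"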
Edit: Current benchmarks using 1.9.2 patchlevel 180:
require 'pp'
require 'benchmark'

def create_genome
  "gattaca" * 100
end

ZILLION = 100000

def count_frequency_using_count(sequence)
  ZILLION.times do
    ["a", "c", "g", "t"].map{|letter| sequence.count(letter)}
  end
end

def count_frequency_using_chars(sequence)
  ZILLION.times do
    sequence.chars.group_by{|x| x}.map{|letter, array| [letter, array.count]}
  end
end

def count_frequency_using_inject_hash(sequence)
  ZILLION.times do
    sequence.chars.inject(Hash.new(0)) { |h, e| h[e] += 1 ; h }
  end
end

def count_frequency_using_each_with_object(sequence)
  ZILLION.times do
    sequence.chars.each_with_object(Hash.new(0)) { |char, hash| hash[char] += 1 }
  end
end

def just_group_by(sequence)
  ZILLION.times do
    sequence.chars.group_by{|x| x}
  end
end

def just_chars_and_trivial_block(sequence)
  ZILLION.times do
    sequence.chars() {}
  end
end

def mainly_execute_a_trivial_block(sequence)
  ZILLION.times do
    sequence.length.times() {}
  end
end

def execute_an_empty_loop_instead(sequence)
  ZILLION.times do
    i = 0
    max = sequence.length
    until i == max
      i += 1
    end
  end
end

sequence = create_genome

puts RUBY_VERSION

Benchmark.bm do |benchmark|
  benchmark.report do
    count_frequency_using_count(sequence)
  end
  benchmark.report do
    count_frequency_using_chars(sequence)
  end
  benchmark.report do
    count_frequency_using_inject_hash(sequence)
  end
  benchmark.report do
    count_frequency_using_each_with_object(sequence)
  end
  benchmark.report do
    just_group_by(sequence)
  end
  benchmark.report do
    just_chars_and_trivial_block(sequence)
  end
  benchmark.report do
    mainly_execute_a_trivial_block(sequence)
  end
  benchmark.report do
    execute_an_empty_loop_instead(sequence)
  end
end
Results:
       user     system      total        real
   0.510000   0.000000   0.510000 (  0.508499) # count_frequency_using_count
  23.470000   0.030000  23.500000 ( 23.575487) # count_frequency_using_chars
  32.770000   0.050000  32.820000 ( 32.874634) # count_frequency_using_inject_hash
  31.880000   0.040000  31.920000 ( 31.942437) # count_frequency_using_each_with_object
  22.950000   0.030000  22.980000 ( 22.970500) # just_group_by
  13.300000   0.020000  13.320000 ( 13.314466) # just_chars_and_trivial_block
   5.660000   0.000000   5.660000 (  5.661516) # mainly_execute_a_trivial_block
   1.930000   0.010000   1.940000 (  1.934861) # execute_an_empty_loop_instead
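As an aside, the hand-written comments next to the results can be avoided: Benchmark.bm takes a label width, and Benchmark#report takes a label, both part of the standard library. A sketch of the same harness with labels (abbreviated to two reports):

Benchmark.bm(40) do |benchmark|
  benchmark.report("count_frequency_using_count") do
    count_frequency_using_count(sequence)
  end
  benchmark.report("count_frequency_using_chars") do
    count_frequency_using_chars(sequence)
  end
  # ... and so on for the remaining methods
end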
A slower way would be to pass over the array only once. I originally claimed it would be faster, but that was a total copy & paste fail; I leave the answer anyway, since it shows how to use Benchmark from the standard library. (Single-pass variants appear in the benchmark above as count_frequency_using_inject_hash and count_frequency_using_each_with_object.)

The thing that's slow is the group_by. Effectively, while the count method needs to make four passes over the string, the group_by approach is much slower because it does a lot of work in order to perform that grouping.
Breaking the code out a bit into a method that does only the group_by (just_group_by in the benchmark above), you can see that the grouping alone accounts for nearly all of count_frequency_using_chars's time: 22.98 of its 23.50 seconds. While using group_by to get the information you need is good conceptually, it's doing extra work that you don't need done. You can see what's going on better by profiling the VM itself.
A block being yielded an excessive number of times is the culprit here. You can see it by profiling the VM with perftools, following the instructions listed under "Profiling the Ruby VM and C extensions" at https://github.com/tmm1/perftools.rb (note: this is more or less vanilla perftools, not perftools.rb).
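For comparison, gem-level profiling with perftools.rb looks roughly like this (a sketch; the VM-level mode the answer refers to needs the extra setup from the README rather than this block API, and the output path is illustrative):

require 'perftools'

# Profile one run of the chars-based method; results go to /tmp/chars_profile.
PerfTools::CpuProfiler.start("/tmp/chars_profile") do
  count_frequency_using_chars(create_genome)
end
# Inspect afterwards with: pprof.rb --text /tmp/chars_profile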
As you can see, rb_yield_0 accounts for over a third of the activity, so even if you could optimize everything else, you'd still be slower than if you were using String#count. You can also confirm this with a benchmark where the block being yielded does nothing at all: compare just_chars_and_trivial_block and mainly_execute_a_trivial_block against execute_an_empty_loop_instead in the results above - yielding an empty block 700 times per iteration costs several seconds on its own.
It's nothing to do with the Ruby internals; you are comparing apples with oranges. In your first example, you are grouping a 700-character string 100,000 times and then finding the counts, so the problem is in your logic, not in the counting - in the second approach you are just counting. And in both approaches you are only ever counting, so just change the first example so that the grouping happens once, as in the sketch below, and it's as fast as your second.
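The answer's code block isn't preserved here; a reconstruction of what it describes - doing the group_by once, outside the loop, so that only the counting is repeated:

def count_frequency_using_chars(sequence)
  grouped = sequence.chars.group_by{|x| x}   # group once, not 100,000 times
  100000.times do
    grouped.map{|letter, array| [letter, array.count]}
  end
end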
Edit: This approach is 3x faster than count_frequency_using_count - check the benchmarks.

Andrew, to your comment that this is "measuring the character composition of 100000 sequences once each, not the character composition of one sequence 100000 times": your count approach is still much slower than the group_by approach. I benchmarked the large strings as you suggested, and changed the methods to handle multiple sequences.

For processing 100,000 times over 10 sequences, each 70,000 characters long, your simple char-count approach is 47% slower than the modified group_by approach on these high-volume strings. And I ran the above benchmark for just 10 sequences of length 70,000; assume the same for 100 or 1,000 sequences, and simple count would never be an option, right?
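The modified benchmark itself isn't reproduced above; a sketch of the shape it describes, with names and structure assumed from the prose (10 sequences of 70,000 characters, grouped once per sequence):

SEQUENCES = Array.new(10) { "gattaca" * 10_000 }   # 10 sequences, 70,000 chars each

def count_frequency_using_count(sequences)
  100_000.times do
    sequences.map do |sequence|
      ["a", "c", "g", "t"].map{|letter| sequence.count(letter)}
    end
  end
end

def count_frequency_using_group_by(sequences)
  grouped = sequences.map{|sequence| sequence.chars.group_by{|x| x}}  # one grouping pass per sequence
  100_000.times do
    grouped.map{|g| g.map{|letter, array| [letter, array.count]}}
  end
end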