I was looking for an Array equivalent of String#split in Ruby Core, and was surprised to find that it did not exist. Is there a more elegant way than the following to split an array into sub-arrays based on a value?
class Array
  def split( split_on=nil )
    inject([[]]) do |a,v|
      a.tap{
        if block_given? ? yield(v) : v==split_on
          a << []
        else
          a.last << v
        end
      }
    end.tap{ |a| a.pop if a.last.empty? }
  end
end
p (1..9 ).to_a.split{ |i| i%3==0 },
  (1..10).to_a.split{ |i| i%3==0 }
#=> [[1, 2], [4, 5], [7, 8]]
#=> [[1, 2], [4, 5], [7, 8], [10]]
Edit: For those interested, the "real-world" problem which sparked this request can be seen in this answer, where I've used @fd's answer below for the implementation.
I tried golfing it a bit, still not a single method though:
(1..9).chunk{|i|i%3==0}.reject{|sep,ans| sep}.map{|sep,ans| ans}
Or faster:
(1..9).chunk{|i|i%3!=0 || nil}.map{|sep,ans| ans}
Also, Enumerable#chunk seems to be Ruby 1.9+, but it is very close to what you want.
For example, the raw output would be:
(1..9).chunk{ |i|i%3==0 }.to_a
=> [[false, [1, 2]], [true, [3]], [false, [4, 5]], [true, [6]], [false, [7, 8]], [true, [9]]]
(The to_a is to make irb print something nice, since chunk gives you an enumerator rather than an Array.)
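To illustrate how chunk treats its keys: elements whose block value is nil (or :_separator) are dropped, and a dropped element also ends the current chunk. The following is a minimal sketch under that documented behavior; the helper name split_via_chunk is hypothetical and exists only for this illustration.

```ruby
# Hypothetical helper (not part of Ruby core): returns the runs of elements
# that lie between the elements for which the block is truthy.
def split_via_chunk(enum)
  # Ordinary elements get the key true; separators get nil, which chunk drops.
  enum.chunk { |o| yield(o) ? nil : true }.map { |_key, run| run }
end

p split_via_chunk(1..9)         { |i| i % 3 == 0 }  # prints [[1, 2], [4, 5], [7, 8]]
p split_via_chunk(1..10)        { |i| i % 3 == 0 }  # prints [[1, 2], [4, 5], [7, 8], [10]]
p split_via_chunk([1, 3, 6, 7]) { |i| i % 3 == 0 }  # prints [[1], [7]]
```

Unlike the inject-based version in the question, adjacent separators do not produce empty sub-arrays here, because chunk never emits an empty chunk.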
Edit: Note that the above elegant solutions are 2-3x slower than the fastest implementation:
module Enumerable
  def split_by
    result = [a=[]]
    each{ |o| yield(o) ? (result << a=[]) : (a << o) }
    result.pop if a.empty?
    result
  end
end
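To make the edge cases of this loop easier to see, here is a sketch of the same idea written as a standalone method instead of a reopened Enumerable. The name split_runs is hypothetical; the logic is meant to mirror the each-based loop above, not replace it.

```ruby
# Hypothetical standalone variant of the each-based loop; takes the
# collection as an argument instead of reopening Enumerable.
def split_runs(enum)
  result = [current = []]
  enum.each do |o|
    if yield(o)
      result << (current = [])   # separator: start a new run
    else
      current << o               # ordinary element: extend the current run
    end
  end
  result.pop if current.empty?   # drop a single trailing empty run
  result
end

p split_runs(1..10)        { |i| i % 3 == 0 }  # prints [[1, 2], [4, 5], [7, 8], [10]]
p split_runs([1, 3, 6, 7]) { |i| i % 3 == 0 }  # prints [[1], [], [7]]
p split_runs([3, 1, 2])    { |i| i % 3 == 0 }  # prints [[], [1, 2]]
```

As with String#split, adjacent or leading separators yield empty sub-arrays; only one trailing empty run is removed.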
Sometimes partition is a good way to do things like that:
(1..6).partition { |v| v.even? }
#=> [[2, 4, 6], [1, 3, 5]]
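For comparison with the splitting problem in the question, note that partition always returns exactly two arrays and does not preserve runs. A small sketch using the question's predicate:

```ruby
# partition groups by the block's truthiness; it does not split at separators.
matching, rest = (1..9).partition { |i| i % 3 == 0 }
p matching  # prints [3, 6, 9]
p rest      # prints [1, 2, 4, 5, 7, 8]
```

So it fits when two groups are all that is needed, but not when each run between separators must stay separate.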
Here are benchmarks aggregating the answers (I'll not be accepting this answer):
require 'benchmark'
a = *(1..5000); N = 1000
Benchmark.bmbm do |x|
  %w[ split_with_inject split_with_inject_no_tap split_with_each
      split_with_chunk split_with_chunk2 split_with_chunk3 ].each do |method|
    x.report( method ){ N.times{ a.send(method){ |i| i%3==0 || i%5==0 } } }
  end
end
#=>                                user     system      total        real
#=> split_with_inject          1.857000   0.015000   1.872000 (  1.879188)
#=> split_with_inject_no_tap   1.357000   0.000000   1.357000 (  1.353135)
#=> split_with_each            1.123000   0.000000   1.123000 (  1.123113)
#=> split_with_chunk           3.962000   0.000000   3.962000 (  3.984398)
#=> split_with_chunk2          3.682000   0.000000   3.682000 (  3.687369)
#=> split_with_chunk3          2.278000   0.000000   2.278000 (  2.281228)
The implementations being tested (on Ruby 1.9.2):
class Array
  def split_with_inject
    inject([[]]) do |a,v|
      a.tap{ yield(v) ? (a << []) : (a.last << v) }
    end.tap{ |a| a.pop if a.last.empty? }
  end
  def split_with_inject_no_tap
    result = inject([[]]) do |a,v|
      yield(v) ? (a << []) : (a.last << v)
      a
    end
    result.pop if result.last.empty?
    result
  end
  def split_with_each
    result = [a=[]]
    each{ |o| yield(o) ? (result << a=[]) : (a << o) }
    result.pop if a.empty?
    result
  end
  def split_with_chunk
    chunk{ |o| !!yield(o) }.reject{ |b,a| b }.map{ |b,a| a }
  end
  def split_with_chunk2
    chunk{ |o| !!yield(o) }.map{ |b,a| b ? nil : a }.compact
  end
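  def split_with_chunk3
    # Caution: chunk drops elements whose key is nil, so as written this keeps
    # the runs of separators; use `!yield(o) || nil` to keep the runs between them.
    chunk{ |o| yield(o) || nil }.map{ |b,a| b && a }.compact
  end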
end
Other Enumerable methods you might want to consider are each_slice and each_cons.
I don't know how general you want it to be, but here's one way:
>> (1..9).each_slice(3) {|a| p a.size>1?a[0..-2]:a}
[1, 2]
[4, 5]
[7, 8]
=> nil
>> (1..10).each_slice(3) {|a| p a.size>1?a[0..-2]:a}
[1, 2]
[4, 5]
[7, 8]
[10]
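Since each_cons is mentioned but not shown, here is a brief sketch of how the two methods differ, followed by a variant of the each_slice approach that collects the sub-arrays instead of printing them. It matches the question's output only because the separators fall on every third element; that regularity is an assumption of this approach, and the variant tests for a full slice (size 3) rather than size > 1.

```ruby
# each_slice yields non-overlapping groups; each_cons yields a sliding window.
p((1..7).each_slice(3).to_a)  # prints [[1, 2, 3], [4, 5, 6], [7]]
p((1..5).each_cons(3).to_a)   # prints [[1, 2, 3], [2, 3, 4], [3, 4, 5]]

# Collect the result: drop the last element of every full slice (the
# multiple of 3) and keep a short trailing slice as it is.
groups = (1..10).each_slice(3).map { |a| a.size == 3 ? a[0..-2] : a }
p groups                      # prints [[1, 2], [4, 5], [7, 8], [10]]
```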
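Here is another one (with a benchmark comparing it to the fastest split_with_each from https://stackoverflow.com/a/4801483/410102). Note that split_with_each_2 sorts elements into a matching and a non-matching array, like partition, rather than splitting at separators: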
require 'benchmark'
class Array
  def split_with_each
    result = [a=[]]
    each{ |o| yield(o) ? (result << a=[]) : (a << o) }
    result.pop if a.empty?
    result
  end
  def split_with_each_2
    u, v = [], []
    each{ |x| (yield x) ? (u << x) : (v << x) }
    [u, v]
  end
end
a = *(1..5000); N = 1000
Benchmark.bmbm do |x|
  %w[ split_with_each split_with_each_2 ].each do |method|
    x.report( method ){ N.times{ a.send(method){ |i| i%3==0 || i%5==0 } } }
  end
end
                        user     system      total        real
split_with_each     2.730000   0.000000   2.730000 (  2.742135)
split_with_each_2   2.270000   0.040000   2.310000 (  2.309600)