I need to tell if an array contains all of the elements of another array with duplicates.
[1,2,3].contains_all? [1,2] #=> true
[1,2,3].contains_all? [1,2,2] #=> false (this is where (a1-a2).empty? fails)
[2,1,2,3].contains_all? [1,2,2] #=> true
So the first array must contain as many or equal of the number of each unique element in the second array.
This question answers it for those using an array as a set, but I need to control for duplicates.
Update: Benchmarks
On Ruby 1.9.3p194
def bench
puts Benchmark.measure {
10000.times do
[1,2,3].contains_all? [1,2]
[1,2,3].contains_all? [1,2,2]
[2,1,2,3].contains_all? [1,2,2]
end
}
end
Results in:
Rohit 0.100000 0.000000 0.100000 ( 0.104486)
Chris 0.040000 0.000000 0.040000 ( 0.040178)
Sergio 0.160000 0.020000 0.180000 ( 0.173940)
sawa 0.030000 0.000000 0.030000 ( 0.032393)
Update 2: Larger Arrays
@a1 = (1..10000).to_a
@a2 = (1..1000).to_a
@a3 = (1..2000).to_a
def bench
puts Benchmark.measure {
1000.times do
@a1.contains_all? @a2
@a1.contains_all? @a3
@a3.contains_all? @a2
end
}
end
Results in:
Rohit 9.750000 0.410000 10.160000 ( 10.158182)
Chris 10.250000 0.180000 10.430000 ( 10.433797)
Sergio 14.570000 0.070000 14.640000 ( 14.637870)
sawa 3.460000 0.020000 3.480000 ( 3.475513)
If you can't find a method, you can build one using ruby's include? method.
Official documentation: http://www.ruby-doc.org/core-1.9.3/Array.html#method-i-include-3F
Usage:
Then, you can do a loop:
It seems you need a multiset. Check out this gem, I think it does what you need.
You can use is and do something like (if the intersection is equal to the second multiset then the first one includes all of its elements):
Here's a naive and straightforward implementation (not the most efficient one, likely). Just count the elements and compare both elements and their occurrence counts.
Counting the number of occurrences and comparing them seems to be the obvious way to go.
Answering with my own implementation, but definitely want to see if someone can come up with a more efficient way. (I won't accept my own answer)
And the tests:
This solution will only iterate through both lists once, and hence run in linear time. It might however be too much overhead if the lists are expected to be very small.