... or alternatively an Array which prevents duplicate entries.
Is there some kind of object in Ruby which:
- responds to [], []= and <<
- silently drops duplicate entries
- is Enumerable (or at least supports find_all)
- preserves the order in which entries were inserted
?
As far as I can tell, an Array supports points 1, 3 and 4; while a Set supports 1, 2 and 3 (but not 4). And a SortedSet won't do, because my entries don't implement <=>.
There isn't one as far as I know, and Set by its mathematical nature is meant to be unordered (or at least, implementationally, meant not to guarantee order - in fact its usually implemented as a hash table so it does mess up order).
However, it's not hard to either extend array directly or subclass it to do this. I just tried it out and this works:
class UniqueArray < Array
def initialize(*args)
if args.size == 1 and args[0].is_a? Array then
super(args[0].uniq)
else
super(*args)
end
end
def insert(i, v)
super(i, v) unless include?(v)
end
def <<(v)
super(v) unless include?(v)
end
def []=(*args)
# note: could just call super(*args) then uniq!, but this is faster
# there are three different versions of this call:
# 1. start, length, value
# 2. index, value
# 3. range, value
# We just need to get the value
v = case args.size
when 3 then args[2]
when 2 then args[1]
else nil
end
super(*args) if v.nil? or not include?(v)
end
end
Seems to cover all the bases. I used OReilly's handy Ruby Cookbook as a reference - they have a recipe for "Making sure a sorted array stays sorted" which is similar.
As of Ruby 1.9, the built-in Hash
object preserves insertion order. For example:
h = {}
h[:z] = 1
h[:b] = 2
h[:a] = 3
h[:x] = 0
p h.keys #=> [:z, :b, :a, :x]
h.delete :b
p h.keys #=> [:z, :a, :x]
h[:b] = 1
p h.keys #=> [:z, :a, :x, :b]
So, you can set any value (like a simple true
) for any key and you now have an ordered set. You can test for a key using either h.key?(obj)
or, if you always set each key to have a truthy value, just h[obj]
. To remove a key, use h.delete(obj)
. To convert the ordered set to an array, use h.keys
.
Because the Ruby 1.9 Set
library happens to be built upon a Hash currently, you can currently use Set
as an ordered set. (For example, the to_a
method's implementation is just @hash.keys
.) Note, however, that this behavior is not guaranteed by that library, and might change in the future.
require 'set'
s = Set[ :f, :o, :o, :b, :a, :r ] #=> #<Set: {:f, :o, :b, :a, :r}>
s << :z #=> #<Set: {:f, :o, :b, :a, :r, :z}>
s.delete :o #=> #<Set: {:f, :b, :a, :r, :z}>
s << :o #=> #<Set: {:f, :b, :a, :r, :z, :o}>
s << :o #=> #<Set: {:f, :b, :a, :r, :z, :o}>
s << :f #=> #<Set: {:f, :b, :a, :r, :z, :o}>
s.to_a #=> [:f, :b, :a, :r, :z, :o]
I like this solution although it requires active_support's OrderedHash
require 'active_support/ordered_hash'
class OrderedSet < Set
def initialize enum = nil, &block
@hash = ActiveSupport::OrderedHash.new
super
end
end
=)
You could use a Hash to store the values, and have an incrementing value stored in the value of each Hash pair. Then you can access the set in a sorted manner, albeit slowly, by accessing the objects via their values.
I'll try to add some code in here later to explain further.
I am aware accessing via values is much slower than by keys.
Update 1: In Ruby 1.9, Hash elements are iterated in their insertion order.
Not that I know of, but it wouldn't be hard to roll your own. Just subclass Array and use a Set to maintain your uniqueness constraint.
One question about silent dropping. How would this affect #[]=? If I was trying to overwrite an existing entry with something which was already stored elsewhere, should it remove the would-be-removed element anyway? I think either way could provide nasty surprises down the road.