... or alternatively an Array which prevents duplicate entries.
Is there some kind of object in Ruby which:
- responds to [], []= and <<
- silently drops duplicate entries
- is Enumerable (or at least supports find_all)
- preserves the order in which entries were inserted
As far as I can tell, an Array supports points 1, 3 and 4; while a Set supports 1, 2 and 3 (but not 4). And a SortedSet won't do, because my entries don't implement <=>.
There isn't one as far as I know, and Set by its mathematical nature is meant to be unordered (or at least, implementationally, meant not to guarantee order - in fact its usually implemented as a hash table so it does mess up order).
However, it's not hard to either extend array directly or subclass it to do this. I just tried it out and this works:
Seems to cover all the bases. I used OReilly's handy Ruby Cookbook as a reference - they have a recipe for "Making sure a sorted array stays sorted" which is similar.
As of Ruby 1.9, the built-in
Hash
object preserves insertion order. For example:So, you can set any value (like a simple
true
) for any key and you now have an ordered set. You can test for a key using eitherh.key?(obj)
or, if you always set each key to have a truthy value, justh[obj]
. To remove a key, useh.delete(obj)
. To convert the ordered set to an array, useh.keys
.Because the Ruby 1.9
Set
library happens to be built upon a Hash currently, you can currently useSet
as an ordered set. (For example, theto_a
method's implementation is just@hash.keys
.) Note, however, that this behavior is not guaranteed by that library, and might change in the future.You could use a Hash to store the values, and have an incrementing value stored in the value of each Hash pair. Then you can access the set in a sorted manner, albeit slowly, by accessing the objects via their values.
I'll try to add some code in here later to explain further.
I am aware accessing via values is much slower than by keys.
Update 1: In Ruby 1.9, Hash elements are iterated in their insertion order.
Not that I know of, but it wouldn't be hard to roll your own. Just subclass Array and use a Set to maintain your uniqueness constraint.
One question about silent dropping. How would this affect #[]=? If I was trying to overwrite an existing entry with something which was already stored elsewhere, should it remove the would-be-removed element anyway? I think either way could provide nasty surprises down the road.
I like this solution although it requires active_support's OrderedHash
=)