Here's an interesting one. I have a scenario in a bucket sharding system I'm writing where I maintain index hashes and storage hashes. The two are related by a UUID, which I generate because the system is distributed and I want some confidence that new buckets gain unique references.
Early on in this exercise I started optimising the code to freeze all keys generated by SecureRandom.uuid (which produces strings), because when you use a String as a key in a hash, and it is not already frozen, Ruby dups and freezes it automatically to ensure that it can't be changed.
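To make the behaviour described above concrete, here is a minimal sketch of it in isolation. The variable names (`stored_key`, `frozen_uuid`, `frozen_store`) are only for illustration and are not part of the project code.

```ruby
require 'securerandom'

uuid  = SecureRandom.uuid          # a new, unfrozen String
store = {}
store[uuid] = 1                    # the hash stores its own frozen copy of the key

stored_key = store.keys.first
puts uuid.frozen?                  # false: the original string is left untouched
puts stored_key.frozen?            # true: the key held by the hash is frozen
puts stored_key.equal?(uuid)       # false: the key is a different object

# A string that is already frozen is used as the key directly, with no copy.
frozen_uuid  = SecureRandom.uuid.freeze
frozen_store = {}
frozen_store[frozen_uuid] = 1
puts frozen_store.keys.first.equal?(frozen_uuid) # true: same object
```

This is why freezing a UUID before its first use as a key lets the store and the index share a single object.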
In most cases it's easy to do this aggressively, particularly for new UUIDs (in my project many such values need this treatment). In some cases, though, I have to approach a hash with a value passed over the network and then, to ensure consistent use of any strings already present as keys, use a rather awkward lookup mechanism to obtain the existing key object.
My goal, since I want this to maintain a huge data set across multiple nodes, is to reduce the overhead of key and index storage as much as possible. Because it's a bucketing system, the same UUID can be referenced many times, so it's helpful to use the same object reference each time.
Here's some code that demonstrates the issue in a simple(ish) form. I'm asking whether there's a more efficient and convenient mechanism for obtaining the pre-existing object reference for a key that has the same string value (for the key itself, not the value associated with it).
# Demonstrate the issue..
require 'securerandom'
index = Hash.new
store = Hash.new
key = 'meh'
value = 1
uuid = SecureRandom.uuid
puts "Ruby dups and freezes strings if used for keys in hashes"
puts "This produces different IDs"
store[uuid] = value
index[key] = uuid
store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}"}
index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }
puts
puts "If inconsistencies in ID occur then Ruby attempts to preserve the use of the frozen key so if it happens in one area take care"
puts "This produces different IDs"
uuid = uuid.freeze
store[uuid] = value
index[key] = uuid
store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}"}
index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }
puts
puts "If you start with a clean slate and a frozen key you can overcome it if you freeze the string before use"
puts "This is clean so far and produces the same object"
index = Hash.new
store = Hash.new
store[uuid] = value
index[key] = uuid
store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}"}
index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }
puts
puts "But if the same value for the key comes in (possibly remote) then it becomes awkward"
puts "This produces different IDs"
uuid = uuid.dup.freeze
store[uuid] = value
index[key] = uuid
store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}"}
index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }
puts
puts "So you get into oddities like this to ensure you standarise values put in to keys that already exist"
puts "This cleans up and produces same IDs but is a little awkward"
uuid = uuid.dup.freeze
uuid_list = store.keys
uuid = uuid_list[uuid_list.index(uuid)] if uuid_list.include?(uuid)
store[uuid] = value
index[key] = uuid
store.each_key { |x| puts "Store reference for value of #{x} #{x.object_id}"}
index.each_value { |x| puts "Index reference for #{x} #{x.object_id}" }
puts
Example run...
Ruby dups and freezes strings if used for keys in hashes
This produces different IDs
Store reference for value of bd48a581-95e9-452e-b8a3-602d92d47011 70209306325780
Index reference for bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880
If inconsistencies in ID occur then Ruby attempts to preserve the use of the frozen key so if it happens in one area take care
This produces different IDs
Store reference for value of bd48a581-95e9-452e-b8a3-602d92d47011 70209306325780
Index reference for bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880
If you start with a clean slate and a frozen key you can overcome it if you freeze the string before use
This is clean so far and produces the same object
Store reference for value of bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880
Index reference for bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880
But if the same value for the key comes in (possibly remote) then it becomes awkward
This produces different IDs
Store reference for value of bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880
Index reference for bd48a581-95e9-452e-b8a3-602d92d47011 70209306325000
So you get into oddities like this to ensure you standarise values put in to keys that already exist
This cleans up and produces same IDs but is a little awkward
Store reference for value of bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880
Index reference for bd48a581-95e9-452e-b8a3-602d92d47011 70209306325880
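For comparison with the `store.keys` lookup in the last step above, one possible alternative is sketched here: keep a separate hash that maps each string value to a single canonical frozen object, so the existing reference is found with a hash lookup rather than an array scan. `KeyPool` and `intern` are hypothetical names introduced for this sketch; they are not part of the code above or of Ruby's standard library, and this is only one way the lookup might be done.

```ruby
require 'securerandom'

# Hypothetical helper (an assumption for illustration, not from the original
# code): maps each distinct string value to one canonical frozen String.
class KeyPool
  def initialize
    @pool = {}
  end

  # Returns the pooled object equal to +str+, adding a frozen copy if absent.
  def intern(str)
    @pool.fetch(str) do
      canonical = str.frozen? ? str : str.dup.freeze
      # A frozen String is stored as the key as-is, so key and value
      # are the same object.
      @pool[canonical] = canonical
    end
  end
end

pool  = KeyPool.new
store = {}
index = {}

uuid = pool.intern(SecureRandom.uuid)
store[uuid] = 1

# The same value arriving again (possibly remote) as a different object.
incoming = pool.intern(uuid.dup)
index['meh'] = incoming

puts store.keys.first.equal?(index['meh']) # true: one shared object
```

The trade-off in this sketch is that the pool itself holds a reference to every UUID it has seen, so entries would need to be deleted from it when a bucket is dropped.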