I want to store DNA sequences of size n in the following data structure. Each hash can contain the keys C, G, A, and T, whose values are themselves hashes of exactly the same kind: four keys, C, G, A, and T, each with a hash value.
This structure is consistent for n levels of hashes. However, the hashes at the last level instead hold integer values, each representing the count of the sequence spelled out by the keys from level 1 to level n.
Take the data ('CG', 'CA', 'TT', 'CG'), in which the sequences CG, CA, and TT occur twice, once, and once, respectively. For this data, the depth would be 2.
This would produce the hash: %root = ( 'C' => { 'G' => 2, 'A' => 1 }, 'T' => { 'T' => 1 } )
How would one create this hash from the data?
The following should work:
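A minimal sketch of one way to build the structure, assuming every sequence is a non-empty string over C, G, A, and T. The names @data, %root, @bases, and $node are illustrative choices, not taken from the original post. It walks down one level per base, creating intermediate hashes as needed, and increments the counter stored under the final base.

```perl
use strict;
use warnings;
use Data::Dumper;

my @data = ('CG', 'CA', 'TT', 'CG');
my %root;

for my $seq (@data) {
    next unless length $seq;            # assumption: skip empty sequences

    my @bases = split //, $seq;         # one key per level
    my $last  = pop @bases;             # the final base indexes the count
    my $node  = \%root;

    # Descend one level per remaining base, creating missing hashes on the way.
    $node = $node->{$_} ||= {} for @bases;

    $node->{$last}++;                   # count this occurrence
}

$Data::Dumper::Sortkeys = 1;            # stable key order in the dump
print Dumper(\%root);
```

The depth is never stated explicitly; it simply follows from the length of each sequence, so all sequences should have the same length n.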
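Produces, for the sample data, the hash from the question: ( 'C' => { 'G' => 2, 'A' => 1 }, 'T' => { 'T' => 1 } )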
What you need is a function get_node($tree, 'C', 'G') that returns a reference to the hash element for "CG". Then you can just increment the referenced scalar.
The thing is, this function already exists as Data::Diver's DiveRef.
In both cases, printing the resulting tree gives the same nested hash of counts.
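A hand-rolled get_node might look like the sketch below. This is an assumed implementation, not necessarily the one originally posted: it follows the keys one at a time, always holding a reference to the current slot, so that intermediate hashes are autovivified and the caller gets back a reference to the final scalar.

```perl
use strict;
use warnings;

# Hypothetical helper: return a reference to the element reached by
# following @keys down from the hash reference $tree.
sub get_node {
    my ($tree, @keys) = @_;
    my $ref = \$tree;                    # reference to the scalar holding the root
    $ref = \( $$ref->{$_} ) for @keys;   # step down; missing levels are autovivified
    return $ref;
}

my %tree;

# Increment the counter slot for each sequence.
++${ get_node(\%tree, split //, $_) } for qw(CG CA TT CG);
```

With Data::Diver installed, the same loop body could presumably be written as ++${ DiveRef(\%tree, split //, $_) } after importing DiveRef from the module; check its documentation for the exact import and for how it treats numeric-looking keys.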