I'm currently learning Go by doing the Rosalind problems (basically a bunch of bioinformatics-related code katas).
I'm currently representing a DNA strand with a type:
type DNAStrand struct {
	dna []byte
}
My initial reason was to encapsulate the byte slice so I would know it only contained bytes representing the nucleotides: 'A', 'C', 'G', 'T'. I realized that this was obviously not guaranteed, since I could simply do:
DNAStrand{[]byte("foo bar")}
And there is no longer any guarantee that my DNA strand contains a byte slice with only elements from those four bytes.
Since my struct only contains a byte slice, is it better/more idiomatic to do:
type DNAStrand []byte
Or is it better to let the type contain the dna strand? Are there any rules of thumb for when to use either of the two approaches?
I'd use
type DNAStrand []byte
because it's simple, and because I can use regexps on it. I'd probably use an initialisation function that checks that every byte is in ACGT though.

Structs with zero fields are handy. Structs with many fields are even handier. Structs with exactly one field are a bit special, and I can't think of a reasonably "good" case for using them, even though they are seen regularly "in the wild". I, for one, don't use them.
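The checked initialisation function suggested for the plain slice type could look roughly like this. It is only a sketch: the name `NewDNAStrand` and the error message are assumptions, not something from the answers here.

```go
package main

import (
	"fmt"
	"regexp"
)

// DNAStrand is a named slice type. Its underlying type is []byte, so it can
// be passed directly to functions that take a []byte, such as regexp matching.
type DNAStrand []byte

// NewDNAStrand is a hypothetical initialisation function (the name is an
// assumption). It rejects any byte that is not one of A, C, G, T.
func NewDNAStrand(s string) (DNAStrand, error) {
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case 'A', 'C', 'G', 'T':
			// valid nucleotide, keep scanning
		default:
			return nil, fmt.Errorf("invalid nucleotide %q at index %d", s[i], i)
		}
	}
	// Converting the string allocates a fresh byte slice.
	return DNAStrand(s), nil
}

func main() {
	strand, err := NewDNAStrand("GATTACA")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// A DNAStrand is assignable to []byte, so regexps work without conversion.
	fmt.Println(regexp.MustCompile("TTA").Match(strand))

	_, err = NewDNAStrand("foo bar")
	fmt.Println(err)
}
```

Nothing stops a caller from skipping the function and writing `DNAStrand("foo bar")` directly, so this is a convention rather than a guarantee.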
Anyway, if you really really need tighter/bulletproof safety about the
DNAStrand
slice content, then it is possible to use the single-field struct and define an argument-checking setter method for this named type. In that case, if the definition is later used from some other package, there's no way, modulo using package unsafe, to circumvent the checks and get a result equivalent to your
DNAStrand{[]byte("foo bar")}
example.

Taking your specific example, I would probably do something like this:
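A minimal sketch of that idea follows. The names `nucleotide`, `New` and `String` are assumptions made for illustration, and in practice the declarations would live in their own package (say, a hypothetical `dna` package) rather than in `main`, since the unexported type only restricts code outside the package.

```go
package main

import "fmt"

// nucleotide is unexported, so code in other packages cannot name the type
// or declare variables of it.
type nucleotide byte

// The exported constants are the intended set of nucleotide values.
const (
	A nucleotide = 'A'
	C nucleotide = 'C'
	G nucleotide = 'G'
	T nucleotide = 'T'
)

// DNAStrand is a sequence of nucleotides.
type DNAStrand []nucleotide

// New builds a strand from the given nucleotides (hypothetical constructor).
// It copies its arguments so the caller's slice is not shared.
func New(nts ...nucleotide) DNAStrand {
	return append(DNAStrand(nil), nts...)
}

// String renders the strand as text.
func (d DNAStrand) String() string {
	b := make([]byte, len(d))
	for i, n := range d {
		b[i] = byte(n)
	}
	return string(b)
}

func main() {
	strand := New(G, A, T, T, A, C, A)
	fmt.Println(strand)
}
```

One caveat: untyped constants still convert implicitly to the unexported type, so a call like `New('X')` compiles even from another package. If that matters, `New` can validate its arguments as well.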
Since the nucleotide type is not exported, users can't construct their own. You provide the only allowed instances in the exported consts, so no user can supply new nucleotides of their own.
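For comparison, the single-field struct with an argument-checking setter mentioned earlier might be sketched as below. The method names `Set` and `Bytes` and the error value are assumptions, not part of any answer here.

```go
package main

import (
	"errors"
	"fmt"
)

// DNAStrand keeps its bytes in an unexported field, so other packages cannot
// fill it with a composite literal such as DNAStrand{[]byte("foo bar")}.
type DNAStrand struct {
	dna []byte
}

// ErrInvalidNucleotide is a hypothetical sentinel error.
var ErrInvalidNucleotide = errors.New("dna: byte is not one of A, C, G, T")

// Set validates b and stores a private copy. On error the strand is unchanged.
func (s *DNAStrand) Set(b []byte) error {
	for _, c := range b {
		switch c {
		case 'A', 'C', 'G', 'T':
			// valid nucleotide
		default:
			return ErrInvalidNucleotide
		}
	}
	s.dna = append([]byte(nil), b...)
	return nil
}

// Bytes returns a copy, so callers cannot mutate the validated contents.
func (s *DNAStrand) Bytes() []byte {
	return append([]byte(nil), s.dna...)
}

func main() {
	var s DNAStrand
	fmt.Println(s.Set([]byte("foo bar")))
	fmt.Println(s.Set([]byte("ACGT")), string(s.Bytes()))
}
```

The copies in `Set` and `Bytes` are what keep the check meaningful: without them a caller could hold on to the slice and change its bytes after validation.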