According to this Go Data Structures article, in the Strings section, taking a slice of a string keeps the original string in memory.
"(As an aside, there is a well-known gotcha in Java and other languages that when you slice a string to save a small piece, the reference to the original keeps the entire original string in memory even though only a small amount is still needed. Go has this gotcha too. The alternative, which we tried and rejected, is to make string slicing so expensive—an allocation and a copy—that most programs avoid it.)"
So if we have a very long string:
s := "Some very long string..."
And we take a small slice:
newS := s[5:9]
The original s will not be released until we also release newS. Considering this, what is the proper approach to take if we need to keep newS long term, but release s for garbage collection?
I thought maybe this:
newS := string([]byte(s[5:9]))
But I wasn't certain if that would actually work, or if there's a better way.
Yes, converting to a slice of bytes creates a copy of the string data, so the new string no longer references the original one, which can then be garbage collected somewhere down the line.
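For illustration, here is a minimal sketch of that copy. The helper names copySubstring and cloneSubstring are hypothetical and not from the question. strings.Clone is in the standard library since Go 1.18 and is documented to return a fresh copy. The double conversion copies with the current gc compiler, but as far as I know the language spec does not promise an allocation.

```go
package main

import (
	"fmt"
	"strings"
)

// copySubstring (hypothetical helper) returns s[start:end] as a string with
// its own backing memory, using the conversion from the question.
func copySubstring(s string, start, end int) string {
	return string([]byte(s[start:end]))
}

// cloneSubstring (hypothetical helper) does the same with strings.Clone,
// which requires Go 1.18 or later.
func cloneSubstring(s string, start, end int) string {
	return strings.Clone(s[start:end])
}

func main() {
	s := "Some very long string..."
	fmt.Println(copySubstring(s, 5, 9))  // very
	fmt.Println(cloneSubstring(s, 5, 9)) // very
}
```

Either result can be kept long term without pinning s.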
As a "proof" of this (well, it proves that the slice of bytes doesn't share the same underlying data as the original string):
http://play.golang.org/p/pwGrlETibj
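In case the playground link is unavailable, the same check can be sketched roughly as follows. This is my own reconstruction, not the code behind the link. It assumes Go 1.20 or later for unsafe.StringData, and sameStart is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sameStart (hypothetical helper) reports whether two non-empty strings
// begin at the same address, i.e. whether they point into the same data.
func sameStart(a, b string) bool {
	return unsafe.StringData(a) == unsafe.StringData(b)
}

func main() {
	s := "Some very long string..."

	sliced := s[5:9]                 // plain slice of s
	copied := string([]byte(s[5:9])) // copy via []byte, as in the question

	// Compare each result against the address of s at offset 5.
	fmt.Println("sliced shares data with s:", sameStart(sliced, s[5:]))
	fmt.Println("copied shares data with s:", sameStart(copied, s[5:]))
}
```

With the standard gc toolchain I would expect the first line to report true and the second false.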
Edit: and proof that the slice of bytes is sized for the substring only (in other words, its capacity is not tied to the length of the original string):
http://play.golang.org/p/3pwZtCgtWv
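A rough stand-in for that check, again my own sketch rather than the linked code. The helper name substringBytes is hypothetical. Note that the runtime may round the capacity up a little, so the sketch only shows that it does not scale with the original string.

```go
package main

import "fmt"

// substringBytes (hypothetical helper) converts a substring to a byte slice.
func substringBytes(s string, start, end int) []byte {
	return []byte(s[start:end])
}

func main() {
	// A 1 MiB string built at run time, standing in for "a very long string".
	s := string(make([]byte, 1<<20))

	b := substringBytes(s, 5, 9)
	fmt.Println("len(s):", len(s))
	fmt.Println("len(b):", len(b))
	// The capacity may be slightly larger than the length, but it is
	// unrelated to len(s).
	fmt.Println("cap(b):", cap(b))
}
```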
Edit2: And you can clearly see what happens with memory profiling. In reuseString(), the memory used is very stable. In copyString(), it grows fast, showing the copies of the string made by the []byte conversion.
http://play.golang.org/p/kDRjePCkXq
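A simplified sketch of that kind of measurement follows. The names reuseString and copyString come from the text above, but their bodies here are my assumption of what they do, and allocatedBy is a hypothetical helper. It compares cumulative heap allocation using runtime.ReadMemStats instead of a full profile.

```go
package main

import (
	"fmt"
	"runtime"
)

// kept holds the substrings at package level so they stay reachable.
var kept []string

// reuseString stores n plain slices of s; no string data is copied.
func reuseString(s string, n int) {
	kept = make([]string, 0, n)
	for i := 0; i < n; i++ {
		kept = append(kept, s[5:9])
	}
}

// copyString stores n copies of the same substring, each made through
// the []byte conversion.
func copyString(s string, n int) {
	kept = make([]string, 0, n)
	for i := 0; i < n; i++ {
		kept = append(kept, string([]byte(s[5:9])))
	}
}

// allocatedBy (hypothetical helper) returns how many heap bytes were
// allocated while f ran, based on the cumulative TotalAlloc counter.
func allocatedBy(f func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	f()
	runtime.ReadMemStats(&after)
	return after.TotalAlloc - before.TotalAlloc
}

func main() {
	s := string(make([]byte, 1<<10))
	const n = 100000
	fmt.Println("reuseString allocated bytes:", allocatedBy(func() { reuseString(s, n) }))
	fmt.Println("copyString allocated bytes: ", allocatedBy(func() { copyString(s, n) }))
}
```

Both functions pay for the kept slice itself; copyString additionally pays for every copied substring.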
The proper way to make the original string eligible for garbage collection after slicing it and keeping the slice "live" is to create a copy of the slice and keep the copy "live" instead. But now one is buying better memory performance at the cost of worse time performance. That might be good in one place and evil in another. Sometimes only proper measurements, not guessing, will tell where the real gain is.
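As a very rough way to get a first measurement of the time side of that trade-off, something like the following could be used. This is only a sketch with hypothetical names (sink, timeSlicing, timeCopying). For serious numbers the usual tool is a benchmark run with go test -bench.

```go
package main

import (
	"fmt"
	"time"
)

// sink receives every result so the work inside the loops is not discarded.
var sink string

// timeSlicing measures n plain slicing operations on s.
func timeSlicing(s string, n int) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		sink = s[5:9]
	}
	return time.Since(start)
}

// timeCopying measures n slice-and-copy operations on s.
func timeCopying(s string, n int) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		sink = string([]byte(s[5:9]))
	}
	return time.Since(start)
}

func main() {
	s := "Some very long string..."
	const n = 1000000
	fmt.Println("slicing:", timeSlicing(s, n))
	fmt.Println("copying:", timeCopying(s, n))
}
```

The absolute numbers depend heavily on the machine, the Go version, and the substring length, so they should be read as a hint only.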
I'm, for example, using StrPack, when I prefer a bit of evilness ;-)