What are the possible consequences of using unsafe

Posted 2019-05-29 04:50

The preferred way of converting []byte to string is this:

var b []byte
// fill b
s := string(b)

This code copies the byte slice, which can be a problem in situations where performance is important.
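A minimal sketch of what "copied" means here: after the ordinary conversion, later writes to the slice do not show up in the string. The helper name safeString is only for illustration and is not part of the question.

```go
package main

import "fmt"

// safeString (hypothetical helper) uses the ordinary conversion,
// which copies the bytes of b into the new string.
func safeString(b []byte) string {
	return string(b)
}

func main() {
	b := []byte("hi")
	s := safeString(b)
	b[0] = 'b' // mutate the slice after converting

	// s keeps its own copy of the bytes, so it is unaffected.
	fmt.Println(s, string(b)) // prints: hi bi
}
```

The copy is exactly what keeps s immutable, and it is also the cost the unsafe conversion tries to avoid.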

When performance is critical, one can consider performing the unsafe conversion:

var b []byte
// fill b
s := *(*string)(unsafe.Pointer(&b))

My question is: what can go wrong when using the unsafe conversion? I know that a string should be immutable, and that if we change b, s will change as well. But so what? Is that the only bad thing that can happen?

1 answer
我欲成王,谁敢阻挡
#2 · 2019-05-29 05:19

Modifying something that the language spec guarantees to be immutable is an act of treason.

Since the spec guarantees that strings are immutable, compilers are allowed to generate code that caches their values and performs other optimizations based on this. You can't change the value of a string in any normal way, and if you resort to dirty ways (like package unsafe) to do it anyway, you lose all the guarantees provided by the spec. If you keep using the modified strings, you may randomly run into "bugs" and unexpected behavior.

For example, if you use a string as a key in a map and change the string after putting it into the map, you might not be able to find the associated value using either the original or the modified value of the string (this is implementation dependent).

To demonstrate this, see this example:

m := map[string]int{}
b := []byte("hi")
s := *(*string)(unsafe.Pointer(&b))
m[s] = 999

fmt.Println("Before:", m)

b[0] = 'b'
fmt.Println("After:", m)

fmt.Println("But it's there:", m[s], m["bi"])

for i := 0; i < 1000; i++ {
    m[strconv.Itoa(i)] = i
}
fmt.Println("Now it's GONE:", m[s], m["bi"])
for k, v := range m {
    if k == "bi" {
        fmt.Println("But still there, just in a different bucket: ", k, v)
    }
}

Output (try it on the Go Playground):

Before: map[hi:999]
After: map[bi:<nil>]
But it's there: 999 999
Now it's GONE: 0 0
But still there, just in a different bucket:  bi 999

At first, we just see a weird result: a simple Println() is not able to find the value. It sees something (the key is found), but the value is displayed as nil, which is not even a valid value for the value type int (the zero value for int is 0).

If we grow the map (by adding 1000 elements), its internal data structure gets restructured. After this, we can't even find our value by explicitly asking for it with the appropriate key. It is still in the map, as we find it when iterating over all the key-value pairs. But since the hash code changes when the value of the string changes, the lookup most likely searches a different bucket than the one the entry is in (or the one it should be in).

Also note that code using package unsafe may work as you expect now, but the same code might behave completely differently (meaning it may break) with a future (or older) version of Go, as "packages that import unsafe may be non-portable and are not protected by the Go 1 compatibility guidelines".
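The cast used in the question relies on the string header and the slice header sharing their first two fields (data pointer and length), which is an implementation detail. As a hedged sketch, assuming Go 1.20 or newer, the same zero-copy conversion can be written with unsafe.String and unsafe.SliceData, which do not depend on the header layout. The name bytesToString is hypothetical. This does not remove the danger described in this answer: the bytes must still never be modified afterwards.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString (hypothetical helper) returns a string that shares memory
// with b instead of copying it. Assumes Go 1.20+ for unsafe.String and
// unsafe.SliceData. The caller must not modify b for as long as the
// returned string is in use.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	b := []byte("hi")
	s := bytesToString(b)
	fmt.Println(s, len(s)) // prints: hi 2
}
```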

You may also run into unexpected errors because the modified string might be used in different ways: some code might copy just the string header, other code might copy its content. See this example:

b := []byte{'h', 'i'}
s := *(*string)(unsafe.Pointer(&b))

s2 := s                 // Copy string header
s3 := string([]byte(s)) // New string header but same content
fmt.Println(s, s2, s3)
b[0] = 'b'

fmt.Println(s == s2)
fmt.Println(s == s3)

We created two new local variables from s: s2 is initialized by copying the string header of s, and s3 is initialized with a new string value (a new string header) holding a copy of the same content. In a correct program, after modifying the original s (here through b), you would expect comparing either new string to the original to give the same result, be it true or false (depending on whether values were cached, but it should be the same for both).

But the output is (try it on the Go Playground):

hi hi hi
true
false
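The difference between copying the header and copying the content can be made visible with a small sketch. It assumes Go 1.20 or newer (for unsafe.StringData; strings.Clone needs Go 1.18), and the helpers sameBacking and copyContent are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// sameBacking (hypothetical helper) reports whether a and b point at the
// same underlying bytes. Assumes Go 1.20+ for unsafe.StringData.
func sameBacking(a, b string) bool {
	return unsafe.StringData(a) == unsafe.StringData(b)
}

// copyContent (hypothetical helper) returns a string with the same content
// as s but stored in a fresh allocation.
func copyContent(s string) string {
	return strings.Clone(s)
}

func main() {
	s := string([]byte{'h', 'i'})

	s2 := s              // copies only the header: same pointer, same length
	s3 := copyContent(s) // copies the bytes into new memory

	fmt.Println(sameBacking(s, s2)) // true: s2 would follow any change to s's bytes
	fmt.Println(sameBacking(s, s3)) // false: s3 would keep the old content
}
```

This is why, in the example above, s2 follows the modification made through b while s3 still holds "hi".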