Google Go Lang Assignment Order

2020-02-29 19:16发布

问题:

Let's look at the following Go code:

package main

import "fmt"

type Vertex struct {
    Lat, Long float64
}

var m map[string]Vertex

func main() {
    m = make(map[string]Vertex)
    m["Bell Labs"] = Vertex{
        40.68433, 74.39967,
    }
    m["test"] = Vertex{
        12.0, 100,
    }
    fmt.Println(m["Bell Labs"])
    fmt.Println(m)
}

It outputs this:

{40.68433 74.39967}

map[Bell Labs:{40.68433 74.39967} test:{12 100}]

However, if I Change one minor part of the test vertex declaration, by moving the right "}" 4 spaces, like so:

m["test"] = Vertex{
    12.0, 100,
}

.. then the output changes to this:

{40.68433 74.39967}

map[test:{12 100} Bell Labs:{40.68433 74.39967}]

Why the heck does that little modification affect the order of my map?

回答1:

Map "order" depends on the hash function used. The hash function is randomized to prevent denial of service attacks that use hash collisions. See the issue tracker for details:

http://code.google.com/p/go/issues/detail?id=2630

Map order is not guaranteed according to the specification. Although not done in current go implementations, a future implementation could do some compacting during GC or other operation that changes the order of a map without the map being modified by your code. It is unwise to assume a property not defined in the specification.

A map is an unordered group of elements of one type, called the element type, indexed by a set of unique keys of another type, called the key type.



回答2:

A map shouldn't always print its key-element in any fixed order:
See "Go: what determines the iteration order for map keys?"

However, in the newest Go weekly release (and in Go1 which may be expected to be released this month), the iteration order is randomized (it starts at a pseudo-randomly chosen key, and the hashcode computation is seeded with a pseudo-random number).
If you compile your program with the weekly release (and with Go1), the iteration order will be different each time you run your program.

It isn't exactly spelled out like that in the spec though (Ref Map Type):

A map is an unordered group of elements of one type, called the element type, indexed by a set of unique keys of another type, called the key type.

Actually, the specs do spell it out, but in the For statement section:

The iteration order over maps is not specified and is not guaranteed to be the same from one iteration to the next.

  • If map entries that have not yet been reached are deleted during iteration, the corresponding iteration values will not be produced.
  • If map entries are inserted during iteration, the behavior is implementation-dependent, but the iteration values for each entry will be produced at most once.
  • If the map is nil, the number of iterations is 0.

This has been introduced by code review 5285042 in October 2011:

runtime: random offset for map iteration


The go-nuts thread points out:

The reason that "it's there to stop people doing bad stuff" seemed particularly weak.
Avoiding malicious hash collisions makes a lot more sense.
Further the pointer to the code makes it possible in development to revert that behaviour in the case that there is an intermittent bug that's difficult to work through.

To which Patrick Mylund Nielsen replies:

Dan's note was actually the main argument why Python devs were reluctant to adopt hash IV randomization--it broke their unit tests! PHP ultimately chose not to do it at all, and instead limited the size of the http.Request header, and Oracle and others didn't think it was a language problem at all.
Perl saw the problem and applied a fix similar to Go's which was included in Perl 5.8.1 in 2003.
I might be wrong, but I think they were the only ones to actually care then when this paper was presented: "Denial of Service via Algorithmic Complexity Attacks", attacking hash tables.


(Worst-case hash table collisions)

For others, this, which became very popular about a year ago, was a good motivator:
"28c3: Effective Denial of Service attacks against web application platforms (YouTube video, December 2011)", which shows how a common flaw in the implementation of most of the popular web programming languages and platforms (including PHP, ASP.NET, Java, etc.) can be (ab)used to force web application servers to use 99% of CPU for several minutes to hours for a single HTTP request.
This attack is mostly independent of the underlying web application and just relies on a common fact of how web application servers typically work..