Clojure Maps, order of key value creation

2019-06-24 03:03发布

问题:

In a clojure profiling library, there is a an atom storing some profiling information and it is modified by a number of functions. In the comments, they assert that the summary statistics (one of the functions) is called last for efficiency purposes, but I could not see how this was enforced in the actual code.

I then noticed that one manner in which this could enforced is the ordering in the construction of a map, so I recreated a simpler example to demonstrate.

(defn t []
  (let [a (atom {})]
    {:b (swap! a #(assoc % :data 'is-here))
     :c (keys @a)}))

Here, in agreement with the comment, which one of :b or :c comes first in the code will determine the value of a. Of course there must be some ordering, but is this behaviour guaranteed? It seems like it should not be, since an unordered hash has no ordering, and this could have detrimental implications for parallelism.

回答1:

I would consider the evaluation order of map literals an implementation detail and using non-constant expressions with side effects (e.g. setting an atom state) in map literals to be asking for trouble.

For well defined evaluation semantics, use a function call, where the arguments are guaranteed to be evaluated from left to right.

(let [i (atom 0)] 
  (hash-map :a (swap! i inc) :b (swap! i inc) :c (swap! i inc)))
;=> {:a 1, :c 3, :b 2}

Note that the keys are out of order since we used hash-map, but the values correspond to the keys in the order written.

Implementation details

The implementation details of map literals (in version 1.5) depends on several things:

  • The reader treats the unevaluated map literals as an ordered array-map for small maps (8 pairs or fewer) or an unordered hash-map for larger maps.

  • The compiler-evaluator parses the map expression provided by the reader as an unordered hash-map irrespective of size if the keys and values are constant expressions, but will evaluate it as an array-map. Otherwise, the compiler-evaluator evaluates in the key order provided by the reader, dependent on size.

An small example will make this a bit clearer (perhaps):

user=> {:a 1, :b 2, :c 3}
{:a 1, :c 3, :b 2}
user=> (type *1)
clojure.lang.PersistentArrayMap

user=> (def m1 {:a 1, :b 2, :c 3})
#'user/m1
user=> m1
{:a 1, :c 3, :b 2}
user=> (type m1)
clojure.lang.PersistentHashMap
user=> (eval m1)
{:a 1, :c 3, :b 2}
user=> (type *1)
clojure.lang.PersistentArrayMap

But...

user=> (def m2 {:a ((fn [] 1)), :b 2, :c 3})
#'user/m2
user=> m2
{:a 1, :b 2, :c 3}
user=> (type m2)
clojure.lang.PersistentArrayMap


回答2:

I believe, as long as the {...} form remains equal to or less than 8 map entires (key/value pairs), the order will be maintained. A {...} will become a PersistentArrayMap with 8 or fewer entries, and a PersistentHashMap otherwise. The former will preserve the given order of its map entries, a guarantee not extended to the latter. Source.

As for parallelism, I think the following simple test shows that map entries will get evaluated in the order they are given, for maps created with 8 or fewer entries:

user=> (let [x (atom 0)]
         {:a (do (Thread/sleep 1000) (swap! x inc))
          :b (swap! x inc)})
{:a 1, :b 2}

But may evaluate in some other order for maps created with greater than 8 entries:

user=> (let [x (atom 0)]
         {:a (do (Thread/sleep 1000) (swap! x inc))
          :b (swap! x inc)
          :c (swap! x inc)
          :d (swap! x inc)
          :e (swap! x inc)
          :f (swap! x inc)
          :g (swap! x inc)
          :h (swap! x inc)
          :i (swap! x inc)})
{:a 1, :c 2, :b 3, :f 4, :g 5, :d 6, :e 7, :i 8, :h 9}

Though, at least for hash-maps at 9 entries, it doesn't seem like any parallelism is going on during their instantation. That's interesting.