I have the following code which increments the first element of every pair in a vector:
(vec (map (fn [[key value]] [(inc key) value]) [[0 :a] [1 :b]]))
However, I fear this code is inelegant, as it first creates a sequence using map and then converts it back to a vector.
Consider this analog:
(into [] (map (fn [[key value]] [(inc key) value]) [[0 :a] [1 :b]]))
On #clojure@irc.freenode.net I was told that using the code above is bad, because into expands into (reduce conj [] (map-indexed ...)), which produces many intermediate objects in the process. Then I was told that into actually doesn't expand into (reduce conj ...) and uses transients when it can. Also, measuring elapsed time showed that into is actually faster than vec.
So my questions are:
- What is the proper way to use map over vectors?
- What happens underneath when I use vec and into with vectors?
Related but not duplicate questions:
- Clojure: sequence back to vector
- How To Turn a Reduce-Realized Sequence Back Into Lazy Vector Sequence
Actually, as of Clojure 1.4.0 the preferred way of doing this is to use mapv, which is like map except that its return value is a vector. It is by far the most efficient approach, with no unnecessary intermediate allocations at all.
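Applied to the example from the question, that could look like the sketch below. The names pairs, inc-first and result are only illustrative and do not come from the question.

```clojure
;; The input from the question: a vector of [key value] pairs.
(def pairs [[0 :a] [1 :b]])

;; Illustrative helper: increment the first element of a pair.
(defn inc-first [[k v]]
  [(inc k) v])

;; mapv (Clojure 1.4.0+) returns a vector directly,
;; so no separate vec or into [] step is needed.
(def result (mapv inc-first pairs))
;; result is [[1 :a] [2 :b]]
```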
Clojure 1.5.0 will bring a new reducers library which will provide a generic way to map, filter, take, drop etc. while creating vectors, usable with into []. You can play with it in the 1.5.0 alphas and in the recent tagged releases of ClojureScript.
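A minimal sketch of the reducers approach, assuming a Clojure version in which the clojure.core.reducers namespace is available (1.5.0 or later). The names pairs and result are illustrative.

```clojure
(require '[clojure.core.reducers :as r])

(def pairs [[0 :a] [1 :b] [2 :c]])

;; r/map and r/filter build a reducible recipe rather than a sequence;
;; into [] then reduces it straight into a vector.
(def result
  (into []
        (r/filter (fn [[k _]] (even? k))
                  (r/map (fn [[k v]] [(inc k) v]) pairs))))
;; result is [[2 :b]]
```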
As for (vec some-seq) and (into [] some-seq), the first ultimately delegates to a Java loop which pours some-seq into an empty transient vector, while the second does the same thing in very efficient Clojure code. In both cases there are some initial checks involved to determine which approach to take when constructing the final return value.
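The "pour into a transient vector" idea can be sketched roughly as follows. This is a simplified illustration, not the actual source of vec or into, and pour-into-vector is a hypothetical name.

```clojure
;; Hypothetical helper illustrating the idea only:
;; start from an empty transient vector, conj! every element into it,
;; then freeze the result into a persistent vector.
(defn pour-into-vector [some-seq]
  (persistent! (reduce conj! (transient []) some-seq)))

(def some-seq (map inc (range 5)))

;; All three produce the same vector for a sequence input.
(def via-sketch (pour-into-vector some-seq))
(def via-vec    (vec some-seq))
(def via-into   (into [] some-seq))
```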
vec and into [] differ significantly for Java arrays of small length (up to 32): the first will alias the array (use it as the tail of the newly created vector) and demands that the array not be modified subsequently, lest the contents of the vector change (see the docstring); the second creates a new vector with a new tail and doesn't care about future changes to the array.
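A small experiment along these lines can make the difference visible. It is only a sketch: the names arr, aliased and copied are illustrative, and whether the aliased vector reflects the change depends on the Clojure version and array type, so no output is claimed for it here.

```clojure
;; A small Java object array (length well under 32).
(def arr (object-array [1 2 3]))

(def aliased (vec arr))      ; may share storage with arr, per the vec docstring
(def copied  (into [] arr))  ; built element by element into a new vector

;; Mutate the array after both vectors were created.
(aset arr 0 :changed)

;; copied is unaffected and is still [1 2 3].
;; Inspect aliased at the REPL to see whether the mutation leaked through.
```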