I'm curious about the performance characteristics of joined() and .flatMap(_:) in flattening a multidimensional array:
let array = [[1,2,3],[4,5,6],[7,8,9]]
let j = Array(array.joined())
let f = array.flatMap{$0}
They both flatten the nested array into [1, 2, 3, 4, 5, 6, 7, 8, 9]. Should I prefer one over the other for performance? Also, is there a more readable way to write the calls?
TL;DR

When it comes just to flattening 2D arrays (without any transformations or separators applied, see @dfri's answer for more info about that aspect), array.flatMap{$0} and Array(array.joined()) are both conceptually the same and have similar performance.

The main difference between flatMap(_:) and joined() (note that this isn't a new method, it has just been renamed from flatten()) is that joined() is always lazily applied (for arrays, it returns a special FlattenBidirectionalCollection<Base>).

Therefore, in terms of performance, it makes sense to use joined() over flatMap(_:) in situations where you only want to iterate over part of a flattened sequence (without applying any transformations). For example, consider searching the result of joined() with contains(_:) for the element 8, where 8 first appears in the second inner array. Because joined() is lazily applied and contains(_:) stops iterating upon finding a match, only the first two inner arrays have to be 'flattened' to find the element 8 in the 2D array.

Although, as @dfri correctly notes below, you are also able to lazily apply flatMap(_:) through the use of a LazySequence/LazyCollection, which can be created through the lazy property. This would be ideal for lazily applying both a transformation and flattening to a given 2D sequence.

In cases where
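joined() is iterated fully through, it is conceptually no different from using flatMap{$0}. Therefore, array.joined().map{$0}, Array(array.joined()) and array.flatMap{$0} are all valid (and conceptually identical) ways of flattening a 2D array.

In terms of performance, flatMap(_:) is documented as having a time complexity of O(m + n), where m is the length of the sequence and n is the length of the result. This is because its implementation simply creates an empty result array and appends the contents of each transformed element to it with append(contentsOf:).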
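As a rough sketch of that append-based approach (this is not the standard library source; the name sketchFlatMap is hypothetical and chosen only for illustration), the logic looks something like this:

```swift
extension Sequence {
    // Hypothetical stand-in for flatMap(_:); `sketchFlatMap` is an assumed
    // name for illustration, not a standard library method.
    func sketchFlatMap<Segment: Sequence>(
        _ transform: (Iterator.Element) throws -> Segment
    ) rethrows -> [Segment.Iterator.Element] {
        var result: [Segment.Iterator.Element] = []
        for element in self {
            // Each append costs O(k), where k is the length of the segment.
            result.append(contentsOf: try transform(element))
        }
        return result
    }
}

let array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
let flattened = array.sketchFlatMap { $0 }
print(flattened) // prints "[1, 2, 3, 4, 5, 6, 7, 8, 9]"
```

The loop visits each of the outer elements once and each append is linear in the length of the appended segment, which is where the two terms of the complexity come from.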
As append(contentsOf:) has a time complexity of O(n), where n is the length of the sequence to append, we get an overall time complexity of O(m + n), where m is the total length of all sequences appended and n is the length of the 2D sequence.

When it comes to joined(), there is no documented time complexity, as it is lazily applied. However, the main bit of source code to consider is the implementation of FlattenIterator, which is used to iterate over the flattened contents of a 2D sequence (which will occur upon using map(_:) or the Array(_:) initialiser with joined()).

In that implementation, _base is the base 2D sequence, _inner is the current iterator from one of the inner sequences, and _fastPath and _slowPath are hints to the compiler to aid with branch prediction.

Assuming I'm interpreting this code correctly and the full sequence is iterated through, this also has a time complexity of O(m + n), where m is the length of the sequence and n is the length of the result. This is because it goes through the outer iterator and each inner iterator to get the flattened elements.
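To make that iteration pattern concrete, here is a simplified sketch of a flattening iterator. It is not the standard library's FlattenIterator: the type SketchFlattenIterator, its property names, the innerSequencesStarted counter and the sample data are all assumptions made for illustration, and the branch prediction hints are left out.

```swift
// Hypothetical, simplified flattening iterator (not the real FlattenIterator).
struct SketchFlattenIterator<Outer: IteratorProtocol>: IteratorProtocol
    where Outer.Element: Sequence {

    var base: Outer                           // iterator over the outer (2D) sequence
    var inner: Outer.Element.Iterator? = nil  // iterator over the current inner sequence
    var innerSequencesStarted = 0             // how many inner sequences were pulled so far

    init(_ base: Outer) { self.base = base }

    mutating func next() -> Outer.Element.Iterator.Element? {
        while true {
            // Common case: the current inner iterator still has elements.
            if let element = inner?.next() { return element }
            // Otherwise move on to the next inner sequence, or finish.
            guard let nextInner = base.next() else { return nil }
            inner = nextInner.makeIterator()
            innerSequencesStarted += 1
        }
    }
}

// Illustrative data: 8 first appears in the second inner array.
let array2D = [[2, 3], [8, 10], [9, 5], [4, 8]]
var iterator = SketchFlattenIterator(array2D.makeIterator())
var found = false
while let element = iterator.next() {
    if element == 8 { found = true; break }
}
print(found, iterator.innerSequencesStarted) // prints "true 2"
```

Stopping at the first match leaves the remaining inner arrays untouched, which is the benefit of lazy flattening for partial iteration. Iterating to the end instead visits every outer and inner element once.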
So, performance-wise, Array(array.joined()) and array.flatMap{$0} both have the same time complexity.

If we run a quick benchmark in a debug build (Swift 3.1), flatMap(_:) appears to be the fastest. I suspect that joined() being slower could be due to the branching that occurs within the FlattenIterator (although the hints to the compiler minimise this cost), although just why map(_:) is so slow, I'm not too sure. I would certainly be interested to know if anyone else knows more about this.

However, in an optimised build, the compiler is able to optimise away this big performance difference, giving all three options comparable speed, although flatMap(_:) is still fastest by a fraction of a second.

(Note that the order in which the tests are performed can affect the results. Both of the above results are an average from performing the tests in the various different orders.)
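For anyone wanting to repeat a comparison like this, a minimal timing harness might look like the sketch below. The measure helper, the input sizes and the variable names are assumptions made for illustration, not the benchmark used above, and a single wall-clock run like this is noisy, so run it several times in both debug and optimised builds.

```swift
import Foundation

// Hypothetical helper: runs `body` once and returns elapsed wall-clock seconds.
func measure(_ body: () -> Void) -> TimeInterval {
    let start = Date()
    body()
    return Date().timeIntervalSince(start)
}

// Illustrative input; the sizes are arbitrary assumptions.
let nested = Array(repeating: Array(0..<100), count: 1_000)

var viaFlatMap: [Int] = []
var viaArrayInit: [Int] = []
var viaMap: [Int] = []

let flatMapTime = measure { viaFlatMap = nested.flatMap { $0 } }
let arrayInitTime = measure { viaArrayInit = Array(nested.joined()) }
let mapTime = measure { viaMap = nested.joined().map { $0 } }

print("flatMap{$0}:        \(flatMapTime) s")
print("Array(joined()):    \(arrayInitTime) s")
print("joined().map{$0}:   \(mapTime) s")
```

Storing each result in a variable keeps the optimiser from discarding the work being timed, and shuffling the order of the three measurements helps with the ordering effect mentioned above.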
From the Swiftdoc.org documentation of Array (Swift 3.0/dev) we read [emphasis mine]:

We may also take a look at the actual implementations of the two in the Swift source code (from which Swiftdoc is generated ...)
Most notably the latter source file, where the flatMap implementations in which the closure used (transform) does not yield an optional value (as is the case here) are all described as:

From the above (assuming the compiler can be clever w.r.t. a simple { $0 } transform over self), it would seem that, performance-wise, the two alternatives should be equivalent, but joined does, imo, better show the intent of the operation.

In addition to intent in semantics, there is one apparent use case where
joined is preferable over (and not entirely comparable to) flatMap: using joined(separator:) to join sequences with a separator.

The corresponding result using flatMap is not really as neat, as we explicitly need to remove the final additional separator after the flatMap operation (two different use cases, with or without trailing separator).

See also a somewhat outdated post of Erica Sadun discussing flatMap vs. flatten() (note: joined() was named flatten() in Swift < 3).
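The separator use case mentioned above can be sketched as follows. The separator value [0, 0] and the variable names are arbitrary choices for illustration:

```swift
let array = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

// joined(separator:) places the separator only between the inner arrays.
let viaJoined = Array(array.joined(separator: [0, 0]))
print(viaJoined) // prints "[1, 2, 3, 0, 0, 4, 5, 6, 0, 0, 7, 8, 9]"

// With flatMap, the separator is appended after every inner array,
// including the last one...
let withTrailing: [Int] = array.flatMap { $0 + [0, 0] }

// ...so the trailing separator has to be dropped explicitly.
let viaFlatMap = Array(withTrailing.dropLast(2))
print(viaFlatMap == viaJoined) // prints "true"
```

Note that the dropLast(2) step depends on the separator length, which is part of why the joined(separator:) form reads more clearly.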