In Scala 2.8, there is a method in the package object in scala.collection.package.scala:
def breakOut[From, T, To](implicit b : CanBuildFrom[Nothing, T, To]) =
  new CanBuildFrom[From, T, To] {
    def apply(from: From) = b.apply() ; def apply() = b.apply()
  }
I have been told that this results in:
> import scala.collection.breakOut
> val map : Map[Int,String] = List("London", "Paris").map(x => (x.length, x))(breakOut)
map: Map[Int,String] = Map(6 -> London, 5 -> Paris)
What is going on here? Why is breakOut being called as an argument to my List?
I'd like to build upon Daniel's answer. It was very thorough, but as noted in the comments, it doesn't explain what breakOut does.
Taken from Re: Support for explicit Builders (2009-10-23), here is what I believe breakOut does:
It gives the compiler a suggestion as to which Builder to choose implicitly (essentially it allows the compiler to choose which factory it thinks fits the situation best.)
For example, see the following:
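The listing that belonged here is not preserved, so what follows is only a minimal sketch, assuming Scala 2.8 to 2.12 (breakOut was removed in 2.13). The object and value names are illustrative. The same map call is given four different expected types, and breakOut lets each one pick its own builder factory:

```scala
import scala.collection.breakOut
import scala.collection.mutable

object ExpectedTypeDemo {
  val l = List(1, 2, 3)

  // The declared type on the left is the only thing that changes.
  val arr: Array[Int] = l.map(_ + 1)(breakOut)
  val set: Set[Int] = l.map(_ + 1)(breakOut)
  val seq: Seq[Int] = l.map(_ + 1)(breakOut)
  val hashSet: mutable.HashSet[Int] = l.map(_ + 1)(breakOut)
}
```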
You can see the return type is implicitly chosen by the compiler to best match the expected type. Depending on how you declare the receiving variable, you get different results.
The following would be an equivalent way to specify a builder. Note in this case, the compiler will infer the expected type based on the builder's type:
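A sketch of what such an explicit builder could look like, again assuming Scala 2.8 to 2.12. The helper name buildWith and the surrounding object are illustrative, not part of the standard library. Because the builder is passed by hand, its type alone determines the result type:

```scala
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.Builder

object BuildWithDemo {
  // Illustrative helper: wraps one concrete Builder in a CanBuildFrom.
  // The same builder instance is returned on every call, so the factory
  // should only be used for a single map call.
  def buildWith[From, T, To](b: Builder[T, To]): CanBuildFrom[From, T, To] =
    new CanBuildFrom[From, T, To] {
      def apply(from: From): Builder[T, To] = b
      def apply(): Builder[T, To] = b
    }

  val l = List(1, 2, 3)

  // No type annotation: the result type follows from Array.newBuilder[Int].
  val a = l.map(_ + 1)(buildWith(Array.newBuilder[Int]))
}
```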
A simple example to understand what breakOut does:

Daniel Sobral's answer is great, and should be read together with Architecture of Scala Collections (Chapter 25 of Programming in Scala).
I just wanted to elaborate on why it is called breakOut.

Why is it called breakOut?

Because we want to break out of one type and into another. Break out of what type into what type? Let's look at the map function on Seq as an example. Its signature is map[B, That](f: A => B)(implicit bf: CanBuildFrom[Seq[A], B, That]): That.

If we wanted to build a Map directly from mapping over the elements of a sequence such as:
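A minimal sketch of that attempt, assuming Scala 2.8 to 2.12. The input strings and all names are illustrative, and the failing line is left as a comment so that the block itself still compiles:

```scala
object SeqToMapAttempt {
  // Rejected by the compiler: the default factory builds a Seq[(String, Int)],
  // which is not a Map[String, Int].
  // val x: Map[String, Int] = Seq("A", "BB", "CCC").map(s => (s, s.length))

  // What map produces by default: a Seq of pairs.
  val pairs: Seq[(String, Int)] = Seq("A", "BB", "CCC").map(s => (s, s.length))

  // The two-step alternative, which goes through the intermediate Seq.
  val viaToMap: Map[String, Int] = pairs.toMap
}
```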
The compiler would complain with a type mismatch, the reason being that Seq only knows how to build another Seq (i.e. there is an implicit CanBuildFrom[Seq[_], B, Seq[B]] builder factory available, but there is NO builder factory from Seq to Map).

In order to compile, we need to somehow breakOut of the type requirement, and be able to construct a builder that produces a Map for the map function to use.

As Daniel has explained, breakOut has the signature shown in the question: it takes a single implicit parameter, implicit b: CanBuildFrom[Nothing, T, To]. Nothing is a subclass of all classes, and CanBuildFrom is contravariant in its first type parameter, so any builder factory can be substituted in place of implicit b: CanBuildFrom[Nothing, T, To].

If we used the breakOut function to provide the implicit parameter, it would compile, because breakOut is able to provide the required type of CanBuildFrom[Seq[(String, Int)], (String, Int), Map[String, Int]], while the compiler is able to find an implicit builder factory of type CanBuildFrom[Map[_, _], (A, B), Map[A, B]], in place of CanBuildFrom[Nothing, T, To], for breakOut to use to create the actual builder.

Note that CanBuildFrom[Map[_, _], (A, B), Map[A, B]] is defined in Map, and simply initiates a MapBuilder which uses an underlying Map.

Hope this clears things up.
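For reference, the breakOut version of the Seq example described above can be sketched as follows, assuming Scala 2.8 to 2.12. The input strings and names are illustrative:

```scala
import scala.collection.breakOut

object SeqToMapWithBreakOut {
  // breakOut supplies a CanBuildFrom[Seq[String], (String, Int), Map[String, Int]],
  // backed by the factory found on the Map companion object.
  val x: Map[String, Int] = Seq("A", "BB", "CCC").map(s => (s, s.length))(breakOut)
}
```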
The answer is found in the definition of map, which is declared as def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That.

Note that it has two parameter lists. The first takes your function and the second takes an implicit. If you do not provide that implicit, Scala will choose the most specific one available.
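To make the second parameter list concrete, here is a small sketch, assuming Scala 2.8 to 2.12, in which the same call is written once with the implicit left to the compiler and once with the factory passed by hand. The object and value names are illustrative:

```scala
object ImplicitVersusExplicit {
  val words = List("London", "Paris")

  // Implicit argument chosen by the compiler: List's own factory.
  val implicitResult: List[Int] = words.map(_.length)

  // The same factory, supplied explicitly in the second parameter list.
  val explicitResult: List[Int] = words.map(_.length)(List.canBuildFrom[Int])
}
```

Passing breakOut in that second position, as the question does, is the same mechanism with a different factory.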
About breakOut
So, what's the purpose of breakOut? Consider the example given for the question. You take a list of strings, transform each string into a tuple (Int, String), and then produce a Map out of it. The most obvious way to do that would produce an intermediary List[(Int, String)] collection, and then convert it.

Given that map uses a Builder to produce the resulting collection, wouldn't it be possible to skip the intermediary List and collect the results directly into a Map? Evidently, yes, it is. To do so, however, we need to pass a proper CanBuildFrom to map, and that is exactly what breakOut does.

Let's look, then, at the definition of breakOut, as quoted in the question.

Note that breakOut is parameterized, and that it returns an instance of CanBuildFrom. As it happens, the types From, T and To have already been inferred, because we know that map is expecting CanBuildFrom[List[String], (Int, String), Map[Int, String]]. Therefore, From = List[String], T = (Int, String) and To = Map[Int, String].

To conclude, let's examine the implicit received by breakOut itself. It is of type CanBuildFrom[Nothing,T,To]. We already know all these types, so we can determine that we need an implicit of type CanBuildFrom[Nothing,(Int,String),Map[Int,String]]. But is there such a definition?

Let's look at CanBuildFrom's definition: trait CanBuildFrom[-From, -Elem, +To].

So CanBuildFrom is contra-variant on its first type parameter. Because Nothing is a bottom class (i.e., it is a subclass of everything), that means a CanBuildFrom defined for any class can be used where one for Nothing is expected.

Since such a builder factory exists (the one defined on object Map), Scala can use it to produce the desired output.
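The lookup described above can be reproduced by hand. The following is a sketch, assuming Scala 2.8 to 2.12, with illustrative names: it asks the compiler for the same implicit that breakOut requests in the question's example, then drives the resulting builder directly.

```scala
import scala.collection.generic.CanBuildFrom

object InnerImplicitDemo {
  // The implicit breakOut needs here. It resolves to the factory on object Map,
  // because CanBuildFrom is contravariant in its first type parameter.
  val inner: CanBuildFrom[Nothing, (Int, String), Map[Int, String]] =
    implicitly[CanBuildFrom[Nothing, (Int, String), Map[Int, String]]]

  // apply() hands back a fresh Builder; += appends, result() returns the collection.
  val builder = inner.apply()
  builder += (6 -> "London")
  builder += (5 -> "Paris")
  val built: Map[Int, String] = builder.result()
}
```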
About Builders
A lot of methods from Scala's collections library consist of taking the original collection, processing it somehow (in the case of map, transforming each element), and storing the results in a new collection.

To maximize code reuse, this storing of results is done through a builder (scala.collection.mutable.Builder), which basically supports two operations: appending elements, and returning the resulting collection. The type of this resulting collection will depend on the type of the builder. Thus, a List builder will return a List, a Map builder will return a Map, and so on. The implementation of the map method need not concern itself with the type of the result: the builder takes care of it.

On the other hand, that means that map needs to receive this builder somehow. The problem faced when designing Scala 2.8 Collections was how to choose the best builder possible. For example, if I were to write Map('a' -> 1).map(_.swap), I'd like to get a Map(1 -> 'a') back. On the other hand, a Map('a' -> 1).map(_._1) can't return a Map (it returns an Iterable).

The magic of producing the best possible Builder from the known types of the expression is performed through this CanBuildFrom implicit.

About CanBuildFrom
To better explain what's going on, I'll give an example where the collection being mapped is a Map instead of a List. I'll go back to List later. For now, consider two expressions that each call map on a Map[Int, String].

The first returns a Map and the second returns an Iterable. The magic of returning a fitting collection is the work of CanBuildFrom. Let's consider the definition of map again to understand it.

The method map is inherited from TraversableLike. It is parameterized on B and That, and makes use of the type parameters A and Repr, which parameterize the class. Let's see both definitions together: the class is declared as trait TraversableLike[+A, +Repr], and the method as def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That.

To understand where A and Repr come from, let's consider the definition of Map itself. Because TraversableLike is inherited by all the traits that Map extends, A and Repr could be inherited from any of them. The last one gets the preference, though. So, following the definition of the immutable Map and all the traits that connect it to TraversableLike, we have a chain in which Map[A, +B] mixes in MapLike[A, B, Map[A, B]], MapLike extends IterableLike[(A, B), This], and IterableLike[+A, +Repr] extends TraversableLike[A, Repr].

If you pass the type parameters of Map[Int, String] all the way down the chain, we find that the types passed to TraversableLike, and, thus, used by map, are A = (Int, String) and Repr = Map[Int, String].

Going back to the example, the first map is receiving a function of type ((Int, String)) => (Int, Int) and the second map is receiving a function of type ((Int, String)) => String. I use the double parenthesis to emphasize it is a tuple being received, as that's the type of A as we saw.

With that information, let's consider the other types. For the first call B = (Int, Int), and for the second B = String.
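The original listings for the two expressions are not preserved. A sketch with the function types described above might look like this, where the concrete map contents and all names are assumptions made for illustration:

```scala
object MapVersusIterable {
  val m: Map[Int, String] = Map(1 -> "one", 2 -> "two")

  // ((Int, String)) => (Int, Int): the result elements are pairs, so a Map can be built.
  val lengths: Map[Int, Int] = m.map { case (k, v) => k -> v.length }

  // ((Int, String)) => String: the result elements are not pairs, so the
  // static result type falls back to Iterable[String].
  val names: Iterable[String] = m.map(_._2)
}
```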
We can see that the type returned by the first map is Map[Int,Int], and the second is Iterable[String]. Looking at map's definition, it is easy to see that these are the values of That. But where do they come from?

If we look inside the companion objects of the classes involved, we see some implicit declarations providing them. On object Map there is an implicit canBuildFrom[A, B] of type CanBuildFrom[Map[_, _], (A, B), Map[A, B]]. And on object Iterable, whose class is extended by Map, there is an implicit canBuildFrom[A] of type CanBuildFrom[Iterable[_], A, Iterable[A]].

These definitions provide factories for parameterized CanBuildFrom.

Scala will choose the most specific implicit available. In the first case, it was the first CanBuildFrom. In the second case, as the first did not match, it chose the second CanBuildFrom.

Back to the Question
Let's see the code for the question, List's and map's definition (again) to see how the types are inferred. The code is val map : Map[Int,String] = List("London", "Paris").map(x => (x.length, x))(breakOut).

The type of List("London", "Paris") is List[String], so the types A and Repr defined on TraversableLike are A = String and Repr = List[String].

The type for (x => (x.length, x)) is (String) => (Int, String), so the type of B is (Int, String).

The last unknown type, That, is the type of the result of map, and we already have that as well from the declaration val map : Map[Int,String]. So, That = Map[Int, String].

That means breakOut must, necessarily, return a type or subtype of CanBuildFrom[List[String], (Int, String), Map[Int, String]].
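That conclusion can be checked with a short sketch, assuming Scala 2.8 to 2.12 and illustrative names: breakOut is first assigned to a value of exactly the type derived above, and that value is then passed to map in place of the bare breakOut.

```scala
import scala.collection.breakOut
import scala.collection.generic.CanBuildFrom

object InferredTypesDemo {
  // From, T and To are fixed by the declared type, so breakOut needs no type arguments.
  val cbf: CanBuildFrom[List[String], (Int, String), Map[Int, String]] = breakOut

  // Passing the factory explicitly makes That = Map[Int, String].
  val map: Map[Int, String] = List("London", "Paris").map(x => (x.length, x))(cbf)
}
```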