Working with scala collections - CanBuildFrom trou

Posted 2020-05-14 03:59

Question:

I'm trying to write a method which accepts any type of collection CC[_] and maps it to a new collection (the same collection type but a different element type) and I am struggling royally. Basically I'm trying to implement map but not on the collection itself.

The Question

I'm trying to implement a method with a signature which looks a bit like:

def map[CC[_], T, U](cct: CC[T], f: T => U): CC[U]

Its usage would be:

map(List(1, 2, 3, 4), (_ : Int).toString) //would return List[String]

I'm interested in an answer which would also work where CC is Array and I'm interested in the reason my attempts (below) have ultimately not worked.


My Attempts

(For the impatient, in what follows, I utterly fail to get this to work. To reiterate, the question is "how can I write such a method?")

I start like this:

scala> def map[T, U, CC[_]](cct: CC[T], f: T => U)(implicit cbf: CanBuildFrom[CC[T], U, CC[U]]): CC[U] = 
     | cct map f
<console>:9: error: value map is not a member of type parameter CC[T]
       cct map f
           ^

OK, that makes sense - I need to say that CC is traversable!

scala> def map[T, U, X, CC[X] <: Traversable[X]](cct: CC[T], f: T => U)(implicit cbf: CanBuildFrom[CC[T], U, CC[U]]): CC[U] = 
     | cct map f
<console>:10: error: type mismatch;
 found   : Traversable[U]
 required: CC[U]
       cct map f
           ^

Err, OK! Maybe if I actually specify that cbf instance. After all, it specifies the return type (To) as CC[U]:

scala> def map[T, U, X, CC[X] <: Traversable[X]](cct: CC[T], f: T => U)(implicit cbf: CanBuildFrom[CC[T], U, CC[U]]): CC[U] = 
     | cct.map(t => f(t))(cbf)
<console>:10: error: type mismatch;
 found   : scala.collection.generic.CanBuildFrom[CC[T],U,CC[U]]
 required: scala.collection.generic.CanBuildFrom[Traversable[T],U,CC[U]]
       cct.map(t => f(t))(cbf)
                          ^

Err, OK! That's a more specific error. Looks like I can use that!

scala> def map[T, U, X, CC[X] <: Traversable[X]](cct: CC[T], f: T => U)(implicit cbf: CanBuildFrom[Traversable[T], U, CC[U]]): CC[U] = 
     | cct.map(t => f(t))(cbf)
map: [T, U, X, CC[X] <: Traversable[X]](cct: CC[T], f: T => U)(implicit cbf: scala.collection.generic.CanBuildFrom[Traversable[T],U,CC[U]])CC[U]

Brilliant. I has me a map! Let's use this thing!

scala> map(List(1, 2, 3, 4), (_ : Int).toString)
<console>:11: error: Cannot construct a collection of type List[java.lang.String] with elements of type java.lang.String based on a collection of type Traversable[Int].
              map(List(1, 2, 3, 4), (_ : Int).toString)
                 ^

Say, what?


Observations

I really can't help but think that Tony Morris' observations about this at the time were absolutely spot on. What did he say? He said "Whatever that is, it is not map". Look at how easy this is in scalaz-style:

scala> trait Functor[F[_]] { def fmap[A, B](fa: F[A])(f: A => B): F[B] }
defined trait Functor

scala> def map[F[_]: Functor, A, B](fa: F[A], f: A => B): F[B] = implicitly[Functor[F]].fmap(fa)(f)
map: [F[_], A, B](fa: F[A], f: A => B)(implicit evidence$1: Functor[F])F[B]

Then

scala> map(List(1, 2, 3, 4), (_ : Int).toString)
<console>:12: error: could not find implicit value for evidence parameter of type Functor[List]
              map(List(1, 2, 3, 4), (_ : Int).toString)
                 ^

So that

scala> implicit val ListFunctor = new Functor[List] { def fmap[A, B](fa: List[A])(f: A => B) = fa map f }
ListFunctor: java.lang.Object with Functor[List] = $anon$1@4395cbcb

scala> map(List(1, 2, 3, 4), (_ : Int).toString)
res5: List[java.lang.String] = List(1, 2, 3, 4)
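The same pattern extends to other containers one instance at a time. A minimal sketch, assuming nothing beyond the Functor trait above (the object name FunctorSketch and the instance names are made up for illustration). Note that an Array instance is less direct: building an Array[B] needs a ClassTag[B], which this fmap signature does not provide.

```scala
object FunctorSketch {
  trait Functor[F[_]] { def fmap[A, B](fa: F[A])(f: A => B): F[B] }

  // One instance per container type; each simply delegates to the built-in map.
  implicit val listFunctor: Functor[List] = new Functor[List] {
    def fmap[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
  }
  implicit val optionFunctor: Functor[Option] = new Functor[Option] {
    def fmap[A, B](fa: Option[A])(f: A => B): Option[B] = fa.map(f)
  }

  // The generic map only needs evidence that F has a Functor instance.
  def map[F[_], A, B](fa: F[A], f: A => B)(implicit functor: Functor[F]): F[B] =
    functor.fmap(fa)(f)

  val strings: List[String] = map(List(1, 2, 3, 4), (_: Int).toString)
  val bumped: Option[Int] = map(Option(41), (_: Int) + 1)
}
```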

Memo to self: listen to Tony!

Answer 1:

What you're running into is not necessarily CanBuildFrom itself, or the Array vs. Seq issue. You're running into String, which is not higher-kinded but still supports map over its Chars.

So, first, a digression into Scala's collection design.

What you need is a way to infer both the collection type (e.g. String, Array[Int], List[Foo]) and the element type (e.g. Char, Int, Foo corresponding to the above).

Scala 2.10.x has added a few "type classes" to help you. For example, you can do the following:

import scala.collection.GenTraversableLike
import scala.collection.generic.{CanBuildFrom, IsTraversableLike}

// Wrapper that adds filterMap to anything viewable as a GenTraversableLike.
class FilterMapImpl[A, Repr](val r: GenTraversableLike[A, Repr]) {
  final def filterMap[B, That](f: A => Option[B])(implicit cbf: CanBuildFrom[Repr, B, That]): That =
    r.flatMap(f(_).toSeq)
}
// IsTraversableLike supplies the element type (fr.A) and the conversion from Repr.
implicit def filterMap[Repr, A](r: Repr)(implicit fr: IsTraversableLike[Repr]): FilterMapImpl[fr.A, Repr] =
  new FilterMapImpl(fr.conversion(r))

There are two pieces here. First, the class that uses collections needs two type parameters: the specific collection type Repr and the element type A.

Next, you define an implicit method which takes only the collection type Repr. You use IsTraversableLike (note: there is also an IsTraversableOnce, whose conversion yields only a GenTraversableOnce) to capture the element type of that collection. You see this used in the type signature FilterMapImpl[fr.A, Repr].

Now, part of this is because Scala does not use the same category for all of its "functor-like" operations. Specifically, map is a useful method for String. I can adjust all characters. However, String can only be a Seq[Char]. If I want to define a Functor, then my category can only contain the type Char and the arrows Char => Char. This logic is captured in CanBuildFrom. However, since a String is a Seq[Char], if you try to use a map in the category supported by Seq's map method, then CanBuildFrom will alter your call to map.
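As a small illustration of that last point, here is a sketch that relies only on standard String behaviour (the object and value names are made up for this example). Mapping Char => Char keeps the result a String, while mapping to any other element type falls back to a general sequence:

```scala
object StringMapSketch {
  // Char => Char stays inside the "String" category: the result is a String.
  val shifted: String = "HI".map(c => (c + 1).toChar)

  // Char => Int leaves that category: the result is only an indexed sequence.
  val codes: IndexedSeq[Int] = "HI".map(c => c.toInt)
}
```

The static types on the left are the interesting part: the same map call yields a different collection type depending on the element type the function returns.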

We're essentially defining an "inheritance" relationship for our categories. If you try to use the Functor pattern, we drop the type signature to the most specific category we can retain. Call it what you will; that's a big motivating factor for the current collection design.

End Digression, answer the question

Now, because we're trying to infer a lot of types at the same time, I think this option has the fewest type annotations:

import collection.generic._

def map[Repr](col: Repr)(implicit tr: IsTraversableLike[Repr]) = new {
  def apply[U, That](f: tr.A => U)(implicit cbf: CanBuildFrom[Repr, U, That]) = 
    tr.conversion(col) map f
}


scala> map("HI") apply (_ + 1 toChar )
warning: there were 2 feature warnings; re-run with -feature for details
res5: String = IJ

The important piece to note here is that IsTraversableLike captures a conversion from Repr to TraversableLike that allows you to use the map method.
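Calling apply on the structural type (new { ... }) is one source of the feature warnings above. A possible variation, sketched here for Scala 2.10 to 2.12, gives the intermediate object a name. Mapper and NamedMapSketch are hypothetical names for this example, not part of the standard library:

```scala
import scala.collection.GenTraversableLike
import scala.collection.generic.{CanBuildFrom, IsTraversableLike}

object NamedMapSketch {
  // Hypothetical helper: holds the converted collection so apply can call its map.
  final class Mapper[A, Repr](r: GenTraversableLike[A, Repr]) {
    def apply[U, That](f: A => U)(implicit cbf: CanBuildFrom[Repr, U, That]): That =
      r.map(f)
  }

  // Same shape as the map above; the element type still comes from tr.A.
  def map[Repr](col: Repr)(implicit tr: IsTraversableLike[Repr]): Mapper[tr.A, Repr] =
    new Mapper(tr.conversion(col))

  // apply must be spelled out: a second argument list directly after map(...)
  // would be taken as the explicit value for the implicit parameter tr.
  val shifted: String = map("HI").apply(c => (c + 1).toChar)
  val strings: List[String] = map(List(1, 2, 3)).apply(n => n.toString)
}
```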

Option 2

We also split the method call up a bit so that Scala can infer the types Repr and U before we define our anonymous function. To avoid type annotations on anonymous functions, we must have all types known before it shows up. Now, we can still have Scala infer some types, but lose things that are implicitly Traversable if we do this:

import collection.generic._
import collection._
def map[Repr <: TraversableLike[A, Repr], A, U, That](col: Repr with TraversableLike[A,Repr])(f: A => U)(implicit cbf: CanBuildFrom[Repr, U, That]) = 
    col map f

Notice that we have to use Repr with TraversableLike[A,Repr]. It seems that most F-bounded types require this juggling.

In any case, now let's see what happens on something that extends Traversable:

scala> map(List(40,41))(_ + 1 toChar )
warning: there were 1 feature warnings; re-run with -feature for details
res8: List[Char] = List(), *)

That's great. However, if we want the same usage for Array and String, we have to go to a bit more work:

scala> map(Array('H', 'I'): IndexedSeq[Char])(_ + 1 toChar)(breakOut): Array[Char]
warning: there were 1 feature warnings; re-run with -feature for details
res14: Array[Char] = Array(I, J)

scala> map("HI": Seq[Char])(_ + 1 toChar)(breakOut) : String
warning: there were 1 feature warnings; re-run with -feature for details
res11: String = IJ

There are two pieces to this usage:

  1. We have to use a type annotation to trigger the implicit conversion from String/Array to Seq/IndexedSeq.
  2. We have to use breakOut for our CanBuildFrom and type-annotate the expected return value.

This is solely because the type Repr <: TraversableLike[A,Repr] does not include String or Array, since those use implicit conversions.
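To see what breakOut does in isolation, here is a small sketch for Scala 2.12 or earlier, where scala.collection.breakOut exists (the object and value names are illustrative only). The annotated result type drives which builder is picked, regardless of the source collection:

```scala
import scala.collection.breakOut

object BreakOutSketch {
  // The source is a List each time; the expected type on the left selects the builder.
  val asArray: Array[Char] = List(72, 73).map(n => n.toChar)(breakOut)
  val asString: String = List('H', 'I').map(c => (c + 1).toChar)(breakOut)
  val asMap: Map[Int, String] = List(1, 2).map(n => n -> n.toString)(breakOut)
}
```

Without the type annotation there is nothing to tell breakOut which target collection is wanted, which is why the annotations in the calls above cannot be dropped.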

Option 3

You can place all the implicits together at the end and require the user to annotate types. Not the most elegant solution, so I think I'll avoid posting it unless you'd really like to see it.

So, basically, if you want to include String and Array[T] as collections, you have to jump through some hoops. This category restriction for map applies to both the String and BitSet functors in Scala.

I hope that helps. Ping me if you have any more questions.



Answer 2:

There are actually several questions in there...

Let's start with your last attempt:

scala> def map[T, U, X, CC[X] <: Traversable[X]](cct: CC[T], f: T => U)
 (implicit cbf: CanBuildFrom[Traversable[T], U, CC[U]]): CC[U] = 
  cct.map(t => f(t))(cbf)

This one does compile but does not work because, according to your type signature, it has to look for an implicit CanBuildFrom[Traversable[Int], String, List[String]] in scope, and there just isn't one. If you were to create one by hand, it would work.
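For illustration only, creating one by hand could look like the sketch below (Scala 2.10 to 2.12; the object name HandMadeCbf and the implicit's name are invented for this example). It is far too specific to be useful in practice, but it shows exactly which instance the compiler was searching for:

```scala
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.Builder
import scala.language.higherKinds

object HandMadeCbf {
  // The signature from the last attempt, unchanged.
  def map[T, U, X, CC[X] <: Traversable[X]](cct: CC[T], f: T => U)
      (implicit cbf: CanBuildFrom[Traversable[T], U, CC[U]]): CC[U] =
    cct.map(t => f(t))(cbf)

  // Hand-written instance for exactly the types the failing call needed.
  implicit val intToStringList: CanBuildFrom[Traversable[Int], String, List[String]] =
    new CanBuildFrom[Traversable[Int], String, List[String]] {
      def apply(from: Traversable[Int]): Builder[String, List[String]] = List.newBuilder[String]
      def apply(): Builder[String, List[String]] = List.newBuilder[String]
    }

  // With the instance in scope, the call that failed before now resolves.
  val result: List[String] = map(List(1, 2, 3, 4), (_: Int).toString)
}
```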

Now the previous attempt:

scala> def map[T, U, X, CC[X] <: Traversable[X]](cct: CC[T], f: T => U)
 (implicit cbf: CanBuildFrom[CC[T], U, CC[U]]): CC[U] = 
  cct.map(t => f(t))(cbf)
<console>:10: error: type mismatch;
 found   : scala.collection.generic.CanBuildFrom[CC[T],U,CC[U]]
 required: scala.collection.generic.CanBuildFrom[Traversable[T],U,CC[U]]
       cct.map(t => f(t))(cbf)
                          ^

This one does not compile because the implicit CanBuildFrom in Traversable is hardcoded to accept only a Traversable as the From collection. However, as pointed out in the other answer, TraversableLike knows about the actual collection type (it's its second type parameter), so it defines map with the proper CanBuildFrom[CC[T], U, CC[U]] and everybody is happy. Actually, TraversableLike inherits this map method from scala.collection.generic.FilterMonadic, so this is even more generic:

scala> import scala.collection.generic._
import scala.collection.generic._

scala> def map[T, U, CC[T] <: FilterMonadic[T, CC[T]]](cct: CC[T], f: T => U)
 |  (implicit cbf:  CanBuildFrom[CC[T], U, CC[U]]): CC[U] = cct.map(f)
warning: there were 1 feature warnings; re-run with -feature for details
map: [T, U, CC[T] <: scala.collection.generic.FilterMonadic[T,CC[T]]](cct: CC[T], f: T => U)(implicit cbf: scala.collection.generic.CanBuildFrom[CC[T],U,CC[U]])CC[U]

scala> map(List(1,2,3,4), (_:Int).toString + "k")
res0: List[String] = List(1k, 2k, 3k, 4k)

Finally, the above does not work with arrays because Array is not a FilterMonadic. But there is an implicit conversion from Array to ArrayOps, and the latter implements FilterMonadic. So if you add a view bound in there, you get something that works for arrays as well:

scala> import scala.collection.generic._
import scala.collection.generic._

scala> def map[T, U, CC[T]](cct: CC[T], f: T => U)
 |  (implicit cbf:  CanBuildFrom[CC[T], U, CC[U]], 
 |   ev: CC[T] => FilterMonadic[T,CC[T]]): CC[U] = cct.map(f)
warning: there were 1 feature warnings; re-run with -feature for details
map: [T, U, CC[T]](cct: CC[T], f: T => U)(implicit cbf: scala.collection.generic.CanBuildFrom[CC[T],U,CC[U]], implicit ev: CC[T] => scala.collection.generic.FilterMonadic[T,CC[T]])CC[U]

scala> map(List(1,2,3,4), (_:Int).toString + "k")
res0: List[String] = List(1k, 2k, 3k, 4k)

scala> map(Array(1,2,3,4), (_:Int).toString + "k")
res1: Array[String] = Array(1k, 2k, 3k, 4k)

EDIT: There is also a way to make it work for String and co: just remove the higher kinds on the input/output collection, using a third one in the middle:

def map[T, U, From, To, Middle](cct: From, f: T => U)
 (implicit ev: From => FilterMonadic[T, Middle], 
  cbf: CanBuildFrom[Middle,U,To]): To = cct.map(f)

This works on String and even on Map[A,B]:

scala> map(Array(42,1,2), (_:Int).toString)
res0: Array[java.lang.String] = Array(42, 1, 2)

scala> map(List(42,1,2), (_:Int).toString)
res1: List[java.lang.String] = List(42, 1, 2)

scala> map("abcdef", (x: Char) => (x + 1).toChar)
res2: String = bcdefg

scala> map(Map(1 -> "a", 2 -> "b", 42 -> "hi!"), (a:(Int, String)) => (a._2, a._1))
res5: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 2, hi! -> 42)

Tested with 2.9.2. But as jsuereth pointed out, there is the wonderful IsTraversableLike in 2.10 that is better fitted for this.



Answer 3:

Is this it?

def map[A,B,T[X] <: TraversableLike[X,T[X]]]
  (xs: T[A])(f: A => B)(implicit cbf: CanBuildFrom[T[A],B,T[B]]): T[B] = xs.map(f)

map(List(1,2,3))(_.toString)
// List[String] = List(1, 2, 3)

See also this question.