GroupBy in Scala

Posted 2020-02-28 20:26

Question:

I have

val a = List((1,2), (1,3), (3,4), (3,5), (4,5))

I am using a.groupBy(_._1), which groups by the first element of each tuple. But it gives me this output:

Map(1 -> List((1,2) , (1,3)) , 3 -> List((3,4), (3,5)), 4 -> List((4,5)))

But I want the result to be

Map(1 -> List(2, 3), 3 -> List(4,5) , 4 -> List(5))

So, how can I do this?

Answer 1:

Make life easy with pattern matching and Map#withDefaultValue:

scala> a.foldLeft(Map.empty[Int, List[Int]].withDefaultValue(Nil)){ 
         case(r, (x, y)) => r.updated(x, r(x):+y) 
       }
res0: scala.collection.immutable.Map[Int,List[Int]] = 
      Map(1 -> List(2, 3), 3 -> List(4, 5), 4 -> List(5))

There are two points:

  1. Map#withDefaultValue returns a map with the given default value, so you don't need to check whether the map contains a key.

  2. Wherever Scala expects a function value (x1, x2, ..., xn) => y, you can use a pattern-matching case (x1, x2, ..., xn) => y instead; the compiler translates it to a function automatically. See section 8.5, Pattern Matching Anonymous Functions, of the Scala Language Specification for more information.
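The two points can be tried separately. This is only an illustrative sketch; the names m, hit, miss, pairs, explicit and withCase are made up for the example and are not part of the answer above.

```scala
// Point 1: a map with a default value returns that default for a missing key
// instead of throwing NoSuchElementException.
val m = Map(1 -> List(2)).withDefaultValue(Nil)
val hit  = m(1)   // the stored value for key 1
val miss = m(42)  // key 42 is absent, so the default (Nil) is returned

// Point 2: foldLeft expects a two-argument function (acc, elem) => acc.
// The two folds below compute the same sum; the second one destructures
// the tuple directly in the case pattern.
val pairs = List((1, 2), (1, 3), (3, 4))
val explicit = pairs.foldLeft(0)((acc, p) => acc + p._2)
val withCase = pairs.foldLeft(0) { case (acc, (_, y)) => acc + y }
```

Note that the default only affects lookups: m.contains(42) is still false, and the map holds no entry for 42.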


Answer 2:

You can do that by following up with mapValues (and a map over each value to extract the second element):

scala> a.groupBy(_._1).mapValues(_.map(_._2))
res2: scala.collection.immutable.Map[Int,List[Int]] = Map(4 -> List(5), 1 -> List(2, 3), 3 -> List(4, 5))
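One caveat, as far as I know: in Scala 2.12 and earlier, mapValues returns a lazy view that re-applies the function on every access, and in Scala 2.13 calling it directly on a Map is deprecated. A sketch of the 2.13 form, assuming the same list a as in the question (the name grouped is made up here):

```scala
// Assumes Scala 2.13 or later.
val a = List((1, 2), (1, 3), (3, 4), (3, 5), (4, 5))

// .view.mapValues is lazy; .toMap forces it into a strict immutable Map.
val grouped: Map[Int, List[Int]] =
  a.groupBy(_._1).view.mapValues(_.map(_._2)).toMap
```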


Answer 3:

As a variant (it prepends, so the values for each key come out in reverse order):

a.foldLeft(Map[Int, List[Int]]()) { case (acc, (k, v)) => acc + (k -> (v :: acc.getOrElse(k, List()))) }


Answer 4:

You can also do it with foldLeft, so the list is traversed only once.

a.foldLeft(Map.empty[Int, List[Int]])((map, t) => 
  if(map.contains(t._1)) map + (t._1 -> (t._2 :: map(t._1))) 
  else map + (t._1 -> List(t._2)))

scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(3, 2), 3 ->
       List(5, 4), 4 -> List(5))

If the order of the elements in the lists matters, reverse each list once, after the fold has finished.

a.foldLeft(Map.empty[Int, List[Int]])((map, t) => 
  if(map.contains(t._1)) map + (t._1 -> (t._2 :: map(t._1))) 
  else map + (t._1 -> List(t._2))
).map { case (k, vs) => k -> vs.reverse } // restore insertion order per key

scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(2, 3), 3 ->
       List(4, 5), 4 -> List(5))


Answer 5:

As of Scala 2.13, you can use groupMap and write just:

// val list = List((1, 2), (1, 3), (3, 4), (3, 5), (4, 5))
list.groupMap(_._1)(_._2)
// Map(1 -> List(2, 3), 3 -> List(4, 5), 4 -> List(5))
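A quick check, assuming Scala 2.13 or later, that groupMap keeps the values of each key in their original order even when a key has more than two entries. The list longer and the name byKey are made up for this example.

```scala
// Assumes Scala 2.13 or later.
val longer = List((1, 2), (1, 3), (1, 4), (3, 5))

// First argument list: how to compute the key.
// Second argument list: how to transform each element within its group.
val byKey: Map[Int, List[Int]] = longer.groupMap(_._1)(_._2)
```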


Tags: scala