Convert List of tuple to map (and deal with duplic

I was thinking about a nice way to convert a List of tuples with duplicate keys, [("a","b"),("c","d"),("a","f")], into a map ("a" -> ["b", "f"], "c" -> ["d"]). Normally (in Python), I'd create an empty map, loop over the list, and check for duplicate keys. But I am looking for a more Scala-ish and clever solution here.

By the way, the actual key-value type I use here is (Int, Node), and I want to turn it into a map of (Int -> NodeSeq).

Tags: scala map

8 answers

混吃等死

#2 · 2019-01-21 04:54

Group and then project:

scala> val x = List("a" -> "b", "c" -> "d", "a" -> "f")
//x: List[(java.lang.String, java.lang.String)] = List((a,b), (c,d), (a,f))
scala> x.groupBy(_._1).map { case (k,v) => (k,v.map(_._2))}
//res1: scala.collection.immutable.Map[java.lang.String,List[java.lang.String]] = Map(c -> List(d), a -> List(b, f))

A more Scala-ish way is to use a fold, which builds the result map directly and skips the separate map step.
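
The fold mentioned above is not spelled out in this answer, so here is a minimal sketch of one way it might look. It assumes the same sample list `x`; the name `grouped` is only illustrative. `foldRight` is used so that prepending with `::` keeps the values in their original order.

```scala
val x = List("a" -> "b", "c" -> "d", "a" -> "f")

// Walk the list from the right, prepending each value to the list
// already stored under its key (or to Nil if the key is new).
val grouped: Map[String, List[String]] =
  x.foldRight(Map.empty[String, List[String]]) { case ((k, v), acc) =>
    acc.updated(k, v :: acc.getOrElse(k, Nil))
  }
// grouped == Map("a" -> List("b", "f"), "c" -> List("d"))
```

Note that `foldRight` on a very long List may be less stack-friendly than `foldLeft`; a `foldLeft` that prepends and then reverses each value list is a possible alternative.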

叛逆

#3 · 2019-01-21 05:02

Starting with Scala 2.13, most collections provide a groupMap method, which is (as its name suggests) a more efficient equivalent of a groupBy followed by mapValues:

List("a" -> "b", "c" -> "d", "a" -> "f").groupMap(_._1)(_._2)
// Map[String,List[String]] = Map(a -> List(b, f), c -> List(d))

This:

groups elements based on the first part of tuples (group part of groupMap)
maps grouped values by taking their second tuple part (map part of groupMap)

This is equivalent to list.groupBy(_._1).mapValues(_.map(_._2)), but performed in one pass through the List.
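
As a rough check of that equivalence, the sketch below builds the map both ways and compares the results. It assumes Scala 2.13 or later; the names `pairs`, `viaGroupMap`, and `viaGroupBy` are only illustrative.

```scala
// Requires Scala 2.13+ (groupMap does not exist in earlier versions).
val pairs = List("a" -> "b", "c" -> "d", "a" -> "f")

// One step: group by the key and keep only the value of each tuple.
val viaGroupMap: Map[String, List[String]] =
  pairs.groupMap(_._1)(_._2)

// Two steps: group whole tuples, then strip the key from each group.
val viaGroupBy: Map[String, List[String]] =
  pairs.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }

// Both yield Map("a" -> List("b", "f"), "c" -> List("d")).
```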

Root（大扎）

#4 · 2019-01-21 05:03

You can try this

scala> val b = Array(1, 2, 3)
// b: Array[Int] = Array(1, 2, 3)
scala> val c = b.map(x => (x -> x * 2))
// c: Array[(Int, Int)] = Array((1,2), (2,4), (3,6))
scala> val d = Map(c : _*)
// d: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 2 -> 4, 3 -> 6)

祖国的老花朵

#5 · 2019-01-21 05:06

Here's another alternative:

x.groupBy(_._1).mapValues(_.map(_._2))
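
One caveat: in Scala 2.13, mapValues on a Map is deprecated in favor of .view.mapValues, which returns a lazy view. A minimal sketch of a strict variant, assuming Scala 2.13+ and the same sample list `x` (the name `strict` is only illustrative):

```scala
val x = List("a" -> "b", "c" -> "d", "a" -> "f")

// .view.mapValues is lazy; .toMap forces it into a strict immutable Map,
// so the projection runs once rather than on every lookup.
val strict: Map[String, List[String]] =
  x.groupBy(_._1).view.mapValues(_.map(_._2)).toMap
// strict == Map("a" -> List("b", "f"), "c" -> List("d"))
```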

我只想做你的唯一

#6 · 2019-01-21 05:09

For Googlers that don't expect duplicates or are fine with the default duplicate handling policy:

List("a" -> 1, "b" -> 2).toMap
// Result: Map(a -> 1, b -> 2)

As of 2.12, the default policy reads:

Duplicate keys will be overwritten by later keys: if this is an unordered collection, which key is in the resulting map is undefined.
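
To make that policy concrete, here is a small sketch applying toMap to the question's list, which does contain a duplicate key. Because a List is ordered, the later pair wins. The names `pairs` and `lastWins` are only illustrative.

```scala
val pairs = List("a" -> "b", "c" -> "d", "a" -> "f")

// toMap keeps one value per key; for an ordered List the later pair wins.
val lastWins: Map[String, String] = pairs.toMap
// lastWins == Map("a" -> "f", "c" -> "d"); the earlier "a" -> "b" is dropped
```

So toMap alone is not enough when every value for a duplicated key must be kept.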

冷血范

#7 · 2019-01-21 05:10

Below you can find a few solutions. (GroupBy, FoldLeft, Aggregate, Spark)

val list: List[(String, String)] = List(("a","b"),("c","d"),("a","f"))

GroupBy variation

list.groupBy(_._1).map(v => (v._1, v._2.map(_._2)))

Fold Left variation

list.foldLeft[Map[String, List[String]]](Map())((acc, value) => {
  acc.get(value._1).fold(acc ++ Map(value._1 -> List(value._2))){ v =>
    acc ++ Map(value._1 -> (value._2 :: v))
  }
})

Aggregate Variation - Similar to fold Left

list.aggregate[Map[String, List[String]]](Map())(
  (acc, value) => acc.get(value._1).fold(acc ++ Map(value._1 -> 
    List(value._2))){ v =>
     acc ++ Map(value._1 -> (value._2 :: v))
  },
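  // Merge per key; a plain l ++ r would drop the left side's values for shared keys
  (l, r) => r.foldLeft(l) { case (m, (k, vs)) => m + (k -> (vs ::: m.getOrElse(k, Nil))) }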
)

Spark Variation - For big data sets (Conversion to a RDD and to a Plain Map from RDD)

import org.apache.spark.rdd._
import org.apache.spark.{SparkContext, SparkConf}

val conf: SparkConf = new SparkConf().setAppName("Spark").setMaster("local")
val sc: SparkContext = new SparkContext(conf)

// This gives you an RDD holding the same result
val rdd: RDD[(String, List[String])] = sc.parallelize(list).combineByKey(
   (value: String) => List(value),
   (acc: List[String], value) => value :: acc,
   (accLeft: List[String], accRight: List[String]) => accLeft ::: accRight
)

// To convert this RDD back to a Map[String, List[String]] you can do the following
rdd.collect().toMap
