Filter the list based on tuple element

2019-09-18 22:29发布

问题:

I have a List[(Int,Int)].

Ex.

val a = List((1,2),(2,3),(1,4),(2,4),(5,6),(4,5),(1,8))

I want to filter this list so that if several tuples have the same first element (same value in _1) then only the first tuple is kept.

So here the expected answer is :

val ans=List((1,2),(2,3),(5,6),(4,5))

As the first element of (1,2) is 1 and the same holds for (1,4)and (1,8), we only keep the first occurrence ((1,2)) and ignore the others ((1,4)and (1,8)).

How can I do this?

回答1:

This preserves order and guarantees only the first is kept when duplicates are found, but it's not a terribly efficient algorithm. Probably not a good idea for long lists.

scala> val a= List((1,2),(2,3),(1,4),(2,4),(5,6),(4,5),(1,8))
a: List[(Int, Int)] = List((1,2), (2,3), (1,4), (2,4), (5,6), (4,5), (1,8))

scala> a.filter(x => a.filter(_._1 == x._1).head == x)
res22: List[(Int, Int)] = List((1,2), (2,3), (5,6), (4,5))


回答2:

a.groupBy(_._1).values.collect{case x::xs=>x}

Edit, as suggested by @Régis Jean-Gilles,

a.groupBy(_._1).values.map(_.head)

Edit, to preserve the order

a.foldLeft(Map.empty[Int,(Int,Int)].empty){
    case (r,(x,y))=> if(r.contains(x)) r else r+(x->(x,y))
}.values


回答3:

ListMap lets us both eliminate duplicate keys and preserve order:

scala> val a= List((1,2),(2,3),(1,4),(2,4),(5,6),(4,5),(1,8))
a: List[(Int, Int)] = List((1,2), (2,3), (1,4), (2,4), (5,6), (4,5), (1,8))

scala> collection.immutable.ListMap(a.reverse: _*).toList.reverse
res0: List[(Int, Int)] = List((1,2), (2,3), (5,6), (4,5))


回答4:

scala> val a = List((1,2),(2,3),(1,4),(2,4),(5,6),(4,5),(1,8))
a: List[(Int, Int)] = List((1,2), (2,3), (1,4), (2,4), (5,6), (4,5), (1,8))
scala> a.reverse.toMap.toList
res7: List[(Int, Int)] = List((1,2), (4,5), (5,6), (2,3))

Doesn't preserve the order, but quite concise.



回答5:

What you're asking for is almost .distinct, with a different test for "duplicate". Let's see how the standard library does .distinct:

 /** Builds a new $coll from this $coll without any duplicate elements.
   *  $willNotTerminateInf
   *
   *  @return  A new $coll which contains the first occurrence of every element of this $coll.
   */
  def distinct: Repr = {
    val b = newBuilder
    val seen = mutable.HashSet[A]()
    for (x <- this) {
      if (!seen(x)) {
        b += x
        seen += x
      }
    }
    b.result()
  }

We can do something a little less generic, but along the same lines:

  import scala.collection.mutable.{ListBuffer, HashSet}

  def distinctByFirst[A,B](xs:List[(A, B)]) = {
    val b = new ListBuffer[(A,B)]
    val seen = HashSet[A]()
    for (x <- xs) {
      if (!seen(x._1)) {
        b += x
        seen += x._1
      }
    }
    b.result()
  }

  distinctByFirst(a)
  //> res0: List[(Int, Int)] = List((1,2), (2,3), (5,6), (4,5))


标签: scala