How do I sort a collection of Lists in lexicograph

2020-02-05 03:49发布

问题:

If A has the Ordered[A] trait, I'd like to be able to have code that works like this

val collection: List[List[A]] = ... // construct a list of lists of As
val sorted = collection sort { _ < _ }

and get something where the lists have been sorted in lexicographic order. Of course, just because A has the trait Ordered[A] doesn't mean that List[A] has the trait Ordered[List[A]]. Presumably, however, the 'scala way' to do this is with an implicit def.

How do I implicitly convert a List[A] to a Ordered[List[A]], assuming A has the trait Ordered[A] (so that the code above just works)?

I have in mind using the lexicographic ordering on List[A] objects, but I'd like code that can be adapted to others orderings.

回答1:

Inspired by Ben Lings' answer, I wrote my own version of sort:

def sort[A : Ordering](coll: Seq[Iterable[A]]) = coll.sorted

which is equivalent to:

def sort[A](coll: Seq[Iterable[A]])(implicit ordering: Ordering[A]) = coll.sorted

Note that ordering is implicitly converted to Ordering[Iterable[A]].

Examples:

scala> def sort[A](coll: Seq[Iterable[A]])(implicit ordering: Ordering[A]) = coll.sorted
sort: [A](coll: Seq[Iterable[A]])(implicit ordering: Ordering[A])Seq[Iterable[A]]

scala> val coll = List(List(1, 3), List(1, 2), List(0), Nil, List(2))
coll: List[List[Int]] = List(List(1, 3), List(1, 2), List(0), List(), List(2))

scala> sort(coll)
res1: Seq[Iterable[Int]] = List(List(), List(0), List(1, 2), List(1, 3), List(2))

It was asked how to supply your own comparison function (say, _ > _ instead of _ < _). It suffices to use Ordering.fromLessThan:

scala> sort(coll)(Ordering.fromLessThan(_ > _))
res4: Seq[Iterable[Int]] = List(List(), List(2), List(1, 3), List(1, 2), List(0))

Ordering.by allows you to map your value into another type for which there is already an Ordering instance. Given that also tuples are ordered, this can be useful for lexicographical comparison of case classes.

To make an example, let's define a wrapper of an Int, apply Ordering.by(_.v), where _.v extracts the underlying value, and show that we obtain the same result:

scala> case class Wrap(v: Int)
defined class Wrap

scala> val coll2 = coll.map(_.map(Wrap(_)))
coll2: List[List[Wrap]] = List(List(Wrap(1), Wrap(3)), List(Wrap(1), Wrap(2)), List(Wrap(0)), List(), List(Wrap(2)))

scala> sort(coll2)(Ordering.by(_.v))
res6: Seq[Iterable[Wrap]] = List(List(), List(Wrap(0)), List(Wrap(1), Wrap(2)), List(Wrap(1), Wrap(3)), List(Wrap(2)))

Finally, let's do the same thing on a case class with more members, reusing the comparators for Tuples:

scala> case class MyPair(a: Int, b: Int)
defined class MyPair

scala> val coll3 = coll.map(_.map(MyPair(_, 0)))
coll3: List[List[MyPair]] = List(List(MyPair(1,0), MyPair(3,0)), List(MyPair(1,0), MyPair(2,0)), List(MyPair(0,0)), List(), List(MyPair(2,0)))

scala> sort(coll3)(Ordering.by(x => (x.a, x.b)))
res7: Seq[Iterable[MyPair]] = List(List(), List(MyPair(0,0)), List(MyPair(1,0), MyPair(2,0)), List(MyPair(1,0), MyPair(3,0)), List(MyPair(2,0)))

EDIT:

My definition of sort above is deprecated in 2.13:

warning: method Iterable in object Ordering is deprecated (since 2.13.0):
Iterables are not guaranteed to have a consistent order; if using a type
with a consistent order (e.g. Seq), use its Ordering (found in the
Ordering.Implicits object)

Use instead:

def sort[A](coll: Seq[Seq[A]])(implicit ordering: Ordering[A]) = {
  import Ordering.Implicits._
  coll.sorted
}


回答2:

Inspired by Ben Lings' answer, I managed to work out what seems like the simplest way to sort lists lexicographically: add the line

import scala.math.Ordering.Implicits._

before doing your List[Int] comparison, to ensure that the implicit function infixOrderingOps is in scope.



回答3:

(11 minutes ago I actually didn't know how to do this, I hope it's considered okay to answer my own question.)

implicit def List2OrderedList[A <% Ordered[A]](list1: List[A]): Ordered[List[A]] = { 
    new Ordered[List[A]] {
        def compare(list2: List[A]): Int = {
            for((x,y) <- list1 zip list2) {
                val c = x compare y
                if(c != 0) return c
            }
            return list1.size - list2.size
        }
    }
}

An important thing to note here is the 'view bound' A <% Ordered[A], which ensures that A needn't itself by an Ordered[A], just that there's a way to do this conversion. Happily, the Scala library's object Predef has an implicit conversion from Ints to RichInts, which in particular are Ordered[Int]s.

The rest of the code is just implementing lexicographic ordering.



回答4:

In 2.8, you should be able to just do collection.sorted. sorted takes an implicit Ordering parameter. Any type that implements Ordered has a corresponding Ordering (thanks to the implicit conversion Ordering.ordered). There is also the implicit Ordering.Iterable that makes an Iterable[T] have an Ordering if T has an Ordering.

However, if you try this it doesn't work:

scala> def sort[A <: Ordered[A]](coll: List[List[A]]) = coll.sorted

<console>:5: error: could not find implicit value for parameter ord: Ordering[List[A]]
       def sort[A <: Ordered[A]](coll: List[List[A]]) = coll.sorted
                                                             ^

You need to explicitly specify that you want the Ordering[Iterable[A]]:

def sort[A <: Ordered[A]](coll: List[List[A]]) = coll.sorted[Iterable[A]]

I'm not sure why the compiler can't find the Ordering[Iterable[A]] if the element type of the collection is List[A].



回答5:

Inspired by Daniel's comment, here is a recursive version:

implicit def toOrdered[A <% Ordered[A]](list1: List[A]): Ordered[List[A]] = { 
  @scala.annotation.tailrec
  def c(list1:List[A], list2:List[A]): Int = {
    (list1, list2) match {
      case (Nil, Nil) => 0
      case (x::xs, Nil) => 1
      case (Nil, y::ys) => -1
      case (x::xs, y::ys) => (x compare y) match {
        case 0 => c(xs, ys)
        case i => i
      }
    }
  }
  new Ordered[List[A]] {
    def compare(list2: List[A]): Int = c(list1, list2)
  }
}

With respect to the comment: I used to think it's more a matter of taste. Sometimes it's easier to verify correctness on a recursive function, and certainly your version is short enough that there is no compelling reason to prefer recursive.

I was intrigued by the performance implications though. So I tried to benchmark it: see http://gist.github.com/468435. I was surprised to see that the recursive version is faster (assuming I did the benchmark correctly). The results still hold true for list of about length 10.



回答6:

Just because I already implemented this another way, here is a non-recursive version that does not use return:

new Ordering[Seq[String]]() {
  override def compare(x: Seq[String], y: Seq[String]): Int = {
    x.zip(y).foldLeft(None: Option[Int]){ case (r, (v, w)) =>
        if(r.isDefined){
          r
        } else {
          val comp = v.compareTo(w)
          if(comp == 0) None
          else Some(comp)
        }
      }.getOrElse(x.size.compareTo(y.size))
    }
  }