Scala: type-based list partitioning

2020-03-04 08:53发布

问题:

I have this code that I would like to improve on:

sealed abstract class A
case class B() extends A
case class C() extends A
case class D() extends A

case class Foo[+T <: A](a: T)

/** Puts instances that match Foo(B()) in the first list and everything else,
  * i.e. Foo(C()) and Foo(D()), in the second list. */
def partition(foos: List[Foo[_ <: A]]): (List[Foo[B]], List[Foo[_ <: A]]) = {
  // ...
}

I would like to improve on this in the following respects:

  1. Can I change partition's return type so that it states that there is no Foo[B] in the second list?
  2. Can I get rid of Foo's type parameter T (i.e. change Foo to case class Foo(a: A)) and still declare partition with the same type guarantees? (Obviously, it would have to return something different than (List[Foo], List[Foo]).)

P.S.: Let me know if "shapeless" tag is not relevant for this question.

回答1:

This question is a little tricky because of the way that Scala mixes up algebraic data types (like your A) and subtyping. In most languages with ADTs, B, C, and D wouldn't be types at all—they'd just be "constructors" (in a sense that's similar to but not the same as OOP constructors).

It wouldn't make sense to talk about a Foo[B] in these languages (like Haskell or OCaml), but in Scala you can, because Scala implements ADTs as case classes (and class objects) extending a base trait or class. That doesn't mean you should go around talking about Foo[B], though, and in general if you want to think in FP terms and use the type system to your advantage, it's a good idea not to.

To answer your specific questions:

  1. No, not really in any convenient way. You could use a tagged union (a list with Either[Foo[C], Foo[D]] elements) or something like Shapeless's Coproduct (a list with Foo[C] :+: Foo[D] :+: CNil elements) to represent "a list of things of type A, but not B", but both approaches are fairly heavy machinery for something that probably isn't the best idea in the first place.
  2. I'd recommend not parametrizing Foo on the subtype of A, but if you want to be able to represent "a Foo that contains a B" at the type level, you're going to need to keep your current approach.

To address your postscript: if you want to generalize over ADTs, Shapeless is definitely applicable—see my blog post here about partitioning by constructor, for example. If you're only doing this for A, though, Shapeless probably won't buy you much.


As a footnote, if I really needed a partitioning operation that split out elements of type Foo[B], I'd probably write it like this:

def partition(foos: List[Foo[A]]): (List[Foo[B]], List[Foo[A]]) =
  foos.foldRight((List.empty[Foo[B]], List.empty[Foo[A]])) {
    case (Foo(B()), (bs, others)) => (Foo(B()) :: bs, others)
    case (other, (bs, others)) => (bs, other :: others)
  }

It's not ideal—if we really want a List[Foo[B]], it'd be nice to have a List[Foo[~B]] to represent the leftovers—but it's not too bad.



回答2:

What I'm about to show is not a very flexible solution. Its usefulness depends entirely on what you're trying to model, but I thought it would be an interesting approach nevertheless.

You could introduce a second trait, say T, in the hierarchy and extend all non-B case classes from it.

sealed trait A
sealed trait T extends A

case class B() extends A
case class C() extends A with T
case class D() extends A with T

case class Foo[+T <: A](a: A)

def partition(foos: List[Foo[A]]): (List[Foo[B]], List[Foo[T]]) = ???