How to decode an ADT with circe without disambigua

Posted 2019-03-08 04:11

Question:

Suppose I've got an ADT like this:

sealed trait Event

case class Foo(i: Int) extends Event
case class Bar(s: String) extends Event
case class Baz(c: Char) extends Event
case class Qux(values: List[String]) extends Event

The default generic derivation for a Decoder[Event] instance in circe expects the input JSON to include a wrapper object that indicates which case class is represented:

scala> import io.circe.generic.auto._, io.circe.parser.decode, io.circe.syntax._
import io.circe.generic.auto._
import io.circe.parser.decode
import io.circe.syntax._

scala> decode[Event]("""{ "i": 1000 }""")
res0: Either[io.circe.Error,Event] = Left(DecodingFailure(CNil, List()))

scala> decode[Event]("""{ "Foo": { "i": 1000 }}""")
res1: Either[io.circe.Error,Event] = Right(Foo(1000))

scala> (Foo(100): Event).asJson.noSpaces
res2: String = {"Foo":{"i":100}}

This behavior means that we never have to worry about ambiguities if two or more case classes have the same member names, but it's not always what we want—sometimes we know the unwrapped encoding would be unambiguous, or we want to disambiguate by specifying the order each case class should be tried, or we just don't care.

How can I encode and decode my Event ADT without the wrapper (preferably without having to write my encoders and decoders from scratch)?

(This question comes up fairly often—see e.g. this discussion with Igor Mazor on Gitter this morning.)

Answer 1:

Enumerating the ADT constructors

The most straightforward way to get the representation you want is to use generic derivation for the case classes but explicitly defined instances for the ADT type:

import cats.syntax.functor._
import io.circe.{ Decoder, Encoder }, io.circe.generic.auto._
import io.circe.syntax._

sealed trait Event

case class Foo(i: Int) extends Event
case class Bar(s: String) extends Event
case class Baz(c: Char) extends Event
case class Qux(values: List[String]) extends Event

object Event {
  implicit val encodeEvent: Encoder[Event] = Encoder.instance {
    case foo @ Foo(_) => foo.asJson
    case bar @ Bar(_) => bar.asJson
    case baz @ Baz(_) => baz.asJson
    case qux @ Qux(_) => qux.asJson
  }

  implicit val decodeEvent: Decoder[Event] =
    List[Decoder[Event]](
      Decoder[Foo].widen,
      Decoder[Bar].widen,
      Decoder[Baz].widen,
      Decoder[Qux].widen
    ).reduceLeft(_ or _)
}

Note that we have to call widen (which is provided by Cats's Functor syntax, which we bring into scope with the first import) on the decoders because the Decoder type class is not covariant. The invariance of circe's type classes is a matter of some controversy (Argonaut for example has gone from invariant to covariant and back), but it has enough benefits that it's unlikely to change, which means we need workarounds like this occasionally.
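
To make the invariance point concrete, here is a minimal sketch for the REPL or a worksheet. It uses a trimmed-down Event with only Foo, and a hand-written decodeFoo built with Decoder.forProduct1 so that no generic derivation is involved; the names decodeFoo, viaWiden and viaMap are made up for this illustration. It suggests that widen only changes the static type and behaves like mapping with an upcast:

```scala
import cats.syntax.functor._
import io.circe.Decoder
import io.circe.parser.decode

sealed trait Event
case class Foo(i: Int) extends Event

// Hand-written decoder for the subtype (illustrative; the answer above uses generic derivation).
val decodeFoo: Decoder[Foo] = Decoder.forProduct1("i")(Foo.apply)

// This would not compile, because Decoder is invariant in its type parameter:
// val bad: Decoder[Event] = decodeFoo

// widen only changes the static type; mapping with an upcast gives the same result.
val viaWiden: Decoder[Event] = decodeFoo.widen[Event]
val viaMap: Decoder[Event] = decodeFoo.map(foo => foo: Event)

val json = """{ "i": 1000 }"""

// Pass each decoder explicitly so it is clear which one is being used.
val widened = decode[Event](json)(viaWiden)
val mapped = decode[Event](json)(viaMap)
```

Both widened and mapped should hold the same successfully decoded Foo.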

It's also worth noting that our explicit Encoder and Decoder instances will take precedence over the generically-derived instances we'd otherwise get from the io.circe.generic.auto._ import (see my slides here for some discussion of how this prioritization works).

We can use these instances like this:

scala> import io.circe.parser.decode
import io.circe.parser.decode

scala> decode[Event]("""{ "i": 1000 }""")
res0: Either[io.circe.Error,Event] = Right(Foo(1000))

scala> (Foo(100): Event).asJson.noSpaces
res1: String = {"i":100}

This works, and if you need to be able to specify the order in which the ADT constructors are tried, it's currently the best solution. Having to enumerate the constructors like this is obviously not ideal, though, even if we get the case class instances for free.
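
As an illustration of why the order can matter, here is a small sketch with a hypothetical Command ADT (not part of the question) whose two constructors have the same JSON shape. It uses semi-automatic derivation instead of the auto import, purely to keep the example self-contained. Whichever decoder comes first in the `or` chain wins:

```scala
import cats.syntax.functor._
import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder
import io.circe.parser.decode

// Hypothetical ADT: both constructors decode from the same JSON shape.
sealed trait Command
case class Start(name: String) extends Command
case class Stop(name: String) extends Command

val decodeStart: Decoder[Command] = deriveDecoder[Start].widen[Command]
val decodeStop: Decoder[Command] = deriveDecoder[Stop].widen[Command]

// `or` falls back to the second decoder only if the first one fails.
val startFirst: Decoder[Command] = decodeStart or decodeStop
val stopFirst: Decoder[Command] = decodeStop or decodeStart

val json = """{ "name": "job-1" }"""

val a = decode[Command](json)(startFirst) // decodes as Start
val b = decode[Command](json)(stopFirst)  // decodes as Stop
```

The same input yields a different constructor depending only on the order of the alternatives, which is the kind of control the explicit list gives us.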

A more generic solution

As I note on Gitter, we can avoid the fuss of writing out all the cases by using the circe-shapes module:

import io.circe.{ Decoder, Encoder }, io.circe.generic.auto._
import io.circe.shapes
import shapeless.{ Coproduct, Generic }

implicit def encodeAdtNoDiscr[A, Repr <: Coproduct](implicit
  gen: Generic.Aux[A, Repr],
  encodeRepr: Encoder[Repr]
): Encoder[A] = encodeRepr.contramap(gen.to)

implicit def decodeAdtNoDiscr[A, Repr <: Coproduct](implicit
  gen: Generic.Aux[A, Repr],
  decodeRepr: Decoder[Repr]
): Decoder[A] = decodeRepr.map(gen.from)

sealed trait Event

case class Foo(i: Int) extends Event
case class Bar(s: String) extends Event
case class Baz(c: Char) extends Event
case class Qux(values: List[String]) extends Event

And then:

scala> import io.circe.parser.decode, io.circe.syntax._
import io.circe.parser.decode
import io.circe.syntax._

scala> decode[Event]("""{ "i": 1000 }""")
res0: Either[io.circe.Error,Event] = Right(Foo(1000))

scala> (Foo(100): Event).asJson.noSpaces
res1: String = {"i":100}

This will work for any ADT anywhere that encodeAdtNoDiscr and decodeAdtNoDiscr are in scope. If we wanted it to be more limited, we could replace the generic A with our ADT types in those definitions, or we could make the definitions non-implicit and define implicit instances explicitly for the ADTs we want encoded this way.

The main drawback of this approach (apart from the extra circe-shapes dependency) is that the constructors will be tried in alphabetical order, which may not be what we want if we have ambiguous case classes (where the member names and types are the same).

The future

The generic-extras module provides a little more configurability in this respect. We can write the following, for example:

import io.circe.generic.extras.auto._
import io.circe.generic.extras.Configuration

implicit val genDevConfig: Configuration =
  Configuration.default.withDiscriminator("what_am_i")

sealed trait Event

case class Foo(i: Int) extends Event
case class Bar(s: String) extends Event
case class Baz(c: Char) extends Event
case class Qux(values: List[String]) extends Event

And then:

scala> import io.circe.parser.decode, io.circe.syntax._
import io.circe.parser.decode
import io.circe.syntax._

scala> (Foo(100): Event).asJson.noSpaces
res0: String = {"i":100,"what_am_i":"Foo"}

scala> decode[Event]("""{ "i": 1000, "what_am_i": "Foo" }""")
res1: Either[io.circe.Error,Event] = Right(Foo(1000))

Instead of a wrapper object in the JSON we have an extra field that indicates the constructor. This isn't the default behavior since it has some weird corner cases (e.g. if one of our case classes had a member named what_am_i), but in many cases it's reasonable and it's been supported in generic-extras since that module was introduced.
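
To show what a discriminator field amounts to, here is a hand-rolled sketch in plain circe. It is not how generic-extras implements the feature; the helper names (tag, encodeFoo, decodeFoo and so on) are made up for this illustration, and only two of the constructors are included to keep it short:

```scala
import io.circe.{ Decoder, DecodingFailure, Encoder, Json }

sealed trait Event
case class Foo(i: Int) extends Event
case class Bar(s: String) extends Event

// Name of the extra field; a case class member with this name would collide with it.
val tag = "what_am_i"

// Hand-written instances for the constructors (illustrative, no generic derivation).
val encodeFoo: Encoder[Foo] = Encoder.forProduct1("i")((foo: Foo) => foo.i)
val encodeBar: Encoder[Bar] = Encoder.forProduct1("s")((bar: Bar) => bar.s)
val decodeFoo: Decoder[Foo] = Decoder.forProduct1("i")(Foo.apply)
val decodeBar: Decoder[Bar] = Decoder.forProduct1("s")(Bar.apply)

// Encode the constructor's fields, then add the discriminator to the same object.
val encodeEvent: Encoder[Event] = Encoder.instance {
  case foo: Foo => encodeFoo(foo).mapObject(_.add(tag, Json.fromString("Foo")))
  case bar: Bar => encodeBar(bar).mapObject(_.add(tag, Json.fromString("Bar")))
}

// Read the discriminator first, then decode the same object with the matching decoder.
val decodeEvent: Decoder[Event] = Decoder.instance { c =>
  c.downField(tag).as[String].flatMap {
    case "Foo" => decodeFoo(c)
    case "Bar" => decodeBar(c)
    case other => Left(DecodingFailure(s"Unknown constructor: $other", c.history))
  }
}

val encoded: Json = encodeEvent(Foo(100))
val roundTripped = decodeEvent.decodeJson(encoded)
```

Because the discriminator lives in the same object as the constructor's own fields, a member named what_am_i would clash with it, which is the corner case mentioned above.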

This still doesn't get us exactly what we want, but it's closer than the default behavior. I've also been considering changing withDiscriminator to take an Option[String] instead of a String, with None indicating that we don't want an extra field indicating the constructor, giving us the same behavior as our circe-shapes instances in the previous section.

If you're interested in seeing this happen, please open an issue, or (even better) a pull request. :)