In Scala, algebraic data types are encoded as sealed
one-level type hierarchies. Example:
-- Haskell
data Positioning a = Append
| AppendIf (a -> Bool)
| Explicit ([a] -> [a])
// Scala
sealed trait Positioning[A]
case object Append extends Positioning[Nothing]
case class AppendIf[A](condition: A => Boolean) extends Positioning[A]
case class Explicit[A](f: Seq[A] => Seq[A]) extends Positioning[A]
With case class
es and case object
s, Scala generates a bunch of things like equals
, hashCode
, unapply
(used by pattern matching) etc that brings us many of the key properties and features of traditional ADTs.
There is one key difference though – In Scala, "data constructors" have their own types. Compare the following two for example (Copied from the respective REPLs).
// Scala
scala> :t Append
Append.type
scala> :t AppendIf[Int](Function const true)
AppendIf[Int]
-- Haskell
haskell> :t Append
Append :: Positioning a
haskell> :t AppendIf (const True)
AppendIf (const True) :: Positioning a
I have always considered the Scala variation to be on the advantageous side.
After all, there is no loss of type information. AppendIf[Int]
for instance is a subtype of Positioning[Int]
.
scala> val subtypeProof = implicitly[AppendIf[Int] <:< Positioning[Int]]
subtypeProof: <:<[AppendIf[Int],Positioning[Int]] = <function1>
In fact, you get an additional compile time invariant about the value. (Could we call this a limited version of dependent typing?)
This can be put to good use – Once you know what data constructor was used to create a value, the corresponding type can be propagated through rest of the flow to add more type safety. For example, Play JSON, which uses this Scala encoding, will only allow you to extract fields
from JsObject
, not from any arbitrary JsValue
.
scala> import play.api.libs.json._
import play.api.libs.json._
scala> val obj = Json.obj("key" -> 3)
obj: play.api.libs.json.JsObject = {"key":3}
scala> obj.fields
res0: Seq[(String, play.api.libs.json.JsValue)] = ArrayBuffer((key,3))
scala> val arr = Json.arr(3, 4)
arr: play.api.libs.json.JsArray = [3,4]
scala> arr.fields
<console>:15: error: value fields is not a member of play.api.libs.json.JsArray
arr.fields
^
scala> val jsons = Set(obj, arr)
jsons: scala.collection.immutable.Set[Product with Serializable with play.api.libs.json.JsValue] = Set({"key":3}, [3,4])
In Haskell, fields
would probably have type JsValue -> Set (String, JsValue)
. Which means it will fail at runtime for a JsArray
etc. This problem also manifests in the form of well known partial record accessors.
The view that Scala's treatment of data constructors is wrong has been expressed numerous times – on Twitter, mailing lists, IRC, SO etc. Unfortunately I don't have links to any of those, except for a couple - this answer by Travis Brown, and Argonaut, a purely functional JSON library for Scala.
Argonaut consciously takes the Haskell approach (by private
ing case classes, and providing data constructors manually). You can see that the problem I mentioned with Haskell encoding exists with Argonaut as well. (Except it uses Option
to indicate partiality.)
scala> import argonaut._, Argonaut._
import argonaut._
import Argonaut._
scala> val obj = Json.obj("k" := 3)
obj: argonaut.Json = {"k":3}
scala> obj.obj.map(_.toList)
res6: Option[List[(argonaut.Json.JsonField, argonaut.Json)]] = Some(List((k,3)))
scala> val arr = Json.array(jNumber(3), jNumber(4))
arr: argonaut.Json = [3,4]
scala> arr.obj.map(_.toList)
res7: Option[List[(argonaut.Json.JsonField, argonaut.Json)]] = None
I have been pondering this for quite some time, but still do not understand what makes Scala's encoding wrong. Sure it hampers type inference at times, but that does not seem like a strong enough reason to decree it wrong. What am I missing?