I was wondering to what extent Functor
instances in Haskell are determined (uniquely) by the functor laws.
Since ghc
can derive Functor
instances for at least "run-of-the-mill" data types, it seems that they must be unique at least in a wide variety of cases.
For convenience, the Functor
definition and functor laws are:
class Functor f where
fmap :: (a -> b) -> f a -> f b
fmap id = id
fmap (g . h) = (fmap g) . (fmap h)
Questions:
Can one derive the definition of map
starting from the assumption that it is a Functor
instance for data List a = Nil | Cons a (List a)
? If so, what assumptions have to be made in order to do this?
Are there any Haskell data types which have more than one Functor
instances which satisfy the functor laws?
When can ghc
derive a functor
instance and when can't it?
Does all of this depend how we define equality? The Functor
laws are expressed in terms of an equality of values, yet we don't require Functors
to have Eq
instances. So is there some choice here?
Regarding equality, there is certainly a notion of what I call "constructor equality" which allows us to reason that [a,a,a]
is "equal" to [a,a,a]
for any value of a
of any type even if a
does not have (==)
defined for it. All other (useful) notions of equality are probably coarser that this equivalence relationship. But I suspect that the equality in the Functor
laws are more of an "reasoning equality" relationship and can be application specific. Any thoughts on this?
See Brent Yorgey's Typeclassopedia:
Unlike some other type classes we will encounter, a given type has at most one valid instance of Functor. This can be proven via the free theorem for the type of fmap. In fact, GHC can automatically derive Functor instances for many data types.
"When can GHC derive a functor instance and when can't it?"
When we have intentional circular data structures. The type system does not allow us to express our intent of a forced circularity.
So, ghc can derive an instance, similar to that which we want, but not the same.
Circular data structures
are probably the only case where the Functor should be implemented a different way.
But then again, it would have the same semantic.
data HalfEdge a = HalfEdge { label :: a , companion :: HalfEdge a }
instance Functor HalfEdge where
fmap f (HalfEdge a (HalfEdge b _)) = fix $ HalfEdge (f a) . HalfEdge (f b)
EDIT:
HalfEdges are structures (known in computer graphics, 3d meshes...) that represent undirected Edges in a graph, where you can have a reference to either end. Usually they store more references to neighbour HalfEdges, Nodes and Faces.
newEdge :: a -> a -> HalfEdge a
newEdge a b = fix $ HalfEdge a . HalfEdge b
Semantically, there is no fix $ HalfEdge 0 . HalfEdge 1 . HalfEdge 2
, because edges are always composed out of exactly two half edges.
EDIT 2:
In the haskell community, the quote "Tying the Knot" is known for this kind of data structure. It is about data structures that are semantically infinite because they cycle. They consume only finite memory. Example: given ones = 1:ones
, we will have these semantically equivalent implementations of twos
:
twos = fmap (+1) ones
twos = fix ((+1)(head ones) :)
If we traverse (first n elements of) twos
and still have a reference to the begin of that list, these implementations differ in speed (evaluate 1+1 each time vs only once) and memory consumption (O(n) vs O(1)).