A discussion came up at work recently about Sets, which in Scala support the zip
method, and how this can lead to bugs, e.g.
scala> val words = Set("one", "two", "three")
scala> words zip (words map (_.length))
res1: Set[(java.lang.String, Int)] = Set((one,3), (two,5))
I think it's pretty clear that Sets shouldn't support a zip operation, since the elements are not ordered. However, it was suggested that the problem is that Set isn't really a functor, and shouldn't have a map method. Certainly, you can get yourself into trouble by mapping over a set. Switching to Haskell now,
data AlwaysEqual a = Wrap { unWrap :: a }

instance Eq (AlwaysEqual a) where
  _ == _ = True

instance Ord (AlwaysEqual a) where
  compare _ _ = EQ
and now in ghci
ghci> import Data.Set as Set
ghci> let nums = Set.fromList [1, 2, 3]
ghci> Set.map unWrap $ Set.map Wrap $ nums
fromList [3]
ghci> Set.map (unWrap . Wrap) nums
fromList [1,2,3]
So Set fails to satisfy the functor law
fmap f . fmap g = fmap (f . g)
It can be argued that this is not a failing of the map operation on Sets, but a failing of the Eq instance that we defined, because it doesn't respect the substitution law, namely that for two types A and B with Eq instances and any mapping f : A -> B,
if x == y (on A) then f x == f y (on B)
which doesn't hold for AlwaysEqual (e.g. consider f = unWrap).
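To make the violation concrete, here is a minimal sketch that tests the substitution law on a single pair of values. The helper respectsSubstitution and the names counterexample and harmless are made up for this illustration; AlwaysEqual is restated so the block stands alone.

```haskell
-- Restates the AlwaysEqual type from above so the block is self-contained.
data AlwaysEqual a = Wrap { unWrap :: a }

instance Eq (AlwaysEqual a) where
  _ == _ = True

-- Hypothetical helper: checks "x == y implies f x == f y" for one pair.
respectsSubstitution :: (Eq a, Eq b) => (a -> b) -> a -> a -> Bool
respectsSubstitution f x y = not (x == y) || f x == f y

-- Wrap 1 == Wrap 2 holds, but unWrap sends them to 1 and 2, so this is False.
counterexample :: Bool
counterexample = respectsSubstitution unWrap (Wrap (1 :: Int)) (Wrap 2)

-- A function that ignores the wrapped value cannot tell them apart: True.
harmless :: Bool
harmless = respectsSubstitution (const ()) (Wrap (1 :: Int)) (Wrap 2)
```

The law only breaks for functions that can observe the difference between values the Eq instance equates, which is exactly what unWrap does.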
Is the substitution law a sensible law for the Eq class that we should try to respect? Certainly, the other equality laws are respected by our AlwaysEqual type (symmetry, transitivity and reflexivity are trivially satisfied), so substitution is the only place where we can get into trouble.
To me, substitution seems like a very desirable property for the Eq class. On the other hand, some comments on a recent Reddit discussion include
"Substitution seems stronger than necessary, and is basically quotienting the type, putting requirements on every function using the type."
-- godofpumpkins
"I also really don't want substitution/congruence since there are many legitimate uses for values which we want to equate but are somehow distinguishable."
-- sclv
"Substitution only holds for structural equality, but nothing insists Eq is structural."
-- edwardkmett
These three are all pretty well known in the Haskell community, so I'd be hesitant to go against them and insist on substitutability for my Eq types!
Another argument against Set being a Functor: it is widely accepted that being a Functor allows you to transform the "elements" of a "collection" while preserving the shape. For example, this quote on the Haskell wiki (note that every Traversable is also a Functor, with traverse generalizing fmap)
"Where Foldable gives you the ability to go through the structure processing the elements but throwing away the shape, Traversable allows you to do that whilst preserving the shape and, e.g., putting new values in."
"Traversable is about preserving the structure exactly as-is."
and in Real World Haskell
"...[A] functor must preserve shape. The structure of a collection should not be affected by a functor; only the values that it contains should change."
Clearly, any functor instance for Set can change the shape, by reducing the number of elements in the set whenever the mapped function sends two distinct elements to the same value.
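As a small illustration of the shape change, the sketch below maps the same non-injective function over a list and over a Data.Set. The names parity, listSizes and setSizes are invented for this example.

```haskell
import qualified Data.Set as Set

-- A non-injective function: many inputs share each output.
parity :: Int -> Int
parity n = n `mod` 2

-- Mapping over a list keeps one output per input: (4,4).
listSizes :: (Int, Int)
listSizes = (length xs, length (fmap parity xs))
  where xs = [1, 2, 3, 4]

-- Mapping over a set merges equal outputs: (4,2).
setSizes :: (Int, Int)
setSizes = (Set.size s, Set.size (Set.map parity s))
  where s = Set.fromList [1, 2, 3, 4]
```

The list keeps its length, while the set shrinks from four elements to two, so the number of elements is not preserved by Set.map.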
But it seems as though Sets really should be functors (ignoring the Ord requirement for the moment: I see that as an artificial restriction imposed by our desire to work efficiently with sets, not an absolute requirement for any set. For example, sets of functions are a perfectly sensible thing to consider. In any case, Oleg has shown how to write efficient Functor and Monad instances for Set that don't require an Ord constraint). There are just too many nice uses for them (the same is true for the non-existent Monad instance).
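To show that a lawful Functor instance without an Ord constraint is at least possible, here is a deliberately naive sketch. It is not Oleg's construction and makes no attempt at efficiency: the hypothetical ListSet type keeps a plain list and defers all comparisons until the result is converted back to a Data.Set. ListSet, fromSet, toSet and roundTrip are names invented for this example.

```haskell
import qualified Data.Set as Set

-- Hypothetical type: a set represented as a list that may hold duplicates.
newtype ListSet a = ListSet [a]

-- No Eq or Ord constraint is needed, because nothing is compared here.
instance Functor ListSet where
  fmap f (ListSet xs) = ListSet (map f xs)

fromSet :: Set.Set a -> ListSet a
fromSet = ListSet . Set.toList

-- Duplicates are removed only here, where an Ord instance is available.
toSet :: Ord a => ListSet a -> Set.Set a
toSet (ListSet xs) = Set.fromList xs

-- The AlwaysEqual type from earlier in the post, restated.
data AlwaysEqual a = Wrap { unWrap :: a }

instance Eq (AlwaysEqual a) where
  _ == _ = True

instance Ord (AlwaysEqual a) where
  compare _ _ = EQ

-- The intermediate ListSet (AlwaysEqual Int) is never compared,
-- so all three elements survive the round trip.
roundTrip :: Set.Set Int
roundTrip = toSet (fmap unWrap (fmap Wrap (fromSet (Set.fromList [1, 2, 3]))))
```

Because fmap is just map on the underlying list, the composition law holds for this representation; the price is that duplicates pile up until toSet is called.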
Can anyone clear up this mess? Should Set be a Functor? If so, what does one do about the potential for breaking the Functor laws? What should the laws for Eq be, and how do they interact with the laws for Functor and the Set instance in particular?