Is Hask even a category?

2019-04-28 10:19发布

问题:

https://wiki.haskell.org/Hask:

Consider:

undef1 = undefined :: a -> b
undef2 = \_ -> undefined

Note that these are not the same value:

seq undef1 () = undefined
seq undef2 () = ()

This might be a problem, because undef1 . id = undef2. In order to make Hask a category, we define two functions f and g as the same morphism if f x = g x for all x. Thus undef1 and undef2 are different values, but the same morphism in Hask.

What does it mean or how can I check, that: undef1 and undef2 are different values, but the same morphism?

回答1:

In Haskell, we have this idea that every expression can be evaluated to a particular "value", and we might be interested in determining if two expressions have the "same" value.

Informally, we know that some values (e.g., value 2 and 3 of type Integer) can be compared directly. Other values, like sqrt and id of type Double -> Double can be compared, as @pigworker notes, by constructing an expression that "witnesses" a difference in directly comparable values:

sqrt 4 = 2
id 4 = 4

Here, we can conclude that sqrt and id are different values. If there is no such witness, then the values are the same.

If we look at the monomorphic specializations of undef1 and undef2 to the type () -> () given by:

undef1, undef2 :: () -> ()
undef1 = undefined
undef2 = \_ -> undefined

how can we tell if these are different values?

Well, we need to find an expression that witnesses a difference, and one is given above. The two expressions:

> seq undef1 ()
*** Exception: Prelude.undefined
> seq undef2 ()
()
>

have different values, according to GHCi. We could also show this directly, using our understanding of Haskell semantics:

seq undef1 ()
-- use defn of undef1
= seq undefined ()
-- seq semantics:  WHNF of undefined is _|_, so value is _|_
= _|_

seq undef2 ()
-- use defn of undef2
= seq (\_ -> undefined) ()
-- seq semantics: (\_ -> undefined) is already in WHNF and is not _|_,
--   so value is second arg ()
= ()

So, what's the problem? Well, when considering Hask as a category where the objects are types and the morphisms are (monomorphic) functions, we implicitly need a concept of identity/equality for objects and morphisms.

Object identity/equality is easy: two objects (monomorphic Haskell types) are equal if and only if they are the same type. Morphism identity/equality is harder. Because morphisms in Hask are Haskell values (of monomorphic function types), it would be tempting to define equality of morphisms to be the same as equality of values, as above.

If we used this definition, then undef1 and undef2 would be different morphisms, because we have proved they are different Haskell values above.

However, if we compared undef1 . id and undef2, we would discover that they have the same value. That is, there is no expression that witnesses a difference between them. Proving this is a little tough, but see below.

Anyway, we now have a contradiction in our Hask category theory. Because id is the (polymorphic family of) identity morphisms in Hask, we must have:

undef1 
= undef1 . id                            -- because `id` is identity
= undef2                                 -- same value

so we simultaneously have undef1 /= undef2 because of the witness above, but undef1 = undef2 by the previous argument.

The only way to avoid this contradiction is to give up the idea of defining equality of morphisms in Hask as equality of the underlying Haskell values.

One alternative definition of equality of morphisms in Hask that has been offered is the weaker definition that two morphisms f and g are equal if and only if they satisfy f x = g x for all values x (including _|_). Note that there's still an ambiguity here. If f x and g x are themselves Haskell functions and so morphisms, does f x = g x mean equality of the morphisms f x and g x or equality of the Haskell values f x and g x? Let's ignore this problem for now.

Under this alternative definition, undef1 and undef2 are equal as morphisms, because we can show undef1 x = undef2 x for all possible values x of type () (namely () and _|_). That is, applied to (), they give:

undef1 ()
-- defn of undef1
= undefined ()
-- application of an undefined function
= _|_

undef2 ()
-- defn of undef2
= (\_ -> undefined) ()
-- application of a lambda
= undefined
-- semantics of undefined
= _|_

and applied to _|_ they give:

undef1 _|_
-- defn of undef1
= undefined _|_
-- application of an undefined function
= _|_

undef2 _|_
-- defn of undef2
= (\_ -> undefined) _|_
-- application of a lambda
= undefined
-- semantics of undefined
= _|_

Similarly, undef1 . id and undef2 can be shown to be equal as morphisms in Hask by this definition (in fact, they were equal as Haskell values which implies they're equal according to the weaker definition of equality of Hask morphisms), so there's no contradiction.

However, if you follow the link provided by @n.m., you can see that there's more work to do in terms of formalizing the meaning of equality of Haskell values and precisely giving an appropriate definition of equality of Hask morphisms before we'd really feel comfortable believing that there is a contradiction-free Hask category.

Proof that undef1 . id = undef2 as Haskell Values

For the reasons above, this proof is necessarily a little informal, but here's the idea.

If we are trying to witness a difference between functions f and g, the only way a witnessing expression can use these values is by applying them to a value x or by evaluating them to WHNF using seq. If f and g are known to be equal as Hask morphisms, then we already have f x = g x for all x, so no expression can witness a difference based on application. The only remaining thing to check is that when they are evaluated to WHNF they are either both defined (in which case, by the previous assumption, they will yield the same values when applied) or they are both undefined.

So, for undef1 . id and undef2, we just need to ensure that they are either both defined or both undefined when evaluated to WHNF. It's easy to see that they both, in fact, have defined WHNFs:

undef1 . id 
-- defn of composition
= \x -> undef1 (id x)
-- this is a lambda, so it is in WHNF and is defined

undef2
-- defn of undef2
= \_ -> undefined
-- this is a lambda, so it is in WHNF and is defined

We already established above that undef1 x = undef2 x for all x. Technically, we ought to show that:

(undef1 . id) x
-- defn of composition
= (\x -> undef1 (id x)) x
-- lambda application
= undef1 (id x)
-- defn of id
= undef1 x

with equality as Haskell values for all x, which establishes that (undef1 . id) x = undef2 x for all x. Together with the fact that both have defined WHNFs above, this is enough to show that undef1 . id = undef2 with equality as Haskell values.