-->

Compute an (infinite) tree from fixpoint operator

2020-06-23 06:48发布

问题:

Here is a functional programming puzzle involving loop-tying and infinite data structures. There is a bit of background, so hang tight.

The setting. Let us define a data type representing recursive data types:

type Var = String
data STerm = SMu Var STerm
           | SVar Var
           | SArrow STerm STerm
           | SBottom
           | STop
   deriving (Show)

i.e. t ::= μα. t | α | t → t | ⊥ | ⊤. Note that ⊥ denotes the type with no inhabitants, while ⊤ denotes the type with all inhabitants. Note that (μα. α) = ⊥, as μ is a least fixpoint operator.

We can interpret a recursive data type as an infinite tree, arising from repeatedly unfolding μα. t to t[α ↦ μα. t]. (For a formal description of this process, see http://lucacardelli.name/Papers/SRT.pdf) In Haskell, we can define a type of lazy trees, which don't have μ-binders or variables:

data LTerm = LArrow LTerm LTerm
           | LBottom
           | LTop
   deriving (Show)

and, in ordinary Haskell, a conversion function from one to the other:

convL :: [(Var, LTerm)] -> STerm -> LTerm
convL _ STop    = LTop
convL _ SBottom = LBottom
convL ctx (SArrow t1 t2) = LArrow (convL ctx t1) (convL ctx t2)
convL ctx (SVar v)
    | Just l <- lookup v ctx = l
    | otherwise = error "unbound variable"
convL ctx (SMu v t) = fix (\r -> convL ((v, r) : ctx) t)

However, there is a problem with this function: it's not productive. If you run convL [] (SMu "x" (SVar "x")) you will infinite loop. We would rather get LBottom in this case. An interesting exercise is to directly fix this function so that it is productive; however, in this question I want to solve the problem differently.

Productivity with the delay modality. When we build cyclic data structures as above, we need to make sure we do not use the results of our computations before we have constructed them. The delay modality is a way of ensuring that we write productive (non infinite looping) programs. The basic idea is simple: if the type Int means that I have an integer today, I define a type constructor D, so that D Int means that I have a value of type Int tomorrow. D is a Functor and an Applicative (but NOT a monad.)

-- D is abstract; you are not allowed to pattern match on it
newtype D a = D a
   deriving (Show)

instance Functor D where
    fmap f (D a) = D (f a)

instance Applicative D where
    D f <*> D a = D (f a)
    pure x = D x

With D, we define a fixpoint operator: it says that to construct a value of a, you can have access to the a you are constructing, as long as you only use it tomorrow.

fixD :: (D a -> a) -> a
fixD f = f (D (fixD f))

For example, a Stream consists both of a value a I have today, and a stream Stream a which I have to produce tomorrow.

data Stream a = Cons a (D (Stream a))

Using fixD, I can define a map function on streams which is guaranteed to be productive, since the recursive call to map is only used to produced values that are needed tomorrow.

instance Functor Stream where
    fmap f = fixD $ \go (Cons x xs) -> Cons (f x) (go <*> xs)

The problem. Here is a variant of LTerm with an explicit delay modality.

data Term = Arrow (D Term) (D Term)
          | Bottom
          | Top
    deriving (Show)

Using fixD (no non-structurally recursive references allowed), how do I write a function conv :: STerm -> Term (or conv :: STerm -> D Term)? A particularly interesting test case is SMu "x" (SArrow STop (SMu "y" (SVar "x"))); there should be no Bottoms in the resulting structure!

Update. I accidentally ruled out structural recursion on STerm, which was not the intent of the question; I've reworded to remove that restriction.

回答1:

Do you intend to forbid just the unrestricted recursion (fix) in the SMu case of convL, or also the structural recursion in the SArrow case?

I don’t think this has a solution without structural recursion on STerm, because then we would have to be productive even on an infinite STerm such as:

foldr (\n -> SMu ("x" ++ show n)) undefined [0..]  -- μα. μβ. μγ. μδ. …

To do this with structural recursion on STerm, it seems the trick is to store Either Term (D Term) in the context. When we pass through an Arrow and produce a D, we can convert all the Rights to Lefts.

type Ctx = [(Var, Either Term (D Term))]

dCtx :: Ctx -> D Ctx
dCtx = traverse (traverse (fmap Left . either pure id))

conv :: STerm -> Ctx -> Term
conv STop _ = Top
conv SBottom _ = Bottom
conv (SArrow t1 t2) ctx = Arrow (fmap (conv t1) (dCtx ctx)) (fmap (conv t2) (dCtx ctx))
conv (SVar v) ctx = case lookup v ctx of
                      Nothing -> error "unbound variable"
                      Just (Left t) -> t
                      Just (Right _) -> Bottom
conv (SMu v t) ctx = fixD (\dr -> conv t ((v, Right dr) : ctx))


回答2:

My intuition is that the context should contain only delayed terms. That way, conv ctx (SMu "x" t) will be equivalent to fixD (\d -> conv ((x,r):ctx) t), as in the original convL.

If this is the case, then you need a general way to include delayed terms in your data structure, instead of just allowing them in arrows:

data Term = Arrow Term Term
          | Bottom
          | Top
          | Next (D Term)

A first attempt at conv gives us:

conv :: [(Var, D Term)] -> STerm -> Term
conv _ STop = Top
conv _ SBottom = SBottom
conv ctx (SArrow t1 t2) = Arrow (conv ctx t1) (conv ctx t2)
conv ctx (SVar v)
  | Just l <- lookup v ctx = Next l
  | otherwise = error "unbound variable"
conv ctx (SMu v t) = fixD (\r -> conv ((x,r):ctx) t)

However, this uses unguarded recursive calls to conv. If you want to avoid that, you can wrap all the fixD recursive calls in a Next.

conv :: [(Var, D Term)] -> STerm -> Term
conv = fixD step where
        step _ _ STop = Top
        step _ _ SBottom = Bottom
        step d ctx (SArrow t1 t2) = Arrow (Next $ d <*> pure ctx <*> pure t1) 
                                          (Next $ d <*> pure ctx <*> pure t2)
        step d ctx (SVar v)
          | Just l <- lookup v ctx = Next l
          | otherwise = error "unbound variable"
        step d ctx (SMu v t) = fixD (\r -> 
                                    Next $ d <*> pure ((x,r):ctx) <*> pure t)

I'm not sure if this is exactly what you are looking for, because conv [] SMu "x" (SArrow SBottom (SMu "y" (SVar "x"))) does still have bottoms in the resulting structure. What is the type you want to get out?