How can I recover sharing in a GADT?

2019-03-15 00:30发布

问题:

In Type-Safe Observable Sharing in Haskell Andy Gill shows how to recover sharing that existed on the Haskell level, in a DSL. His solution is implemented in the data-reify package. Can this approach be modified to work with GADTs? For example, given this GADT:

data Ast e where
  IntLit :: Int -> Ast Int
  Add :: Ast Int -> Ast Int -> Ast Int
  BoolLit :: Bool -> Ast Bool
  IfThenElse :: Ast Bool -> Ast e -> Ast e -> Ast e

I'd like to recover sharing by transforming the above AST to

type Name = Unique

data Ast2 e where
  IntLit2 :: Int -> Ast2 Int
  Add2 :: Ast2 Int -> Ast2 Int -> Ast2 Int
  BoolLit2 :: Bool -> Ast2 Bool
  IfThenElse2 :: Ast2 Bool -> Ast2 e -> Ast2 e -> Ast2 e
  Var :: Name -> Ast2 e

by the way of a function

recoverSharing :: Ast -> (Map Name, Ast2 e1, Ast2 e2)

(I'm not sure about the type of recoverSharing.)

Note that I don't care about introducing new bindings via a let construct, but only in recovering the sharing that existed on the Haskell level. That's why I have recoverSharing return a Map.

If it can't be done as reusable package, can it at least be done for specific GADT?

回答1:

Interesting puzzle! It turns out you can use data-reify with GADTs. What you need is a wrapper that hides the type in an existential. The type can later be retrieved by pattern matching on the Type data type.

data Type a where
  Bool :: Type Bool
  Int :: Type Int

data WrappedAst s where
  Wrap :: Type e -> Ast2 e s -> WrappedAst s

instance MuRef (Ast e) where
  type DeRef (Ast e) = WrappedAst
  mapDeRef f e = Wrap (getType e) <$> mapDeRef' f e
    where
      mapDeRef' :: Applicative f => (forall b. (MuRef b, WrappedAst ~ DeRef b) => b -> f u) -> Ast e -> f (Ast2 e u)
      mapDeRef' f (IntLit i) = pure $ IntLit2 i
      mapDeRef' f (Add a b) = Add2 <$> (Var Int <$> f a) <*> (Var Int <$> f b)
      mapDeRef' f (BoolLit b) = pure $ BoolLit2 b
      mapDeRef' f (IfThenElse b t e) = IfThenElse2 <$> (Var Bool <$> f b) <*> (Var (getType t) <$> f t) <*> (Var (getType e) <$> f e)

getVar :: Map Name (WrappedAst Name) -> Type e -> Name -> Maybe (Ast2 e Name)
getVar m t n = case m ! n of Wrap t' e -> (\Refl -> e) <$> typeEq t t'

Here's the whole code: https://gist.github.com/3590197

Edit: I like the use of Typeable in the other answer. So I did a version of my code with Typeable too: https://gist.github.com/3593585. The code is significantly shorter. Type e -> is replaced by Typeable e =>, which also has a downside: we no longer know that the possible types are limited to Int and Bool, which means there has to be a Typeable e constraint in IfThenElse.



回答2:

I will try show that this can be done for specific GADTs, using your GADT as an example.

I will use the Data.Reify package. This requires me to define a new data structure in which the recusive positions are replaced by a parameter.

data AstNode s where
  IntLitN :: Int -> AstNode s
  AddN :: s -> s -> AstNode s
  BoolLitN :: Bool -> AstNode s
  IfThenElseN :: TypeRep -> s -> s -> s -> AstNode s

Note that I remove a lot of type information that was available in the original GADT. For the first three constructors it is clear what the associated type was (Int, Int and Bool). For the last one I will remember the type using TypeRep (available in Data.Typeable). The instance for MuRef, required by the reify package, is shown below.

instance Typeable e => MuRef (Ast e) where
  type DeRef (Ast e)     = AstNode
  mapDeRef f (IntLit a)  = pure $ IntLitN a
  mapDeRef f (Add a b)   = AddN <$> f a <*> f b
  mapDeRef f (BoolLit a) = pure $ BoolLitN a
  mapDeRef f (IfThenElse a b c :: Ast e) = 
    IfThenElseN (typeOf (undefined::e)) <$> f a <*> f b <*> f c

Now we can use reifyGraph to recover sharing. However, a lot of type information was lost. Lets try to recover it. I altered your definition of Ast2 slightly:

data Ast2 e where
  IntLit2 :: Int -> Ast2 Int
  Add2 :: Unique -> Unique -> Ast2 Int
  BoolLit2 :: Bool -> Ast2 Bool
  IfThenElse2 :: Unique -> Unique -> Unique -> Ast2 e

The graph from the reify package looks like this (where e = AstNode):

data Graph e = Graph [(Unique, e Unique)] Unique    

Lets make a new graph data structure where we can store Ast2 Int and Ast2 Bool separately (thus, where the type information has been recovered):

data Graph2 = Graph2 [(Unique, Ast2 Int)] [(Unique, Ast2 Bool)] Unique 
            deriving Show

Now we only need to find a function from Graph AstNode (the result of reifyGraph) to Graph2:

recoverTypes :: Graph AstNode -> Graph2
recoverTypes (Graph xs x) = Graph2 (catMaybes $ map (f toAst2Int) xs) 
                                   (catMaybes $ map (f toAst2Bool) xs) x where
  f g (u,an) = do a2 <- g an
                  return (u,a2)

  toAst2Int (IntLitN a) = Just $ IntLit2 a
  toAst2Int (AddN a b)  = Just $ Add2 a b
  toAst2Int (IfThenElseN t a b c) | t == typeOf (undefined :: Int) 
                        = Just $ IfThenElse2 a b c
  toAst2Int _           = Nothing

  toAst2Bool (BoolLitN a) = Just $ BoolLit2 a
  toAst2Bool (IfThenElseN t a b c) | t == typeOf (undefined :: Bool) 
                          = Just $ IfThenElse2 a b c
  toAst2Bool _            = Nothing

Lets do an example:

expr = Add (IntLit 42) expr  

test = do
  graph <- reifyGraph expr
  print graph
  print $ recoverTypes graph

Prints:

let [(1,AddN 2 1),(2,IntLitN 42)] in 1
Graph2 [(1,Add2 2 1),(2,IntLit2 42)] [] 1

The first line shows us that reifyGraph has correctly recovered sharing. The second line shows us that only Ast2 Int types have been found (which is also correct).

This method is easily adaptable for other specific GADTs, but I don't see how it could be made entirely generic.

The complete code can be found at http://pastebin.com/FwQNMDbs .



标签: haskell dsl gadt