In Type-Safe Observable Sharing in Haskell Andy Gill shows how to recover sharing that existed on the Haskell level, in a DSL. His solution is implemented in the data-reify package. Can this approach be modified to work with GADTs? For example, given this GADT:
data Ast e where
IntLit :: Int -> Ast Int
Add :: Ast Int -> Ast Int -> Ast Int
BoolLit :: Bool -> Ast Bool
IfThenElse :: Ast Bool -> Ast e -> Ast e -> Ast e
I'd like to recover sharing by transforming the above AST to
type Name = Unique
data Ast2 e where
IntLit2 :: Int -> Ast2 Int
Add2 :: Ast2 Int -> Ast2 Int -> Ast2 Int
BoolLit2 :: Bool -> Ast2 Bool
IfThenElse2 :: Ast2 Bool -> Ast2 e -> Ast2 e -> Ast2 e
Var :: Name -> Ast2 e
by the way of a function
recoverSharing :: Ast -> (Map Name, Ast2 e1, Ast2 e2)
(I'm not sure about the type of recoverSharing
.)
Note that I don't care about introducing new bindings via a let construct, but only in recovering the sharing that existed on the Haskell level. That's why I have recoverSharing
return a Map
.
If it can't be done as reusable package, can it at least be done for specific GADT?
Interesting puzzle! It turns out you can use data-reify with GADTs. What you need is a wrapper that hides the type in an existential. The type can later be retrieved by pattern matching on the Type
data type.
data Type a where
Bool :: Type Bool
Int :: Type Int
data WrappedAst s where
Wrap :: Type e -> Ast2 e s -> WrappedAst s
instance MuRef (Ast e) where
type DeRef (Ast e) = WrappedAst
mapDeRef f e = Wrap (getType e) <$> mapDeRef' f e
where
mapDeRef' :: Applicative f => (forall b. (MuRef b, WrappedAst ~ DeRef b) => b -> f u) -> Ast e -> f (Ast2 e u)
mapDeRef' f (IntLit i) = pure $ IntLit2 i
mapDeRef' f (Add a b) = Add2 <$> (Var Int <$> f a) <*> (Var Int <$> f b)
mapDeRef' f (BoolLit b) = pure $ BoolLit2 b
mapDeRef' f (IfThenElse b t e) = IfThenElse2 <$> (Var Bool <$> f b) <*> (Var (getType t) <$> f t) <*> (Var (getType e) <$> f e)
getVar :: Map Name (WrappedAst Name) -> Type e -> Name -> Maybe (Ast2 e Name)
getVar m t n = case m ! n of Wrap t' e -> (\Refl -> e) <$> typeEq t t'
Here's the whole code: https://gist.github.com/3590197
Edit: I like the use of Typeable
in the other answer. So I did a version of my code with Typeable
too: https://gist.github.com/3593585. The code is significantly shorter. Type e ->
is replaced by Typeable e =>
, which also has a downside: we no longer know that the possible types are limited to Int
and Bool
, which means there has to be a Typeable e
constraint in IfThenElse
.
I will try show that this can be done for specific GADTs, using your GADT as an example.
I will use the Data.Reify package. This requires me to define a new data structure in which the recusive positions are replaced by a parameter.
data AstNode s where
IntLitN :: Int -> AstNode s
AddN :: s -> s -> AstNode s
BoolLitN :: Bool -> AstNode s
IfThenElseN :: TypeRep -> s -> s -> s -> AstNode s
Note that I remove a lot of type information that was available in the original GADT. For the first three constructors it is clear what the associated type was (Int, Int and Bool). For the last one I will remember the type using TypeRep (available in Data.Typeable). The instance for MuRef, required by the reify package, is shown below.
instance Typeable e => MuRef (Ast e) where
type DeRef (Ast e) = AstNode
mapDeRef f (IntLit a) = pure $ IntLitN a
mapDeRef f (Add a b) = AddN <$> f a <*> f b
mapDeRef f (BoolLit a) = pure $ BoolLitN a
mapDeRef f (IfThenElse a b c :: Ast e) =
IfThenElseN (typeOf (undefined::e)) <$> f a <*> f b <*> f c
Now we can use reifyGraph to recover sharing. However, a lot of type information was lost. Lets try to recover it. I altered your definition of Ast2 slightly:
data Ast2 e where
IntLit2 :: Int -> Ast2 Int
Add2 :: Unique -> Unique -> Ast2 Int
BoolLit2 :: Bool -> Ast2 Bool
IfThenElse2 :: Unique -> Unique -> Unique -> Ast2 e
The graph from the reify package looks like this (where e = AstNode):
data Graph e = Graph [(Unique, e Unique)] Unique
Lets make a new graph data structure where we can store Ast2 Int and Ast2 Bool separately (thus, where the type information has been recovered):
data Graph2 = Graph2 [(Unique, Ast2 Int)] [(Unique, Ast2 Bool)] Unique
deriving Show
Now we only need to find a function from Graph AstNode (the result of reifyGraph) to Graph2:
recoverTypes :: Graph AstNode -> Graph2
recoverTypes (Graph xs x) = Graph2 (catMaybes $ map (f toAst2Int) xs)
(catMaybes $ map (f toAst2Bool) xs) x where
f g (u,an) = do a2 <- g an
return (u,a2)
toAst2Int (IntLitN a) = Just $ IntLit2 a
toAst2Int (AddN a b) = Just $ Add2 a b
toAst2Int (IfThenElseN t a b c) | t == typeOf (undefined :: Int)
= Just $ IfThenElse2 a b c
toAst2Int _ = Nothing
toAst2Bool (BoolLitN a) = Just $ BoolLit2 a
toAst2Bool (IfThenElseN t a b c) | t == typeOf (undefined :: Bool)
= Just $ IfThenElse2 a b c
toAst2Bool _ = Nothing
Lets do an example:
expr = Add (IntLit 42) expr
test = do
graph <- reifyGraph expr
print graph
print $ recoverTypes graph
Prints:
let [(1,AddN 2 1),(2,IntLitN 42)] in 1
Graph2 [(1,Add2 2 1),(2,IntLit2 42)] [] 1
The first line shows us that reifyGraph has correctly recovered sharing. The second line shows us that only Ast2 Int types have been found (which is also correct).
This method is easily adaptable for other specific GADTs, but I don't see how it could be made entirely generic.
The complete code can be found at http://pastebin.com/FwQNMDbs .