I am currently working on a simple interpreter for a programming language and I have a data type like this:
data Expr
= Variable String
| Number Int
| Add [Expr]
| Sub Expr Expr
And I have many functions that do simple things like:
-- Substitute a value for a variable
substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue = go
where
go (Variable x)
| x == name = Number newValue
go (Add xs) =
Add $ map go xs
go (Sub x y) =
Sub (go x) (go y)
go other = other
-- Replace subtraction with a constant with addition by a negative number
replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd = go
where
go (Sub x (Number y)) =
Add [go x, Number (-y)]
go (Add xs) =
Add $ map go xs
go (Sub x y) =
Sub (go x) (go y)
go other = other
But in each of these functions, I have to repeat the part that calls the code recursively with just a small change to one part of the function. Is there any existing way to do this more generically? I would rather not have to copy and paste this part:
go (Add xs) =
Add $ map go xs
go (Sub x y) =
Sub (go x) (go y)
go other = other
And just change a single case each time because it seems inefficient to duplicate code like this.
The only solution I could come up with is to have a function that calls a function first on the whole data structure and then recursively on the result like this:
recurseAfter :: (Expr -> Expr) -> Expr -> Expr
recurseAfter f x =
case f x of
Add xs ->
Add $ map (recurseAfter f) xs
Sub x y ->
Sub (recurseAfter f x) (recurseAfter f y)
other -> other
substituteName :: String -> Int -> Expr -> Expr
substituteName name newValue =
recurseAfter $ \case
Variable x
| x == name -> Number newValue
other -> other
replaceSubWithAdd :: Expr -> Expr
replaceSubWithAdd =
recurseAfter $ \case
Sub x (Number y) ->
Add [x, Number (-y)]
other -> other
But I feel like there should probably be a simpler way to do this already. Am I missing something?
As an alternative approach, this is also a typical use case for the
uniplate
package. It can useData.Data
generics rather than Template Haskell to generate the boilerplate, so if you deriveData
instances for yourExpr
:then the
transform
function fromData.Generics.Uniplate.Data
applies a function recursively to each nestedExpr
:Note that in
replaceSubWithAdd
in particular, the functionf
is written to perform a non-recursive substitution;transform
makes it recursive inx :: Expr
, so it's doing the same magic to the helper function asana
does in @chi's answer:This is no shorter than @chi's Template Haskell solution. One potential advantage is that
uniplate
provides some additional functions that may be helpful. For example, if you usedescend
in place oftransform
, it transforms only the immediate children which can give you control over where the recursion happens, or you can userewrite
to re-transform the result of transformations until you reach a fixed point. One potential disadvantage is that "anamorphism" sounds way cooler than "uniplate".Full program:
Congratulations, you just rediscovered anamorphisms!
Here's your code, rephrased so that it works with the
recursion-schemes
package. Alas, it's not shorter, since we need some boilerplate to make the machinery work. (There might be some automagic way to avoid the boilerplate, e.g. using generics. I simply do not know.)Below, your
recurseAfter
is replaced with the standardana
.We first define your recursive type, as well as the functor it is the fixed point of.
Then we connect the two with a few instances so that we can unfold
Expr
into the isomorphicExprF Expr
, and fold it back.Finally, we adapt your original code, and add a couple of tests.
An alternative could be to define
ExprF a
only, and then derivetype Expr = Fix ExprF
. This saves some of the boilerplate above (e.g. the two instances), at the cost of having to useFix (VariableF ...)
instead ofVariable ...
, as well as the analogous for the other constructors.One could further alleviate that using pattern synonyms (at the cost of a little more boilerplate, though).
Update: I finally found the automagic tool, using template Haskell. This makes the whole code reasonably short. Note that the
ExprF
functor and the two instances above still exist under the hood, and we still have to use them. We only save the hassle of having to define them manually, but that alone saves a lot of effort.