There is GenProg (http://hackage.haskell.org/package/genprog) for example, but that only deals with numerical optimization, in this case finding an equation that describes the data.
But I require for loops, if statements, when statements, Boolean checks etc. I need to be able to generate imperative structures. Any thought on this? My best options so far seem to be husk-scheme where I can run Scheme code as a DSL in Haskell. Surely there must be better ways?
If you're looking for something akin to S-expressions, this is fairly easily modeled in Haskell. Say, for example, we want to represent simple algebraic equations with variables, such as
x + 5 / (y * 2 - z)
This can be represented by a simple AST in Haskell, in particular we can implement it as
data Expr
= Lit Double -- Literal numbers
| Var Char -- Variables have single letter names
| Add Expr Expr -- We can add things together
| Sub Expr Expr -- And subtract them
| Mul Expr Expr -- Why not multiply, too?
| Div Expr Expr -- And divide
deriving (Eq)
This would let us express the equation above as
Add (Var 'x') (Div (Lit 5) (Sub (Mul (Var 'y') (Lit 2)) (Var 'z')))
But this is rather clunky to write and difficult to read. Let's start by working some Show
magic so that it gets pretty printed:
instance Show Expr where
showsPrec n (Lit x) = showParen (n > 10) $ showsPrec 11 x
showsPrec n (Var x) = showParen (n > 10) $ showChar x
showsPrec n (Add x y) = showParen (n > 6) $ showsPrec 7 x . showString " + " . showsPrec 7 y
showsPrec n (Sub x y) = showParen (n > 6) $ showsPrec 7 x . showString " - " . showsPrec 7 y
showsPrec n (Mul x y) = showParen (n > 7) $ showsPrec 8 x . showString " * " . showsPrec 8 y
showsPrec n (Div x y) = showParen (n > 7) $ showsPrec 8 x . showString " / " . showsPrec 8 y
If you don't understand everything going on here, that's ok, it's a lot of complication made easy by some built in functions for efficiently building strings with parentheses in them properly and all that fun stuff. It's pretty much copied out of the docs in Text.Show
. Now if we print out our expression from above, it'll look like
x + 5.0 / (y * 2.0 - z)
Now for simplifying building these expressions. Since they're pretty much numeric, we can implement Num
(except for abs
and signum
) and Fractional
:
instance Num Expr where
fromInteger = Lit . fromInteger
(+) = Add
(-) = Sub
(*) = Mul
abs = undefined
signum = undefined
instance Fractional Expr where
(/) = Div
fromRational = Lit . fromRational
Now we can input out expression from above as
Var 'x' + 5 / (Var 'y' * 2 - Var 'z')
This is at least much easier to visually parse, even if we have to specify the variables manually.
Now that we have pretty input and output, let's focus on evaluating these expressions. Since we have variables in here, we'll need some sort of environment that associates a variable with a value
import Data.Map (Map)
import qualified Data.Map as M
type Env = Map Char Double
And now it's just basic pattern matching (along with a helper function)
import Control.Applicative
binOp :: (Double -> Double -> Double) -> Expr -> Expr -> Env -> Maybe Double
binOp op x y vars = op <$> evalExpr x vars <*> evalExpr y vars
evalExpr :: Expr -> Env -> Maybe Double
evalExpr (Lit x) = const $ Just x
evalExpr (Var x) = M.lookup x
evalExpr (Add x y) = binOp (+) x y
evalExpr (Sub x y) = binOp (-) x y
evalExpr (Mul x y) = binOp (*) x y
evalExpr (Div x y) = binOp (/) x y
Now we can use evalExpr
to evaluate an expression in our mini-language with variable substitution. All the error handling if there's an undefined variable is done by the Maybe
monad, and we were even able to cut down on repetition by making the environment argument implicit. This is all pretty standard for a simple expression DSL.
So for the fun bit, generating random expressions and (eventually) mutations. For this, we'll need System.Random
. We want to be able to generate expressions of varying complexity, so we'll express it in rough depth. Since it's random, there is a chance that we'll get shorter or deeper trees than specified. This will probably be something that you'll want to tweak and tune to get your desired results. First, because I have the foresight of having written this code already, let's define two helpers for generating a random literal and a random variable:
randomLit, randomVar :: IO Expr
randomLit = Lit <$> randomRIO (-100, 100)
randomVar = Var <$> randomRIO ('x', 'z')
Nothing exciting here, we get doubles between -100 and 100, and up to 3 variables. Now we can generate large expression trees.
generateExpr :: Int -> IO Expr
-- When the depth is 1, return a literal or a variable randomly
generateExpr 1 = do
isLit <- randomIO
if isLit
then randomLit
else randomVar
-- Otherwise, generate a tree using helper
generateExpr n = randomRIO (0, 100) >>= helper
where
helper :: Int -> IO Expr
helper prob
-- 20% chance that it's a literal
| prob < 20 = randomLit
-- 10% chance that it's a variable
| prob < 30 = randomVar
-- 15% chance of Add
| prob < 45 = (+) <$> generateExpr (n - 1) <*> generateExpr (n - 1)
-- 15% chance of Sub
| prob < 60 = (-) <$> generateExpr (n - 1) <*> generateExpr (n - 1)
-- 15% chance of Mul
| prob < 75 = (*) <$> generateExpr (n - 1) <*> generateExpr (n - 1)
-- 15% chance of Div
| prob < 90 = (/) <$> generateExpr (n - 1) <*> generateExpr (n - 1)
-- 10% chance that we generate a possibly taller tree
| otherwise = generateExpr (n + 1)
The bulk of this function is just specifying the probabilities that a given node will be generated, and then generating the left and right nodes for each operator. We even got to use the normal arithmetic operators since we overloaded Num
, how handy!
Now, remember that we can still pattern match on the constructors of this Expr
type for other operations, such as replacing nodes, swapping them, or measuring the depth. For this, you just have to treat it as any other binary tree type in Haskell, except it has 2 leaf constructors and 4 node constructors. As for mutations, you'll have to write code that traverses this structure and chooses when to swap out nodes and what to swap them out with. It'll live within the IO
monad since you'll be generating random values, but it shouldn't be too difficult. This structure should be pretty easy to extend as need be, such as if you wanted to add trig functions and exponentiation, you'd just need more constructors, more expressions in evalExpr
, and the appropriate clauses in helper
, along with some probability adjustments.
You can get the full code for this example here. Hope this helps you see how to formulate something like S-expressions in Haskell.