I am currently writting a genetic algorithm in Haskell in which my chromosomes are rather complex structures representing executable systems.
In order for me to evaluate the fitness of my chromosomes I have to run an evolution
function which performs one computational cycle of a given system. The fitness then is calculated just by counting how many times the evolution
can be applied before there is no change in the system (in which case the system terminates).
The problem now is as follows: some systems can run infinitely long and will never terminate - I want to penalise those (by giving them little score). I could simply put a certain limit on number of steps but it does not solve another problem.
Some of my systems perform exponential computation (i.e. even for small values of evloution steps they grow to giant size) and they cause ERROR - Control stack overflow
. For human observer it is clear that they will never terminate but the algorithm has no way of knowing so it runs and crushes.
My question is: is it possible to recover from such an error? I would like my algorithm to continue running after encountering this problem and just adjusting the chromosome score accordingly.
It seems to me like the best solution would be to tell the program: "Hey, try doing this, but if you fail don't worry. I know how to handle it". However I am not even sure if that's possible. If not - are there any alternatives?
This will be hard to do reliably from inside Haskell -- though under some conditions GHC will raise exceptions for these conditions. (You will need GHC 7).
import Control.Exception
If you really just want to catch stack overflows, this is possible, as this example shows:
> handle (\StackOverflow -> return Nothing) $
return . Just $! foldr (+) 0 (replicate (2^25) 1)
Nothing
Or catching any async exception (including heap exhaustion):
> handle (\(e :: AsyncException) -> print e >> return Nothing) $
return . Just $! foldr (+) 0 (replicate (2^25) 1)
stack overflow
Nothing
However, this is fragile.
Alternately, with GHC flags you can enforce maximum stack (or heap) size on a GHC-compiled process, causing it to be killed if it exceeds those limits (GHC appears to have no maximum stack limit these days).
If you compile your Haskell program with GHC (as is recommended), running it as:
$ ghc -O --make A.hs -rtsopts
the low heap limit below is enforced:
$ ./A +RTS -M1M -K1M
Heap exhausted;
This requires GHC. (Again, you shouldn't be using Hugs for this kind of work). Finally, you should ensure your programs don't use excessive stack in the first place, via profiling in GHC.
I think a general solution here is to provide a way to measure computation time, and kill it if it takes too much time. You can simply add counter to your evaluation function if it's recursive and if it drops to zero you return an error value - for example Nothing
, otherwise it's Just result
.
This approach can be implemented in other ways than explicit count parameter, for example by putting this counter into monad used by evaluation (if your code is monadic) or, impurely, by running computation in separate thread that will be killed on timeout.
I would rather use any pure solution, since it would be more reliable.
It seems to me like the best solution would be to tell the program:
"Hey, try doing this, but if you fail don't worry. I know how to handle it"
In most languages that would be a try/catch
block. I'm not sure what the equivalent is in haskell, or even if some equivalent exists. Furthermore, I doubt that a try/catch
construct could effectively trap/handle a stack overflow condition.
But would it be possible to apply some reasonable constraints to prevent the overflow from occurring? For instance, perhaps you could set some upper-bound on the size of a system, and monitor how each system approaches the boundary from one iteration to the next. Then you could enforce a rule like "if on a single evolution
a system either exceeded its upper-bound or consumed more than 50% of the space remaining between its previous allocation and its upper-bound, that system is terminated and suffers a score penalty".
A thought on your genetic algorithm: Part of the fitness of your chromosomes is that they do not consume too many computational resources. The question you asked defines "too many resources" as crashing the runtime system. This is a rather arbitrary and somewhat random measure.
Knowing that it will add to the complexity of your evolve
function, I still would suggest that this function be made aware of the computational resources that a chromosome consumes. This allows you to fine tune when it has "eaten" too much and dies prematurely of "starvation". It might also allow you to adjust your penalty based on how rapidly the chromosome went exponential with the idea that a chromosome that is just barely exponential is more fit then one with an extremely high branching factor.