Finite automaton in Haskell

2019-03-09 14:04发布

问题:

What is a good way to represent finite automaton in Haskell? How would the data type of it look like?

In our college, automata were defined as a 5-tuple

(Q, X, delta, q_0, F)

where Q is the set of automaton's states, X is the alphabet (is this part even necessery), delta is the transition function taking 2-tuple from (Q,X) and returning state/-s (in non-deterministic version) and F is the set of accepting/end states.

Most importantly, I'm not sure what type delta should have...

回答1:

There are two basic options:

  • An explicit function delta :: Q -> X -> Q (or [Q] as appropriate) as Sven Hager suggests.
  • A map delta :: Map (Q, X) Q e.g. using Data.Map, or if your states/alphabet can be indexed by ascending numbers Data.Array or Data.Vector.

Note that these two approaches are essentially equivalent, one can convert from the map version to a function version (this is slightly different due to an extra Maybe from the lookup call) relatively easily

delta_func q x = Data.Map.lookup (q,x) delta_map

(Or the appropriately curried version of the look-up function for whatever mapping type you are using.)


If you are constructing the automata at compile time (and so know the possible states and can have them encoded as a data type), then using the function version gives you better type safety, as the compiler can verify that you have covered all cases.

If you are constructing the automata at run time (e.g. from user input), then storing delta as a map (and possibly doing the function conversion as above) and having an appropriate input validation that guarantees correctness so that fromJust is safe (i.e. there is always an entry in the map for any possible (Q,X) tuple and so the look-up never fails (never returns Nothing)).

Non-deterministic automata work well with the map option, because a failed look-up is the same as having no state to go to, i.e. an empty [Q] list, and so there doesn't need to be any special handling of the Maybe beyond a call to join . maybeToList (join is from Data.Monad and maybeToList is from Data.Maybe).


On a different note, the alphabet is most definitely necessary: it is how the automaton receives input.



回答2:

Check out the Control.Arrow.Transformer.Automaton module in the "arrows" package. The type looks like this

newtype Automaton a b c = Automaton (a b (c, Automaton a b c))

This is a bit confusing because its an arrow transformer. In the simplest case you can write

type Auto = Automaton (->)

Which uses functions as the underlying arrow. Substituting (->) for "a" in the Automaton definition and using infix notation you can see this is roughly equivalent to:

newtype Auto b c = Automaton (b -> (c, Auto b c))

In other words an automaton is a function that takes an input and returns a result and a new automaton.

You can use this directly by writing a function for each state that takes an argument and returns the result and the next function. For instance, here is a state machine to recognise the regexp "a+b" (that is, a series of at least one 'a' followed by a 'b'). (Note: untested code)

state1, state2 :: Auto Char Bool
state1 c = if c == 'a' then (False, state2) else (False, state1)
state2 c = case c of
              'a' -> (False, state2)
              'b' -> (True, state1)
              otherwise -> (False, state1)

In terms of your original question, Q = {state1, state2}, X = Char, delta is function application, and F is the state transition returning True (rather than having an "accepting state" I've used an output transition with an accepting value).

Alternatively you can use Arrow notation. Automaton is an instance of all the interesting arrow classes, including Loop and Circuit, so you can get access to previous values by using delay. (Note: again, untested code)

recognise :: Auto Char Bool
recognise = proc c -> do
   prev <- delay 'x' -< c   -- Doesn't matter what 'x' is, as long as its not 'a'.
   returnA -< (prev == 'a' && c == 'b')

The "delay" arrow means that "prev" is equal to the previous value of "c" rather than the current value. You can also get access to your previous output by using "rec". For instance, here is an arrow that gives you a decaying total over time. (Actually tested in this case)

-- | Inputs are accumulated, but decay over time. Input is a (time, value) pair.
-- Output is a pair consisting
-- of the previous output decayed, and the current output.
decay :: (ArrowCircuit a) => NominalDiffTime -> a (UTCTime, Double) (Double, Double)
decay tau = proc (t2,v2) -> do
      rec
         (t1, v1) <- delay (t0, 0) -< (t2, v)
         let 
            dt = fromRational $ toRational $ diffUTCTime t2 t1
            v1a = v1 * exp (negate dt / tau1)
            v = v1a + v2
      returnA -< (v1a, v)
   where
      t0 = UTCTime (ModifiedJulianDay 0) (secondsToDiffTime 0)
      tau1 = fromRational $ toRational tau

Note how the input to "delay" includes "v", a value derived from its output. The "rec" clause enables this, so we can build up a feedback loop.