Maintaining complex state in Haskell

2019-01-29 22:11发布

问题:

Suppose you're building a fairly large simulation in Haskell. There are many different types of entities whose attributes update as the simulation progresses. Let's say, for the sake of example, that your entities are called Monkeys, Elephants, Bears, etc..

What is your preferred method for maintaining these entities' states?

The first and most obvious approach I thought of was this:

mainLoop :: [Monkey] -> [Elephant] -> [Bear] -> String
mainLoop monkeys elephants bears =
  let monkeys'   = updateMonkeys   monkeys
      elephants' = updateElephants elephants
      bears'     = updateBears     bears
  in
    if shouldExit monkeys elephants bears then "Done" else
      mainLoop monkeys' elephants' bears'

It's already ugly having each type of entity explicitly mentioned in the mainLoop function signature. You can imagine how it would get absolutely awful if you had, say, 20 types of entities. (20 is not unreasonable for complex simulations.) So I think this is an unacceptable approach. But its saving grace is that functions like updateMonkeys are very explicit in what they do: They take a list of Monkeys and return a new one.

So then the next thought would be to roll everything into one big data structure that holds all state, thus cleaning up the signature of mainLoop:

mainLoop :: GameState -> String
mainLoop gs0 =
  let gs1 = updateMonkeys   gs0
      gs2 = updateElephants gs1
      gs3 = updateBears     gs2
  in
    if shouldExit gs0 then "Done" else
      mainLoop gs3

Some would suggest that we wrap GameState up in a State Monad and call updateMonkeys etc. in a do. That's fine. Some would rather suggest we clean it up with function composition. Also fine, I think. (BTW, I'm a novice with Haskell, so maybe I'm wrong about some of this.)

But then the problem is, functions like updateMonkeys don't give you useful information from their type signature. You can't really be sure what they do. Sure, updateMonkeys is a descriptive name, but that's little consolation. When I pass in a god object and say "please update my global state," I feel like we're back in the imperative world. It feels like global variables by another name: You have a function that does something to the global state, you call it, and you hope for the best. (I suppose you still avoid some concurrency problems that would be present with global variables in an imperative program. But meh, concurrency isn't nearly the only thing wrong with global variables.)

A further problem is this: Suppose the objects need to interact. For example, we have a function like this:

stomp :: Elephant -> Monkey -> (Elephant, Monkey)
stomp elephant monkey =
  (elongateEvilGrin elephant, decrementHealth monkey)

Say this gets called in updateElephants, because that's where we check to see if any of the elephants are in stomping range of any monkeys. How do you elegantly propagate the changes to both the monkeys and elephants in this scenario? In our second example, updateElephants takes and returns a god object, so it could effect both changes. But this just muddies the waters further and reinforces my point: With the god object, you're effectively just mutating global variables. And if you're not using the god object, I'm not sure how you'd propagate those types of changes.

What to do? Surely many programs need to manage complex state, so I'm guessing there are some well-known approaches to this problem.

Just for the sake of comparison, here's how I might solve the problem in the OOP world. There would be Monkey, Elephant, etc. objects. I'd probably have class methods to do lookups in the set of all live animals. Maybe you could lookup by location, by ID, whatever. Thanks to the data structures underlying the lookup functions, they'd stay allocated on the heap. (I'm assuming GC or reference counting.) Their member variables would get mutated all the time. Any method of any class would be able to mutate any live animal of any other class. E.g. an Elephant could have a stomp method that would decrement the health of a passed-in Monkey object, and there would be no need to pass that

Likewise, in an Erlang or other actor-oriented design, you could solve these problems fairly elegantly: Each actor maintains its own loop and thus its own state, so you never need a god object. And message passing allows one object's activities to trigger changes in other objects without passing a bunch of stuff all the way back up the call stack. Yet I have heard it said that actors in Haskell are frowned upon.

回答1:

The answer is functional reactive programming (FRP). It it a hybrid of two coding styles: component state management and time-dependent values. Since FRP is actually a whole family of design patterns, I want to be more specific: I recommend Netwire.

The underlying idea is very simple: You write many small, self-contained components each with their own local state. This is practically equivalent to time-dependent values, because each time you query such a component you may get a different answer and cause a local state update. Then you combine those components to form your actual program.

While this sounds complicated and inefficient it's actually just a very thin layer around regular functions. The design pattern implemented by Netwire is inspired by AFRP (Arrowized Functional Reactive Programming). It's probably different enough to deserve its own name (WFRP?). You may want to read the tutorial.

In any case a small demo follows. Your building blocks are wires:

myWire :: WireP A B

Think of this as a component. It is a time-varying value of type B that depends on a time-varying value of type A, for example a particle in a simulator:

particle :: WireP [Particle] Particle

It depends on a list of particles (for example all currently existing particles) and is itself a particle. Let's use a predefined wire (with a simplified type):

time :: WireP a Time

This is a time-varying value of type Time (= Double). Well, it's time itself (starting at 0 counted from whenever the wire network was started). Since it doesn't depend on another time-varying value you can feed it whatever you want, hence the polymorphic input type. There are also constant wires (time-varying values that don't change over time):

pure 15 :: Wire a Integer

-- or even:
15 :: Wire a Integer

To connect two wires you simply use categorical composition:

integral_ 3 . 15

This gives you a clock at 15x real time speed (the integral of 15 over time) starting at 3 (the integration constant). Thanks to various class instances wires are very handy to combine. You can use your regular operators as well as applicative style or arrow style. Want a clock that starts at 10 and is twice the real time speed?

10 + 2*time

Want a particle that starts and (0, 0) with (0, 0) velocity and accelerates with (2, 1) per second per second?

integral_ (0, 0) . integral_ (0, 0) . pure (2, 1)

Want to display statistics while the user presses the spacebar?

stats . keyDown Spacebar <|> "stats currently disabled"

This is just a small fraction of what Netwire can do for you.



回答2:

I know this is old topic. But I am facing the same problem right now while trying to implement Rail Fence cipher exercise from exercism.io. It is quite disappointing to see such a common problem having such poor attention in Haskell. I don't take it that to do some as simple as maintaining state I need to learn FRP. So, I continued googling and found solution looking more straightforward - State monad: https://en.wikibooks.org/wiki/Haskell/Understanding_monads/State