This post is literate Haskell. Just put it in a file like "pad.lhs" and ghci
will be able to run it.

> {-# LANGUAGE GADTs, Rank2Types #-}
> import Control.Monad
> import Control.Monad.ST
> import Data.STRef


Okay, so I was able to figure out how to represent the `ST` monad in pure code. First we start with our reference type. Its specific value is not really important. The most important thing is that `PT s a` should not be isomorphic to any other type `forall s`. (In particular, it should be isomorphic to neither `()` nor `Void`.)

> newtype PTRef s a = Ref {unref :: s a} -- This is defined like this to make `toST` work. It may be given a different definition.

The kind for `s` is `* -> *`, but that is not really important right now. It could be polykinded, for all we care.

> data PT s a where
>     MkRef   :: a -> PT s (PTRef s a)
>     GetRef  :: PTRef s a -> PT s a
>     PutRef  :: a -> PTRef s a -> PT s ()
>     AndThen :: PT s a -> (a -> PT s b) -> PT s b


Pretty straightforward. `AndThen` allows us to use this as a `Monad`. You may be wondering how `return` is implemented. Here is its monad instance (it only respects the monad laws with respect to `runPF`, to be defined later):

> instance Monad (PT s) where
>     (>>=)    = AndThen
>     return a = AndThen (MkRef a) GetRef -- Sorry. I like minimalism.
> instance Functor (PT s) where
>     fmap = liftM
> instance Applicative (PT s) where
>     pure  = return
>     (<*>) = ap


Now we can define `fib` as a test case.

> fib :: Int -> PT s Integer
> fib n = do
>     rold <- MkRef 0
>     rnew <- MkRef 1
>     replicateM_ n $ do
>         old <- GetRef rold
>         new <- GetRef rnew
>         PutRef new rold
>         PutRef (old+new) rnew
>     GetRef rold


And it type checks. Hurray! Now, I was able to convert this to `ST` (we now see why `s` had to be `* -> *`):

> toST :: PT (STRef s) a -> ST s a
> toST (MkRef a ) = fmap Ref $ newSTRef a
> toST (GetRef (Ref r)) = readSTRef r
> toST (PutRef a (Ref r)) = writeSTRef r a
> toST (pa `AndThen` apb) = (toST pa) >>= (toST . apb)

Now we can define a function to run `PT` without referencing `ST` at all:

> runPF :: (forall s. PT s a) -> a
> runPF p = runST $ toST p

`runPF $ fib 7` gives `13`, which is correct.

My question is: can we define `runPF` without using `ST` at all?

Is there a pure way to define `runPF`? `PTRef`'s definition is completely unimportant; it's only a placeholder type anyway. It can be redefined to whatever makes it work.

If you cannot define `runPF` purely, give a proof that it cannot be done.

Performance is not a concern (if it were, I would not have made every `return` have its own ref).

I'm thinking that existential types may be useful.

Note: It's trivial if we assume `a` is dynamicable or something. I'm looking for an answer that works with all `a`.

Note: In fact, an answer does not even necessarily have much to do with `PT`. It just needs to be as powerful as `ST` without using magic. (Conversion from `(forall s. PT s)` is sort of a test of whether an answer is valid or not.)

tl;dr: It's not possible without adjustments to the definition of `PT`. Here's the core problem: you'll be running your stateful computation in the context of some sort of storage medium, but said storage medium has to know how to store arbitrary types. This isn't possible without packaging up some sort of evidence into the `MkRef` constructor - either an existentially wrapped `Typeable` dictionary as others have suggested, or a proof that the value belongs to one of a known finite set of types.

For a first attempt, let's try using a list as the storage medium and integers to refer to elements of the list.

When storing a new item in the environment, we make sure to add it to the end of the list, so that `Ref`s we've previously given out stay pointing at the correct element.

This ain't right. I can make a reference to any type `a`, but the type of `interp` says that the storage medium is a homogeneous list of `b`s. GHC has us bang to rights when it rejects this type signature, complaining that it can't match `b` with the type of the thing inside `MkRef`.

Undeterred, let us have a go at using a heterogeneous list as the environment for the `State` monad in which we'll interpret `PT`.

This is one of my personal favourite Haskell data types. It's an extensible tuple indexed by a list of the types of the things inside it. Tuples are heterogeneous linked lists with type-level information about the types of the things inside them. (It's often called `HList` following Kiselyov's paper but I prefer `Tuple`.) When you add something to the front of a tuple, you add its type to the front of the list of types. In a poetic mood, I once put it this way: "The tuple and its type grow together, like a vine creeping up a bamboo plant."

Examples of `Tuple`s:

What do references to values inside tuples look like? We have to prove to GHC that the type of the thing we're getting out of the tuple is indeed the type we expect.

The definition of `Elem` is structurally that of the natural numbers (`Elem` values like `There (There Here)` look similar to natural numbers like `S (S Z)`) but with extra types - in this case, proving that the type `a` is in the type-level list `as`. I mention this because it's suggestive: `Nat`s make good list indices, and likewise `Elem` is useful for indexing into a tuple. In this respect it'll be useful as a replacement for the `Int` inside our reference type.

We need a couple of functions to work with tuples and indices.

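As a rough sketch of the pieces described so far, here is one way the `Tuple` and `Elem` types and their helper functions could be written. The constructor names `E` and `:>`, the example values `t1` and `t2`, and the helper names `get` and `put` are assumptions made for illustration; only `Tuple`, `Elem`, `Here` and `There` come from the text.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}
import Data.Kind (Type)

-- A heterogeneous list indexed by the list of its element types.
-- (Constructor names E and :> are assumed here.)
data Tuple (as :: [Type]) where
  E    :: Tuple '[]
  (:>) :: a -> Tuple as -> Tuple (a ': as)
infixr 5 :>

-- Example tuples: the type grows together with the value.
t1 :: Tuple '[Bool]
t1 = True :> E

t2 :: Tuple '[Int, String, Bool]
t2 = 42 :> "hello" :> t1

-- A proof that `a` occurs in `as`, which doubles as an index into a Tuple.
data Elem (as :: [Type]) a where
  Here  :: Elem (a ': as) a
  There :: Elem as a -> Elem (b ': as) a

-- Look up the element an index points at (hypothetical helper).
get :: Elem as a -> Tuple as -> a
get Here      (x :> _)  = x
get (There e) (_ :> xs) = get e xs

-- Overwrite the element an index points at (hypothetical helper).
put :: Elem as a -> a -> Tuple as -> Tuple as
put Here      y (_ :> xs) = y :> xs
put (There e) y (x :> xs) = x :> put e y xs
```

Because an `Elem as a` can only be built when `a` really is in `as`, neither `get` nor `put` needs a failure case: an index such as `There Here` simply does not typecheck against a one-element tuple.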
Let's try and write an interpreter for a `PT` in a `Tuple` environment.

No can do, buster. The problem is that the type of the `Tuple` in the environment changes when we obtain a new reference. As I mentioned before, adding something to a tuple adds its type to the tuple's type, a fact belied by the type `State (Tuple as) a`. GHC's not fooled by this attempted subterfuge: `Could not deduce (as ~ (as :++: '[a1]))`.

This is where the wheels come off, as far as I can tell. What you really want to do is keep the size of the tuple constant throughout a `PT` computation. This would require you to index `PT` itself by the list of types to which you can obtain references, proving every time you do so that you're allowed to (by giving an `Elem` value). The environment would then look like a tuple of lists, and a reference would consist of an `Elem` (to select the right list) and an `Int` (to find the particular item in the list).

This plan breaks the rules, of course (you need to change the definition of `PT`), but it also has engineering problems. When I call `MkRef`, the onus is on me to give an `Elem` for the value I'm making a reference to, which is pretty tedious. (That said, you can usually convince GHC to find `Elem` values by proof search using a hacky type class.)

Another thing: composing `PT`s becomes difficult. All the parts of your computation have to be indexed by the same list of types. You could attempt to introduce combinators or classes which allow you to grow the environment of a `PT`, but you'd also have to update all the references when you do that. Using the monad would be quite difficult.

A possibly-cleaner implementation would allow the list of types in a `PT` to vary as you walk around the datatype: every time you encounter a `MkRef` the type gets one longer. Because the type of the computation changes as it progresses, you can't use a regular monad - you have to resort to `IxMonad`. If you want to know what that program looks like, see my other answer.

Ultimately, the sticking point is that the type of the tuple is determined by the value of the `PT` request. The environment is what a given request decides to store in it. `interp` doesn't get to choose what's in the tuple, it must come from an index on `PT`. Any attempt to cheat that requirement is going to crash and burn. Now, in a true dependently-typed system we could examine the `PT` value we were given and figure out what `as` should be. Alas, Haskell is not a dependently-typed system.

Since I posted my earlier answer, you've indicated that you don't mind making changes to your definition of `PT`. I am happy to report: relaxing that restriction changes the answer to your question from no to yes! I've already argued that you need to index your monad by the set of types in your storage medium, so here's some working code showing how to do that. (I originally had this as an edit to my previous answer but it got too long, so here we are.)

We're going to need a smarter `Monad` class than the one in the Prelude: that of indexed monad-like things describing paths through a directed graph. For reasons that should become apparent, I'm going to define indexed functors as well.

An indexed monad uses the type system to track the progress of a stateful computation.
`m i j a` is a monadic computation which requires an input state of `i`, changes the state to `j`, and produces a value of type `a`. Sequencing indexed monads with `>>>=` is like playing dominoes. You can feed a computation which takes the state from `i` to `j` into a computation which goes from `j` to `k`, and get a bigger computation from `i` to `k`. (There's a richer version of this indexed monad described in Kleisli Arrows of Outrageous Fortune (and elsewhere) but this one is quite enough for our purposes.)

One possibility with `MonadIx` is a `File` monad which tracks the state of a file handle, ensuring you don't forget to free resources. `fOpen :: File Closed Open ()` starts with a closed file and opens it, `fRead :: File Open Open String` returns the contents of an opened file, and `fClose :: File Open Closed ()` takes a file from open to closed. The `run` operation takes a computation of type `File Closed Closed a`, which ensures that your file handles always get cleaned up.

But I digress: here we are concerned not with a file handle but with a set of typed "memory locations"; the types of the things in the virtual machine's memory bank are what we'll use for the monad's indices. I like to get my "program/interpreter" monads for free because it expresses the fact that results live at the leaves of a computation, and promotes composability and code reuse, so here's the functor which will produce `PT` when we plug it into `FreeIx` below:

`PTF` is parameterised by the type of reference `ref :: [*] -> * -> *` - references are allowed to know which types are in the system - and indexed by the list of types being stored in the interpreter's "memory". The interesting case is `MkRef_`: making a new reference adds a value of type `a` to the memory, taking `as` to `a ': as`; the continuation expects a `ref` in the extended environment. The other operations don't change the list of types in the system.

When I create references sequentially (`x <- mkRef 1; y <- mkRef 2`), they'll have different types: the first will be a `ref (a ': as) a` and the second will be a `ref (b ': a ': as) b`. To make the types line up, I need a way to use a reference in a bigger environment than the one it was created in. In general, this operation depends on the type of reference, so I'll put it in a class.

One possible generalisation of this class would wrap up the pattern of repeated applications of `expand`, with a type like `inflate :: ref as a -> ref (bs :++: as) a`.

Here's another reusable bit of infrastructure, the indexed free monad I mentioned earlier.

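To make the idea concrete, here is a small sketch of the indexed functor and monad classes together with an indexed free monad, exercised on the file-handle example mentioned earlier. The method names `imap` and `ireturn`, the `FileState` kind, the `FileF` functor and its constructors, and the pure `run` that takes the file contents as a plain `String` are all assumptions for illustration; the text only fixes the names `MonadIx`, `>>>=`, `FreeIx`, `Pure`, `Free` and the types of `fOpen`, `fRead` and `fClose`.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, PolyKinds #-}
import Data.Kind (Type)

class FunctorIx (f :: k -> k -> Type -> Type) where
  imap :: (a -> b) -> f i j a -> f i j b

class FunctorIx m => MonadIx (m :: k -> k -> Type -> Type) where
  ireturn :: a -> m i i a
  (>>>=)  :: m i j a -> (a -> m j l b) -> m i l b
infixl 1 >>>=

-- Indexed free monad: Pure leaves the index alone, Free chains one
-- functor layer (i to j) onto the rest of the computation (j to l).
data FreeIx (f :: k -> k -> Type -> Type) (i :: k) (j :: k) a where
  Pure :: a -> FreeIx f i i a
  Free :: f i j (FreeIx f j l a) -> FreeIx f i l a

instance FunctorIx f => FunctorIx (FreeIx f) where
  imap g (Pure x) = Pure (g x)
  imap g (Free m) = Free (imap (imap g) m)

instance FunctorIx f => MonadIx (FreeIx f) where
  ireturn = Pure
  Pure x >>>= k = k x
  Free m >>>= k = Free (imap (>>>= k) m)

-- Hypothetical instruction set for the file-handle example.
data FileState = Closed | Open

data FileF (i :: FileState) (j :: FileState) r where
  FOpen  :: r -> FileF 'Closed 'Open r
  FRead  :: (String -> r) -> FileF 'Open 'Open r
  FClose :: r -> FileF 'Open 'Closed r

instance FunctorIx FileF where
  imap f (FOpen k)  = FOpen (f k)
  imap f (FRead k)  = FRead (f . k)
  imap f (FClose k) = FClose (f k)

type File = FreeIx FileF

fOpen :: File 'Closed 'Open ()
fOpen = Free (FOpen (Pure ()))

fRead :: File 'Open 'Open String
fRead = Free (FRead Pure)

fClose :: File 'Open 'Closed ()
fClose = Free (FClose (Pure ()))

-- A pure stand-in for a real file: the contents are passed in directly.
run :: String -> File 'Closed 'Closed a -> a
run contents = go
  where
    go :: File i j b -> b
    go (Pure x)          = x
    go (Free (FOpen k))  = go k
    go (Free (FRead k))  = go (k contents)
    go (Free (FClose k)) = go k

readContents :: File 'Closed 'Closed String
readContents =
  fOpen >>>= \_ -> fRead >>>= \s -> fClose >>>= \_ -> ireturn s
```

With these types, a program that reads before opening, or forgets to close, cannot be given the type `File 'Closed 'Closed a`, so `run` would reject it at compile time.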
`FreeIx` turns an indexed functor into an indexed monad by providing a type-aligned joining operation `Free`, which ties the recursive knot in the functor's parameter, and a do-nothing operation `Pure`.

One disadvantage of free monads is the boilerplate you have to write to make `Free` and `Pure` easier to work with. Here are some single-action `PT`s which form the basis of the monad's API, and some pattern synonyms to hide the `Free` constructors when we unpack `PT` values.

That's everything we need to be able to write `PT` computations. Here's your `fib` example. I'm using `RebindableSyntax` and locally redefining the monad operators (to their indexed equivalents) so I can use `do` notation on my indexed monad.

This version of `fib` looks just like the one you wanted to write in the original question. The only difference (apart from the local bindings of `>>=` and so on) is the call to `expand`. Every time you create a new reference, you have to `expand` all the old ones, which is a bit tedious.

Finally we can finish the job we set out to do and build a `PT`-machine which uses a `Tuple` as the storage medium and `Elem` as the reference type.

To use an `Elem` in a larger tuple than the one you built it for, you just need to make it look further down the list.

Note that this deployment of `Elem` is rather like a de Bruijn index: more-recently-bound variables have smaller indices.

When the interpreter encounters a `MkRef` request, it increases the size of its memory by adding `x` to the front. The type checker will remind you that any `ref`s from before the `MkRef` must be correctly `expand`ed, so existing references don't get out of whack when the tuple changes size. We paid for an interpreter without unsafe casts, but we got referential integrity to boot.

Running from a standing start requires that the `PT` computation expects to begin with an empty memory bank, but we allow it to end in any state.

It typechecks, but does it work?

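One way to check is to assemble the whole machine in a single file. The following is a sketch under several assumptions rather than the exact code this answer refers to: it fixes the index kind to `[Type]`, writes `fib` with explicit `>>>=` instead of `RebindableSyntax`, skips the pattern synonyms, and uses assumed names (`imap`, `ireturn`, `E`, `:>`, `get`, `put`, `interp`, `runPT`, `loop`).

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, RankNTypes, TypeOperators #-}
import Data.Kind (Type)

class FunctorIx (f :: [Type] -> [Type] -> Type -> Type) where
  imap :: (a -> b) -> f i j a -> f i j b

class FunctorIx m => MonadIx (m :: [Type] -> [Type] -> Type -> Type) where
  ireturn :: a -> m i i a
  (>>>=)  :: m i j a -> (a -> m j k b) -> m i k b
infixl 1 >>>=

-- Instruction functor: only MkRef_ changes the list of stored types.
data PTF (ref :: [Type] -> Type -> Type) (i :: [Type]) (j :: [Type]) r where
  MkRef_  :: a -> (ref (a ': as) a -> r) -> PTF ref as (a ': as) r
  GetRef_ :: ref as a -> (a -> r)        -> PTF ref as as r
  PutRef_ :: a -> ref as a -> r          -> PTF ref as as r

instance FunctorIx (PTF ref) where
  imap f (MkRef_ x k)    = MkRef_ x (f . k)
  imap f (GetRef_ r k)   = GetRef_ r (f . k)
  imap f (PutRef_ x r k) = PutRef_ x r (f k)

data FreeIx (f :: [Type] -> [Type] -> Type -> Type) (i :: [Type]) (j :: [Type]) a where
  Pure :: a -> FreeIx f i i a
  Free :: f i j (FreeIx f j k a) -> FreeIx f i k a

instance FunctorIx f => FunctorIx (FreeIx f) where
  imap g (Pure x) = Pure (g x)
  imap g (Free m) = Free (imap (imap g) m)

instance FunctorIx f => MonadIx (FreeIx f) where
  ireturn = Pure
  Pure x >>>= k = k x
  Free m >>>= k = Free (imap (>>>= k) m)

type PT ref = FreeIx (PTF ref)

mkRef :: a -> PT ref as (a ': as) (ref (a ': as) a)
mkRef x = Free (MkRef_ x Pure)
getRef :: ref as a -> PT ref as as a
getRef r = Free (GetRef_ r Pure)
putRef :: a -> ref as a -> PT ref as as ()
putRef x r = Free (PutRef_ x r (Pure ()))

class Expand (ref :: [Type] -> Type -> Type) where
  expand :: ref as a -> ref (b ': as) a

data Tuple (as :: [Type]) where
  E    :: Tuple '[]
  (:>) :: a -> Tuple as -> Tuple (a ': as)
infixr 5 :>

data Elem (as :: [Type]) a where
  Here  :: Elem (a ': as) a
  There :: Elem as a -> Elem (b ': as) a

instance Expand Elem where expand = There

get :: Elem as a -> Tuple as -> a
get Here      (x :> _)  = x
get (There e) (_ :> xs) = get e xs

put :: Elem as a -> a -> Tuple as -> Tuple as
put Here      y (_ :> xs) = y :> xs
put (There e) y (x :> xs) = x :> put e y xs

-- The interpreter: a new reference is always Here, at the front of memory.
interp :: PT Elem as bs a -> Tuple as -> (a, Tuple bs)
interp (Pure x)               env = (x, env)
interp (Free (MkRef_ x k))    env = interp (k Here) (x :> env)
interp (Free (GetRef_ r k))   env = interp (k (get r env)) env
interp (Free (PutRef_ x r k)) env = interp k (put r x env)

runPT :: (forall ref. Expand ref => PT ref '[] bs a) -> a
runPT m = fst (interp m E)

fib :: Expand ref => Int -> PT ref as (Integer ': Integer ': as) Integer
fib n =
  mkRef 0 >>>= \rold' ->
  mkRef 1 >>>= \rnew ->
  let rold = expand rold'   -- rold' predates rnew, so widen it once
      step = getRef rold >>>= \old ->
             getRef rnew >>>= \new ->
             putRef new rold >>>= \_ ->
             putRef (old + new) rnew
      loop k | k <= 0    = ireturn ()
             | otherwise = step >>>= \_ -> loop (k - 1)
  in loop n >>>= \_ -> getRef rold
```

Loaded into ghci, `runPT (fib 7)` evaluates to `13`, matching the `ST`-backed version in the question, and no `unsafeCoerce` or `Typeable` is involved.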
A simple solution is to wrap a `State` monad and present the same API as `ST`. In this case there's no need to store runtime type information, since it can be determined from the type of `STRef`-s, and the usual `ST s` quantification trick lets us prevent users from messing up the container storing the references.

We keep ref-s in an `IntMap` and increment a counter each time we allocate a new ref. Reading and writing just modifies the `IntMap` with some `unsafeCoerce` sprinkled atop.

Now we can do:

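A minimal sketch of that scheme follows. It hand-rolls the state-passing instead of importing `State`, so that it needs only `base` and `containers`; the `Store` type, the field name `unST` and the example `sumST` are assumptions for illustration, and the rest mirrors the `ST` API names.

```haskell
{-# LANGUAGE RankNTypes #-}
import qualified Data.IntMap as M
import GHC.Exts (Any)
import Unsafe.Coerce (unsafeCoerce)

-- Next fresh key plus the untyped cells (hypothetical representation).
data Store = Store !Int (M.IntMap Any)

newtype ST s a = ST { unST :: Store -> (a, Store) }

instance Functor (ST s) where
  fmap f (ST g) = ST $ \st -> let (a, st') = g st in (f a, st')

instance Applicative (ST s) where
  pure a = ST $ \st -> (a, st)
  ST f <*> ST g = ST $ \st ->
    let (h, st1) = f st
        (a, st2) = g st1
    in (h a, st2)

instance Monad (ST s) where
  ST g >>= k = ST $ \st -> let (a, st') = g st in unST (k a) st'

-- A reference is just a key; its phantom `a` remembers the cell's type.
newtype STRef s a = STRef Int

newSTRef :: a -> ST s (STRef s a)
newSTRef a = ST $ \(Store n m) ->
  (STRef n, Store (n + 1) (M.insert n (unsafeCoerce a) m))

readSTRef :: STRef s a -> ST s a
readSTRef (STRef i) = ST $ \st@(Store _ m) -> (unsafeCoerce (m M.! i), st)

writeSTRef :: STRef s a -> a -> ST s ()
writeSTRef (STRef i) a = ST $ \(Store n m) ->
  ((), Store n (M.insert i (unsafeCoerce a) m))

-- The rank-2 type keeps references from escaping their store.
runST :: (forall s. ST s a) -> a
runST m = fst (unST m (Store 0 M.empty))

-- Example use: sum a list through a mutable accumulator.
sumST :: [Int] -> Int
sumST xs = runST go
  where
    go :: ST s Int
    go = do
      ref <- newSTRef 0
      mapM_ (\x -> readSTRef ref >>= \acc -> writeSTRef ref (acc + x)) xs
      readSTRef ref
```

The coercions are only sound as long as the `ST` and `STRef` constructors stay hidden behind the module boundary, since a forged `STRef` with the wrong type parameter would read a cell at the wrong type.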
But the following fails with the usual `ST` type error:

Of course, the above scheme never garbage collects references, instead it frees up everything on each `runST` call. I think a more complex system could implement multiple distinct regions, each tagged by a type parameter, and allocate/free resources in a more fine-grained manner.

Also, the use of `unsafeCoerce` means here that using internals directly is every bit as dangerous as using `GHC.ST` internals and `State#` directly, so we should make sure to present a safe API, and also test our internals thoroughly (or else we may get segfaults in Haskell, a great sin).
directly, so we should make sure to present a safe API, and also test our internals thoroughly (or else we may get segfaults in Haskell, a great sin).