Why wrapping the Data.Binary.Put monad creates a m

Posted 2019-01-26 06:30

Question:

I'm trying to wrap the Data.Binary.Put monad in another monad so that later I can ask it questions like "how many bytes is it going to write?" or "what is the current position in the file?". But even very trivial wrappers like:

data Writer1M a = Writer1M { write :: P.PutM a }
or
data Writer2M a = Writer2M { write :: (a, P.Put) }

create a huge space leak and the program usually crashes (after taking up 4GB of RAM). Here is what I've tried so far:

-- This works well and consumes almost no memory.

type Writer = P.Put

writer :: P.Put -> Writer
writer put = put

writeToFile :: String -> Writer -> IO ()
writeToFile path writer = BL.writeFile path (P.runPut writer)

-- This one causes a memory leak.

data Writer1M a = Writer1M { write :: P.PutM a }

instance Monad Writer1M where
  return a = Writer1M $ return a
  ma >>= f = Writer1M $ (write ma) >>= \a -> write $ f a

type WriterM = Writer1M
type Writer = WriterM ()

writer :: P.Put -> Writer
writer put = Writer1M $ put

writeToFile :: String -> Writer -> IO ()
writeToFile path writer = BL.writeFile path (P.runPut $ write writer)

-- This one will crash as well, with exactly the
-- same memory footprint as Writer1M.

data Writer2M a = Writer2M { write :: (a, P.Put) }

instance Monad Writer2M where
  return a = Writer2M $ (a, return ())
  ma >>= f = Writer2M $ (b, p >> p')
                        where (a,p) = write ma
                              (b,p') = write $ f a

type WriterM = Writer2M
type Writer = WriterM ()

writer :: P.Put -> Writer
writer put = Writer2M $ ((), put)

writeToFile :: String -> Writer -> IO ()
writeToFile path writer = BL.writeFile path (P.runPut $ snd $ write writer)

I'm new to Haskell and this makes no sense to me, but the wrapper monads seem very trivial, so I'm guessing there is something obvious I'm missing.

Thanks for looking.

UPDATE: Here is some sample code that demonstrates the problem: http://hpaste.org/43400/why_wrapping_the_databinaryp

UPDATE2: There is also a second part to this question here.

Answer 1:

After poking around for a bit, I found that the problem seems to be the usage of binary's (>>=) to implement (>>). The following addition to the Writer1M monad implementation solves the problem:

  m >> k = Writer1M $ write m >> write k

Whereas this version still leaks memory:

  m >> k = Writer1M $ write m >>= const (write k)

Looking at binary's source, (>>) seems to discard the result of the first monad explicitly. Not sure how exactly this prevents the leak, though. My best theory is that GHC otherwise holds onto the PairS object, and the "a" reference leaks because it never gets looked at.
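
For reference, here is a minimal sketch of the fix above as a complete module. It assumes a modern GHC, where a Monad instance also needs Functor and Applicative instances (the code in the question predates that requirement). In recent versions of base, (>>) defaults to (*>), so the sketch delegates (*>) straight to binary's own sequencing operator rather than letting it be derived from (<*>) or (>>=). The instance layout and the countBytes helper are illustrative and not from the original post, and whether this removes the leak may depend on the GHC and binary versions in use.

```haskell
import qualified Data.Binary.Put as P
import qualified Data.ByteString.Lazy as BL
import Data.Int (Int64)

-- Same wrapper as in the question.
data Writer1M a = Writer1M { write :: P.PutM a }

instance Functor Writer1M where
  fmap f = Writer1M . fmap f . write

instance Applicative Writer1M where
  pure = Writer1M . pure
  mf <*> ma = Writer1M (write mf <*> write ma)
  -- Sequence with binary's own (*>) instead of going through (>>=).
  m *> k = Writer1M (write m *> write k)

instance Monad Writer1M where
  ma >>= f = Writer1M (write ma >>= write . f)
  -- Spelled out for clarity; recent base uses the same default.
  (>>) = (*>)

type Writer = Writer1M ()

writer :: P.Put -> Writer
writer = Writer1M

writeToFile :: FilePath -> Writer -> IO ()
writeToFile path w = BL.writeFile path (P.runPut (write w))

-- Hypothetical helper: number of bytes a Writer would produce.
countBytes :: Writer -> Int64
countBytes = BL.length . P.runPut . write
```

With these definitions, something like countBytes (mapM_ (writer . P.putWord8) [1, 2, 3]) sequences the writes through the explicit (>>), since mapM_ is built on it.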



Answer 2:

Did you try making the monad more strict? E.g. try making the constructors of your data type strict, or replace them with a newtype.

I don't know what the exact problem is here, but this is the usual source of leaks.
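
As a rough illustration of those two suggestions, the sketch below shows the lazy declaration next to a strict-field version and a newtype version. The type and function names (WriterLazy, WriterStrict, WriterNew, runStrict, runNew) are made up for this example, and it is not established here that either variant alone fixes the leak in the question.

```haskell
import qualified Data.Binary.Put as P
import qualified Data.ByteString.Lazy as BL

-- Shape used in the question: a lazy field, so the constructor
-- may hold an unevaluated thunk.
data WriterLazy a = WriterLazy { writeLazy :: P.PutM a }

-- Strict field: the wrapped action is forced to weak head normal
-- form whenever the constructor itself is evaluated.
data WriterStrict a = WriterStrict { writeStrict :: !(P.PutM a) }

-- newtype: no extra constructor exists at runtime.
newtype WriterNew a = WriterNew { writeNew :: P.PutM a }

-- Hypothetical helpers that run each wrapper to a lazy ByteString.
runStrict :: WriterStrict () -> BL.ByteString
runStrict = P.runPut . writeStrict

runNew :: WriterNew () -> BL.ByteString
runNew = P.runPut . writeNew
```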

PS: And try to remove unnecessary lambdas, for instance:

  ma >>= f = Writer1M $ write ma >>= write . f