Given the program:
import Language.Haskell.Exts.Annotated -- from haskell-src-exts
import System.Mem
import System.IO
import Control.Exception
main :: IO ()
main = do
    evaluate $ length $ show $ fromParseResult $ parseFileContents $ "data C = C {a :: F {- " ++ replicate 400000 'd' ++ " -} }"
    performGC
    performGC
    performGC
Using GHC 7.0.3, when I run:
$ ghc --make Temp.hs -rtsopts && Temp.exe +RTS -G1 -S
Alloc Copied Live GC GC TOT TOT Page Flts
bytes bytes bytes user elap user elap
...
29463264 64 8380480 0.00 0.00 0.64 0.85 0 0 (Gen: 0)
20 56 8380472 0.00 0.00 0.64 0.86 0 0 (Gen: 0)
0 56 8380472 0.00 0.00 0.64 0.87 0 0 (Gen: 0)
42256 780 33452 0.00 0.00 0.64 0.88 0 0 (Gen: 0)
0 0.00 0.00
The performGC call seems to leave 8 MB of memory live, even though it seems like all the memory should be dead. How come?
(Without -G1 I see 10 MB live at the end, which I also can't explain.)
Here's what I see (after inserting a print before the last performGC, to help tag when things happen):
524288 524296 32381000 0.00 0.00 1.15 1.95 0 0 (Gen: 0)
524288 524296 31856824 0.00 0.00 1.16 1.96 0 0 (Gen: 0)
368248 808 1032992 0.00 0.02 1.16 1.99 0 0 (Gen: 1)
0 808 1032992 0.00 0.00 1.16 1.99 0 0 (Gen: 1)
"performed!"
39464 2200 1058952 0.00 0.00 1.16 1.99 0 0 (Gen: 1)
22264 1560 1075992 0.00 0.00 1.16 2.00 0 0 (Gen: 0)
0 0.00 0.00
So after GCs there is still 1M on the heap (without -G1). With -G1 I see:
34340656 20520040 20524800 0.10 0.12 0.76 0.85 0 0 (Gen: 0)
41697072 24917800 24922560 0.12 0.14 0.91 1.01 0 0 (Gen: 0)
70790776 800 2081568 0.00 0.02 1.04 1.20 0 0 (Gen: 0)
0 800 2081568 0.00 0.00 1.04 1.20 0 0 (Gen: 0)
"performed!"
39464 2184 1058952 0.00 0.00 1.05 1.21 0 0 (Gen: 0)
22264 2856 43784 0.00 0.00 1.05 1.21 0 0 (Gen: 0)
0 0.00 0.00
So about 2M. This is on x86_64/Linux.
Let's think about the STG machine storage model to see if there's something else on the heap.
Things that could be in that 1M of space:
- CAFs for things like [], string constants, and the small Int and Char pool, plus things in libraries, the stdin MVar?
- Thread State Objects (TSOs) for the main thread.
- Any allocated signal handlers.
- The IO manager Haskell code.
- Sparks in the spark pool.
From experience, this figure of slightly less than 1M seems to be the default "footprint" of a GHC binary. That's about what I've seen in other programs as well (e.g. the smallest footprints among the shootout programs are never less than 900K).
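One way to check that baseline footprint without reading -S output is to ask the RTS from inside the program. The following is a minimal sketch, assuming a much newer GHC than the 7.0.3 used above (the GHC.Stats API with getRTSStats appeared in GHC 8.2); toKiB is a hypothetical helper introduced only for display. The reported figure should roughly correspond to the "Live bytes" column of -S, though I have not compared the two on this program.

```haskell
import Data.Word (Word64)
import GHC.Stats (RTSStats(..), GCDetails(..), getRTSStats, getRTSStatsEnabled)
import System.Mem (performGC)

-- Hypothetical helper: convert a byte count to whole KiB for display.
toKiB :: Word64 -> Word64
toKiB bytes = bytes `div` 1024

main :: IO ()
main = do
  -- Statistics are only collected when the program runs with +RTS -T.
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "Run with +RTS -T to enable RTS statistics."
    else do
      -- Force a major collection so the live-bytes figure is fresh.
      performGC
      stats <- getRTSStats
      -- Live bytes recorded for the most recent collection.
      let live = gcdetails_live_bytes (gc stats)
      putStrLn ("Live after GC: " ++ show (toKiB live) ++ " KiB")
```

Compile with -rtsopts (as in the question) and run with +RTS -T. An otherwise empty program like this one gives a floor to compare against the numbers seen after the parse.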
Perhaps the profiler can say something. Here's the -hT profile (no profiling libs needed), after I insert a minimal busy loop at the end to string out the tail:
$ ./A +RTS -K10M -S -hT -i0.001
Results in this graph:
Victory! Look at that ~1M thread stack object sitting there!
I don't know of a way to make TSOs smaller.
The code that produced the above graph:
import Language.Haskell.Exts.Annotated -- from haskell-src-exts
import System.Mem
import System.IO
import Data.Int
import Control.Exception
main :: IO ()
main = do
    evaluate $ length $ show $ fromParseResult
        $ parseFileContents
        $ "data C = C {a :: F {- " ++ replicate 400000 'd' ++ " -} }"
    performGC
    performGC
    print "performed!"
    performGC
    -- busy loop so we can sample what's left on the heap.
    let go :: Int32 -> IO ()
        go 0 = return ()
        go n = go $! n-1
    go (maxBound :: Int32)
Compiling the code with -O -ddump-simpl, I see the following global definition in the simplifier output:
lvl2_r12F :: [GHC.Types.Char]
[GblId]
lvl2_r12F =
GHC.Base.unpackAppendCString# "data C = C {a :: F {- " lvl1_r12D
The input to the parser function has become a global string constant. Globals are never garbage collected in GHC, so that's probably what's occupying the 8 MB of memory after garbage collection.
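If the floated constant is indeed the cause, one thing to try is making the input depend on a value known only at run time, so the optimizer has no constant string to lift to the top level. The following is a minimal sketch under that assumption. buildInput is a hypothetical helper, the length is read from the command line, and the haskell-src-exts parse is left out so the example needs only base. I have not verified that this removes the retained memory on GHC 7.0.3.

```haskell
import Control.Exception (evaluate)
import System.Environment (getArgs)
import System.Mem (performGC)

-- Hypothetical helper: build the parser input from a length chosen at
-- run time, rather than from a literal the compiler can see.
buildInput :: Int -> String
buildInput n = "data C = C {a :: F {- " ++ replicate n 'd' ++ " -} }"

main :: IO ()
main = do
  args <- getArgs
  -- Take the comment length from the first argument, defaulting to 400000.
  let n = case args of
        (a:_) -> read a
        []    -> 400000
  -- Stand-in for the parse: force the whole string, then drop it.
  _ <- evaluate (length (buildInput n))
  performGC
  performGC
```

Running this with +RTS -S and comparing the live bytes after the final performGC against the original program would show whether the constant was what stayed alive. Checking the -ddump-simpl output again for a top-level unpackAppendCString# binding is another way to confirm.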