Basic I/O performance in Haskell

2019-03-31 03:01发布

Another microbenchmark: Why is this "loop" (compiled with ghc -O2 -fllvm, 7.4.1, Linux 64bit 3.2 kernel, redirected to /dev/null)

mapM_ print [1..100000000]

about 5x slower than a simple for-cycle in plain C with write(2) non-buffered syscall? I am trying to gather Haskell gotchas.

Even this slow C solution is much faster than Haskell

int i;
char buf[16];
for (i=0; i<=100000000; i++) {
    sprintf(buf, "%d\n", i);
    write(1, buf, strlen(buf));
}

标签: haskell ghc
3条回答
Deceive 欺骗
2楼-- · 2019-03-31 03:47

Okay, on my box the C code, compiled per gcc -O3 takes about 21.5 seconds to run, the original Haskell code about 56 seconds. So not a factor of 5, a bit above 2.5.

The first nontrivial difference is that

mapM_ print [1..100000000]

uses Integers, that's a bit slower because it involves a check upfront, and then works with boxed Ints, while the Show instance of Int does the conversion work on unboxed Int#s.

Adding a type signature, so that the Haskell code works on Ints,

mapM_ print [1 :: Int .. 100000000]

brings the time down to 47 seconds, a bit above twice the time the C code takes.

Now, another big difference is that show produces a linked list of Char and doesn't just fill a contiguous buffer of bytes. That is slower too.

Then that linked list of Chars is used to fill a byte buffer that then is written to the stdout handle.

So, the Haskell code does more, and more complicated things than the C code, thus it's not surprising that it takes longer.

Admittedly, it would be desirable to have an easy way to output such things more directly (and hence faster). However, the proper way to handle it is to use a more suitable algorithm (that applies to C too). A simple change to

putStr . unlines $ map show [0 :: Int .. 100000000]

almost halves the time taken, and if one wants it really fast, one uses the faster ByteString I/O and builds the output efficiently as exemplified in applicative's answer.

查看更多
Summer. ? 凉城
3楼-- · 2019-03-31 03:53

On my (rather slow and outdated) machine the results are:

$ time haskell-test > haskell-out.txt
real    1m57.497s
user    1m47.759s
sys     0m9.369s
$ time c-test > c-out.txt
real    7m28.792s
user    1m9.072s
sys     6m13.923s
$ diff haskell-out.txt c-out.txt
$

(I have fixed the list so that both C and Haskell start with 0).

Yes you read this right. Haskell is several times faster than C. Or rather, normally buffered Haskell is faster than C with write(2) non-buffered syscall.

(When measuring output to /dev/null instead of a real disk file, C is about 1.5 times faster, but who cares about /dev/null performance?)

Technical data: Intel E2140 CPU, 2 cores, 1.6 GHz, 1M cache, Gentoo Linux, gcc4.6.1, ghc7.6.1.

查看更多
贼婆χ
4楼-- · 2019-03-31 04:04

The standard Haskell way to hand giant bytestrings over to the operating system is to use a builder monoid.

import Data.ByteString.Lazy.Builder  -- requires bytestring-0.10.x
import Data.ByteString.Lazy.Builder.ASCII -- omit for bytestring-0.10.2.x
import Data.Monoid
import System.IO

main = hPutBuilder stdout $ build  [0..100000000::Int]

build = foldr add_line mempty
   where add_line n b = intDec n <> charUtf8 '\n' <> b

which gives me:

 $ time ./printbuilder >> /dev/null
 real   0m7.032s
 user   0m6.603s
 sys    0m0.398s

in contrast to Haskell approach you used

$ time ./print >> /dev/null
real    1m0.143s
user    0m58.349s
sys 0m1.032s

That is, it's child's play to do nine times better than mapM_ print, contra Daniel Fischer's suprising defeatism. Everything you need to know is here: http://hackage.haskell.org/packages/archive/bytestring/0.10.2.0/doc/html/Data-ByteString-Builder.html I won't compare it with your C since my results were much slower than Daniel's and n.m. so I figure something was going wrong.

Edit: Made the imports consistent with all versions of bytestring-0.10.x It occurred to me the following might be clearer -- the Builder equivalent of unlines . map show:

main = hPutBuilder stdout $ unlines_ $ map intDec [0..100000000::Int]
 where unlines_ = mconcat . map (<> charUtf8 '\n')
查看更多
登录 后发表回答