Converting non-negative Integer
to its list of digits is commonly done like this:
import Data.Char
digits :: Integer -> [Int]
digits = (map digitToInt) . show
I was trying to find a more direct way to perform the task, without involving a string conversion, but I'm unable to come up with something faster.
Things I've been trying so far:
The baseline:
digits :: Int -> [Int]
digits = (map digitToInt) . show
Got this one from another question on StackOverflow:
digits2 :: Int -> [Int]
digits2 = map (`mod` 10) . reverse . takeWhile (> 0) . iterate (`div` 10)
Trying to roll my own:
digits3 :: Int -> [Int]
digits3 = reverse . revDigits3
revDigits3 :: Int -> [Int]
revDigits3 n = case divMod n 10 of
(0, digit) -> [digit]
(rest, digit) -> digit:(revDigits3 rest)
This one was inspired by showInt
in Numeric
:
digits4 n0 = go n0 [] where
go n cs
| n < 10 = n:cs
| otherwise = go q (r:cs)
where
(q,r) = n `quotRem` 10
Now the benchmark. Note: I'm forcing the evaluation using filter
.
λ>:set +s
λ>length $ filter (>5) $ concat $ map (digits) [1..1000000]
2400000
(1.58 secs, 771212628 bytes)
This is the reference. Now for digits2
:
λ>length $ filter (>5) $ concat $ map (digits2) [1..1000000]
2400000
(5.47 secs, 1256170448 bytes)
That's 3.46 times longer.
λ>length $ filter (>5) $ concat $ map (digits3) [1..1000000]
2400000
(7.74 secs, 1365486528 bytes)
digits3
is 4.89 time slower. Just for fun, I tried using only revDigits3 and avoid the reverse
.
λ>length $ filter (>5) $ concat $ map (revDigits3) [1..1000000]
2400000
(8.28 secs, 1277538760 bytes)
Strangely, this is even slower, 5.24 times slower.
And the last one:
λ>length $ filter (>5) $ concat $ map (digits4) [1..1000000]
2400000
(16.48 secs, 1779445968 bytes)
This is 10.43 time slower.
I was under the impression that only using arithmetic and cons would outperform anything involving a string conversion. Apparently, there something I can't grasp.
So what the trick? Why is digits
so fast?
I'm using GHC 6.12.3.
Answering the question "why rem instead of mod?" in the comments. When dealing with positive values
rem x y === mod x y
so the only consideration of note is performance:So what is the performance? Unless you have a good reason not to (and being lazy isn't a good reason, neither is not knowing Criterion) then use a good benchmark tool, I used Criterion:
Running this shows
rem
is measurably better thanmod
(compiled with-O2
):Seeing as I can't add comments yet, I'll do a little bit more work and just analyze all of them. I'm putting the analysis at the top; however, the relevant data is below. (Note: all of this is done in 6.12.3 as well - no GHC 7 magic yet.)
Analysis:
Version 1: show is pretty good for ints, especially those as short as we have. Making strings actually tends to be decent in GHC; however reading to strings and writing large strings to files (or stdout, although you wouldn't want to do that) are where your code can absolutely crawl. I would suspect that a lot of the details behind why this is so fast are due to clever optimizations within show for Ints.
Version 2: This one was the slowest of the bunch when compiled. Some problems: reverse is strict in its argument. What this means is that you don't benefit from being able to perform computations on the first part of the list while you're computing the next elements; you have to compute them all, flip them, and then do your computations (namely (`mod` 10) ) on the elements of the list. While this may seem small, it can lead to greater memory usage (note the 5GB of heap memory allocated here as well) and slower computations. (Long story short: don't use reverse.)
Version 3: Remember how I just said don't use reverse? Turns out, if you take it out, this one drops to 1.79s total execution time - barely slower than the baseline. The only problem here is that as you go deeper into the number, you're building up the spine of the list in the wrong direction (essentially, you're consing "into" the list with recursion, as opposed to consing "onto" the list).
Version 4: This is a very clever implementation. You benefit from several nice things: for one, quotRem should use the Euclidean algorithm, which is logarithmic in its larger argument. (Maybe it's faster, but I don't believe there's anything that's more than a constant factor faster than Euclid.) Furthermore, you cons onto the list as discussed last time, so that you don't have to resolve any list thunks as you go - the list is already entirely constructed when you come back around to parse it. As you can see, the performance benefits from this.
This code was probably the slowest in GHCi because a lot of the optimizations performed with the -O3 flag in GHC deal with making lists faster, whereas GHCi wouldn't do any of that.
Lessons: cons the right way onto a list, watch for intermediate strictness that can slow down computations, and do some legwork in looking at the fine-grained statistics of your code's performance. Also compile with the -O3 flags: whenever you don't, all those people who put a lot of hours into making GHC super-fast get big ol' puppy eyes at you.
Data:
I just took all four functions, stuck them into one .hs file, and then changed as necessary to reflect the function in use. Also, I bumped your limit up to 5e6, because in some cases compiled code would run in less than half a second on 1e6, and this can start to cause granularity problems with the measurements we're making.
Compiler options: use ghc --make -O3 [filename].hs to have GHC do some optimization. We'll dump statistics to standard error using digits +RTS -sstderr.
Dumping to -sstderr gives us output that looks like this, in the case of digits1:
There are three key statistics here:
Alright, let's move on to version 2.
Alright, so we're seeing an interesting pattern.
Version 3:
Alright, so we're seeing some strange patterns.
And finally, version 4:
Wowza! Let's break it down: