-->

Function like catMaybes, but counting Nothing valu

2019-06-15 13:49发布

问题:

I have a list like this:

let foo = [Just 1, Just 2, Nothing, Just 3, Nothing, Nothing]

By using catMaybes I can extract only the Just-constructed values:

catMaybes foo -- [1,2,3]

I'm now looking for a function that not only yields a list of Justs but also a count of Nothings for a finite list by traversing it once. It should have a signature like this:

catMaybesCount :: [Maybe a] -> ([a], Int)

Note: This question was answered Q&A-style and therefore intentionally does not show any research effort!

回答1:

import Data.Monoid
import Data.Foldable

catMaybesCount = foldMap inject where
    inject Nothing  = ([ ], Sum 1)
    inject (Just x) = ([x], Sum 0)


回答2:

We can have a left folding for strict counting and a right folding for lazy list building at the same time:

catMC :: (Num t) => [Maybe a] -> ([a], t)
catMC xs = g 0 xs 
  where 
    g !c (Nothing:xs) = g (c+1) xs 
    g !c (Just  v:xs) = let (a,b)=g c xs in (v:a,b) 
    g !c []           = ([],c)

This works for infinite lists too, as long as we don't access the count field (snd) of the result, while calculating the count in a strict, efficient manner, much like a mutable accumulator variable.



回答3:

My choice would be just a foldr:

import Control.Arrow

catMaybesCount :: [Maybe a] -> ([a], Int)
catMaybesCount = foldr (maybe (second succ) (first . (:))) ([], 0)

There are pros and cons to both left and right folds in this case, as right fold makes the list result properly lazy and efficient, while strict left folding computes the length result more efficiently.



回答4:

I'd use the Writer monad:

import Control.Arrow ( (***) )
import Data.Monoid ( Sum(..) )
import Control.Monad.Writer ( execWriter, tell )

catMaybesCount xs = (id *** getSum) $ execWriter (mapM_ go xs)
   where
      go (Just v) = tell ([v], Sum 0)
      go Nothing = tell ([], Sum 1)

The (++)s should right-associate given the definition of mapM_.



回答5:

The most naive solution would be to simply do both evalutations independently:

catMaybesCount :: [Maybe a] -> ([a], Int)
catMaybesCount xs = (catMaybes xs, length $ filter isNothing xs)

I don't know if GHC is able to optimize this properly, but the length . filter p solution for counting Nothings has some peculiarities anyway (see this SO post for an overview).

Theoretically, this solution could require two passes over the list, instead of the one

This is a recursive solution solving this issue I came up with:

import Data.Maybe

-- | Equivalent to @catMaybes@, but additonally counts @Nothing@ values
catMaybesCount :: [Maybe a] -> ([a], Int)
catMaybesCount xs =  catMaybesCountWorker xs [] 0

-- | Worker function for @catMaybesCount@   
catMaybesCountWorker :: [Maybe a] -> [a] -> Int -> ([a], Int)
catMaybesCountWorker [] justs cnt = (justs, cnt)
catMaybesCountWorker (Nothing:xs) justs cnt =
    catMaybesCountWorker xs justs (cnt + 1)
catMaybesCountWorker ((Just v):xs) justs cnt =
    catMaybesCountWorker xs (justs ++ [v]) cnt

As applying it to a list should evaluate the list only once, this should be more efficient.

However I am worried about the justs ++ [v] anti-idiom, as (:) would be more efficient (see this discussion). However, this would invert the resulting list. Maybe someone with more knowledge on this topic could have a look at it?

Note that this function won't terminate for infinite lists because the Nothing count will never finish to evaluate.



回答6:

For cases like this, the foldl package by Gabriel Gonzalez is very handy. You can simply use the predefined folds or define custom ones like below and combine them using an applicative interface:

import qualified Control.Foldl as L
import Control.Applicative ((<$>),(<*>))
import Data.Monoid
import qualified Data.DList as DL

catMaybesCount :: [Maybe a] -> ([a], Int)
catMaybesCount = L.fold $ (,) <$> elemJust <*> countJust

-- L.Fold :: (x -> a -> x) -> x -> (x -> b) -> L.Fold a b

elemJust :: L.Fold (Maybe a) [a]
elemJust = L.Fold step DL.empty DL.toList
  where step xs (Just x) = DL.snoc xs x
        step xs Nothing = xs

countJust :: L.Fold (Maybe a) Int
countJust = L.Fold step (Sum 0) getSum
  where step acc (Just _) = acc
        step acc Nothing = acc <> Sum 1