Making a single function work on lists, ByteString

Posted 2019-02-09 05:24

Question:

I'm writing a function that does some searching in a sequence of arbitrary symbols. I'd like to make it generic enough that it works on lists and other Foldables as well as on ByteStrings and Texts. Generalizing it to Foldable is simple. But how do I include ByteStrings and Texts? Sure, I could convert a ByteString into a list and then call my function, but I'd lose all the advantages of ByteStrings.

To have a concrete example let's say we want to make a histogram function:

import Control.Monad.State
import qualified Data.Foldable as F
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Word
import qualified Data.ByteString as B
import qualified Data.Text as T

type Histogram a = Map a Int

empty :: (Ord a) => Histogram a
empty = Map.empty

histogramStep :: (Ord a) => a -> Histogram a -> Histogram a
histogramStep k = Map.insertWith (+) k 1

histogram :: (Ord a, F.Foldable t) => t a -> Histogram a
histogram = F.foldl (flip histogramStep) empty

But since neither ByteString nor Text can be Foldable (they store just Word8s/Chars, not arbitrary elements), I'm stuck with creating more functions that look exactly like the one before, just with different type signatures:

histogramBS :: B.ByteString -> Histogram Word8
histogramBS = B.foldl (flip histogramStep) empty

histogramText :: T.Text -> Histogram Char
histogramText = T.foldl (flip histogramStep) empty

This is something one does not expect in a functional language like Haskell.

How can I make it generic, so that histogram is written once and for all?

Answer 1:

Your solution is pretty much what the ListLike package does. There's also the additional package listlike-instances which adds instances for Text and Vector.



Answer 2:

After a while I made a solution myself, but I'm not sure if it could be solved in a better way, or if someone already did this in some library.

I created a type class using the TypeFamilies extension:

class Foldable' t where
    type Element t :: *
    foldlE :: (b -> Element t -> b) -> b -> t -> b
    -- other functions could be copied here from Foldable

and instances:

newtype WrapFoldable f a = WrapFoldable { unwrapFoldable :: f a }
instance (F.Foldable f) => Foldable' (WrapFoldable f a) where
    type Element (WrapFoldable f a) = a
    foldlE f z = F.foldl f z . unwrapFoldable

instance Foldable' B.ByteString where
    type Element B.ByteString = Word8
    foldlE = B.foldl


instance Foldable' T.Text where
    type Element T.Text = Char
    foldlE = T.foldl

or, even better, with FlexibleInstances, which makes the newtype wrapper unnecessary:

instance (F.Foldable t) => Foldable' (t a) where
    type Element (t a) = a
    foldlE = F.foldl

Now I can write (with FlexibleContexts):

histogram :: (Ord (Element t), Foldable' t) => t -> Histogram (Element t)
histogram = foldlE (flip histogramStep) empty

and use it on Foldables, ByteStrings, Texts etc.
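
For reference, the pieces above can be assembled into a single module roughly as follows. This is only a sketch under a few assumptions: it uses the FlexibleInstances variant rather than the WrapFoldable newtype, it swaps in the strict folds (F.foldl', B.foldl', T.foldl') in place of the lazy ones used above, it writes the kind as Type from Data.Kind instead of *, and it inlines histogramStep as a lambda.

```haskell
{-# LANGUAGE TypeFamilies, FlexibleInstances, FlexibleContexts #-}

import qualified Data.ByteString as B
import qualified Data.Foldable as F
import Data.Kind (Type)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import qualified Data.Text as T
import Data.Word (Word8)

type Histogram a = Map a Int

-- A monomorphic counterpart of Foldable: the container type t
-- determines its element type through the associated type family.
class Foldable' t where
  type Element t :: Type
  foldlE :: (b -> Element t -> b) -> b -> t -> b

-- Every ordinary Foldable applied to an element type is covered here.
instance F.Foldable t => Foldable' (t a) where
  type Element (t a) = a
  foldlE = F.foldl'

instance Foldable' B.ByteString where
  type Element B.ByteString = Word8
  foldlE = B.foldl'

instance Foldable' T.Text where
  type Element T.Text = Char
  foldlE = T.foldl'

-- Written once; works for lists, Maps, ByteString, Text, ...
histogram :: (Ord (Element t), Foldable' t) => t -> Histogram (Element t)
histogram = foldlE (\h k -> Map.insertWith (+) k 1 h) Map.empty
```

With this loaded in GHCi, histogram "abca", histogram (B.pack [1, 2, 1]) and histogram (T.pack "aab") all go through the same definition. The ByteString and Text instances do not clash with the generic one because neither type has the shape t a.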

  • Is there another (perhaps simpler) way to do it?
  • Is there some library that addresses this problem (in this way or another)?


Answer 3:

You might consider objectifying folds themselves:

{-# LANGUAGE GADTs #-}
import Data.List (foldl')
import qualified Data.ByteString.Lazy as B
import qualified Data.Vector.Unboxed as V
import qualified Data.Text as T
import qualified Data.Map as Map
import Data.Word
type Histogram a = Map.Map a Int

empty :: (Ord a) => Histogram a
empty = Map.empty
histogramStep :: (Ord a) => Histogram a -> a -> Histogram a
histogramStep h k = Map.insertWith (+) k 1 h 

histogram :: Ord b => Fold b (Histogram b)
histogram = Fold histogramStep empty id

histogramT :: T.Text -> Histogram Char
histogramT = foldT histogram
histogramB :: B.ByteString -> Histogram Word8
histogramB = foldB histogram 
histogramL :: Ord b => [b] -> Histogram b
histogramL = foldL histogram

-- helper library
-- see http://squing.blogspot.fr/2008/11/beautiful-folding.html
-- note existential type
data Fold b c where
  Fold :: (a -> b -> a) -> !a -> (a -> c) -> Fold b c

instance Functor (Fold b) where
  fmap f (Fold op x g) = Fold op x (f . g)

foldL :: Fold b c -> [b] -> c
foldL (Fold f x c) bs = c $ (foldl' f x bs)

foldV :: V.Unbox b => Fold b c -> V.Vector b -> c
foldV (Fold f x c) bs = c $ (V.foldl' f x bs)

foldT :: Fold Char t -> T.Text -> t
foldT (Fold f x c) t = c $ (T.foldl' f x t)

foldB :: Fold Word8 t -> B.ByteString -> t
foldB (Fold f x c) t = c $ (B.foldl' f x t)


sum_, product_ :: Num a => Fold a a
sum_ = Fold (+) 0 id
product_ = Fold (*) 1 id

length_ :: Fold a Int
length_ = Fold (const . (+1)) 0 id
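
-- Starting from 0 means this is only correct for non-negative inputs.
maximum_ :: (Num a, Ord a) => Fold a a
maximum_ = Fold max 0 id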
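
One thing a first-class Fold value buys you is composition: two folds can be merged into one that traverses the input only once. The sketch below illustrates this with an Applicative instance. The Fold type, foldL, sum_ and length_ follow the definitions above (length_ is rewritten as an explicit lambda); the Pair type, the Applicative instance and mean are additions made here for illustration and are not part of the original answer.

```haskell
{-# LANGUAGE GADTs #-}

import Data.List (foldl')

-- Same shape as the Fold above: step function, initial state, extractor.
data Fold b c where
  Fold :: (a -> b -> a) -> !a -> (a -> c) -> Fold b c

instance Functor (Fold b) where
  fmap f (Fold op x g) = Fold op x (f . g)

-- Strict pair (hypothetical helper) so both accumulators are forced
-- at every step.
data Pair a b = Pair !a !b

-- Running two folds side by side yields another Fold.
instance Applicative (Fold b) where
  pure c = Fold (\() _ -> ()) () (const c)
  Fold fL xL gL <*> Fold fR xR gR = Fold step (Pair xL xR) done
    where
      step (Pair l r) b = Pair (fL l b) (fR r b)
      done (Pair l r) = gL l (gR r)

foldL :: Fold b c -> [b] -> c
foldL (Fold f x c) bs = c (foldl' f x bs)

sum_ :: Num a => Fold a a
sum_ = Fold (+) 0 id

length_ :: Fold a Int
length_ = Fold (\n _ -> n + 1) 0 id

-- Sum and length are computed together in a single traversal.
mean :: Fold Double Double
mean = (/) <$> sum_ <*> fmap fromIntegral length_
```

For example, foldL mean [1, 2, 3, 4] evaluates to 2.5. The same mean value could be handed to foldT, foldB or foldV unchanged whenever the element types line up, so the per-container fold functions only have to be written once.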


Answer 4:

I found another solution using the lens package, which has a detailed type-class hierarchy identifying different kinds of data structures. Like applicative's answer, it objectifies folds:

{-# LANGUAGE RankNTypes #-}
import Control.Monad.State
import qualified Data.Foldable as F
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Word
import qualified Data.ByteString as B
import qualified Data.Text as T

import Control.Lens.Fold
import qualified Data.ByteString.Lens as LBS
import qualified Data.Text.Lens as LT

type Histogram a = Map a Int

empty :: (Ord a) => Histogram a
empty = Map.empty

histogramStep :: (Ord a) => a -> Histogram a -> Histogram a
histogramStep k = Map.insertWith (+) k 1

-- Histogram on anything that can be folded into `a`:

histogram :: (Ord a) => Fold c a -> c -> Histogram a
histogram f = foldlOf f (flip histogramStep) empty

-- Specializations are simple:

histogramF :: (Ord a, F.Foldable t) => t a -> Histogram a
histogramF = histogram folded

histogramBS :: B.ByteString -> Histogram Word8
histogramBS = histogram LBS.bytes

histogramText :: T.Text -> Histogram Char
histogramText = histogram LT.text