How does deriving work in Haskell?

Posted 2020-01-26 01:48

Question:

Algebraic Data Types (ADTs) in Haskell can automatically become instances of some typeclasses (like Show, Eq) by deriving from them.

data  Maybe a  =  Nothing | Just a
  deriving (Eq, Ord)

My question is, how does this deriving work, i.e. how does Haskell know how to implement the functions of the derived typeclass for the deriving ADT?

Also, why is deriving restricted to certain typeclasses only? Why can't I write my own typeclass which can be derived?

Answer 1:

The short answer is: magic :-). That is, automatic deriving is baked into the Haskell spec, and each compiler is free to implement it in its own way. There is a lot of work on making it extensible, however.

Derive is a tool for Haskell to let you write your own deriving mechanisms.

GHC used to provide a derivable type class extension called Generic Classes, but it was rarely used, as it was somewhat weak. That has now been taken out, and work is ongoing to integrate a new generic deriving mechanism as described in this paper: http://www.dreixel.net/research/pdf/gdmh.pdf

For more on this, see:

  • GHC wiki: http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/GenericDeriving
  • Haskell wiki: http://www.haskell.org/haskellwiki/Generics
  • Hackage: http://hackage.haskell.org/package/generic-deriving
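
As far as I can tell, the mechanism from that paper is what current GHC exposes as GHC.Generics, so a user-defined derivable class can be sketched roughly as below. The class Size, its helper GSize, and the example types are hypothetical names made up for this illustration; the sketch assumes a GHC recent enough to support DeriveGeneric, DefaultSignatures and DeriveAnyClass.

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveAnyClass    #-}
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE FlexibleContexts  #-}
{-# LANGUAGE TypeOperators     #-}

import GHC.Generics

-- Hypothetical class: counts the primitive fields stored in a value.
class Size a where
  size :: a -> Int
  -- Default used by derived (empty) instances: convert the value to its
  -- generic representation and count there.
  default size :: (Generic a, GSize (Rep a)) => a -> Int
  size = gsize . from

-- Worker class defined over the generic representation types.
class GSize f where
  gsize :: f p -> Int

instance GSize U1 where                              -- constructor without fields
  gsize U1 = 0
instance (GSize f, GSize g) => GSize (f :+: g) where -- choice between constructors
  gsize (L1 x) = gsize x
  gsize (R1 y) = gsize y
instance (GSize f, GSize g) => GSize (f :*: g) where -- several fields
  gsize (x :*: y) = gsize x + gsize y
instance Size c => GSize (K1 i c) where              -- a single field
  gsize (K1 x) = size x
instance GSize f => GSize (M1 i t f) where           -- metadata wrapper
  gsize (M1 x) = gsize x

-- Base cases written by hand.
instance Size Int  where size _ = 1
instance Size Bool where size _ = 1

-- With the machinery above, Size can appear in a deriving clause.
data Pair  = Pair Int Bool             deriving (Generic, Size)
data Shape = Circle Int | Rect Int Int deriving (Generic, Size)
```

With these definitions, size (Rect 3 4) counts two Int fields and size (Pair 5 True) counts one Int and one Bool, without any instance body having been written for Pair or Shape.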


Answer 2:

From the Haskell 98 report:

The only classes in the Prelude for which derived instances are allowed are Eq, Ord, Enum, Bounded, Show, and Read...

Here's the description of how to derive these type classes: http://www.haskell.org/onlinereport/derived.html#derived-appendix
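
To make that description concrete, here is a hand-written sketch of roughly what deriving (Eq, Ord) amounts to for a Maybe-like type. Option, None and Some are hypothetical stand-ins for Maybe, Nothing and Just so that the example does not clash with the Prelude; the code a compiler actually generates may differ in detail.

```haskell
-- Hypothetical stand-in for Maybe.
data Option a = None | Some a

-- Roughly what `deriving Eq` gives: two values are equal when they use the
-- same constructor and their fields are pairwise equal.
instance Eq a => Eq (Option a) where
  None   == None   = True
  Some x == Some y = x == y
  _      == _      = False

-- Roughly what `deriving Ord` gives: constructors declared earlier compare
-- as smaller; values with the same constructor compare their fields.
instance Ord a => Ord (Option a) where
  compare None     None     = EQ
  compare None     (Some _) = LT
  compare (Some _) None     = GT
  compare (Some x) (Some y) = compare x y
```

The instances are determined entirely by the shape of the data declaration (constructor order and field types), which is why the compiler can write them without further input.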



Answer 3:

It is possible to use Template Haskell to generate instance declarations in a similar way to deriving-clauses.

The following example is shamelessly stolen from the Haskell Wiki:

In this example we use the following Haskell code

$(gen_render ''Body)

to produce the following instance:

instance TH_Render Body where
  render (NormalB exp) = build 'normalB exp
  render (GuardedB guards) = build 'guardedB  guards

The function gen_render above is defined as follows. (Note that this code must be in a separate module from the above usage.)

-- Generate an instance of the class TH_Render for the type typName
gen_render :: Name -> Q [Dec]
gen_render typName =
  do (TyConI d) <- reify typName -- Get all the information on the type
     (type_name,_,_,constructors) <- typeInfo (return d) -- extract name and constructors                  
     i_dec <- gen_instance (mkName "TH_Render") (conT type_name) constructors
                      -- generation function for method "render"
                      [(mkName "render", gen_render)]
     return [i_dec]  -- return the instance declaration
             -- function to generate the function body for a particular function
             -- and constructor
       where gen_render (conName, components) vars 
                 -- function name is based on constructor name  
               = let funcName = makeName $ unCapalize $ nameBase conName 
                 -- choose the correct builder function
                     headFunc = case vars of
                                     [] -> "func_out"
                                     _  -> "build"
                      -- build 'funcName parm1 parm2 parm3 ...
                   in appsE $ (varE $ mkName headFunc):funcName:vars -- put it all together
             -- equivalent to 'funcStr where funcStr CONTAINS the name to be returned
             makeName funcStr = (appE (varE (mkName "mkName")) (litE $ StringL funcStr))

It uses the following functions and types.

First some type synonyms to make the code more readable.

type Constructor = (Name, [(Maybe Name, Type)]) -- the list of constructors
type Cons_vars = [ExpQ] -- A list of variables that bind in the constructor
type Function_body = ExpQ 
type Gen_func = Constructor -> Cons_vars -> Function_body
type Func_name = Name   -- The name of the instance function we will be creating
-- For each function in the instance we provide a generator function
-- to generate the function body (the body is generated for each constructor)
type Funcs = [(Func_name, Gen_func)]

The main reusable function. We pass it the list of functions to generate the functions of the instance.

-- construct an instance of class class_name for type for_type
-- funcs is a list of instance method names with a corresponding
-- function to build the method body
gen_instance :: Name -> TypeQ -> [Constructor] -> Funcs -> DecQ
gen_instance class_name for_type constructors funcs = 
  instanceD (cxt [])
    (appT (conT class_name) for_type)
    (map func_def funcs) 
      where func_def (func_name, gen_func) 
                = funD func_name -- method name
                  -- generate function body for each constructor
                  (map (gen_clause gen_func) constructors)

A helper function of the above.

-- Generate the pattern match and function body for a given method and
-- a given constructor. func_body is a function that generates the
-- function body
gen_clause :: (Constructor -> [ExpQ] -> ExpQ) -> Constructor -> ClauseQ
gen_clause func_body data_con@(con_name, components) = 
      -- create a parameter for each component of the constructor
   do vars <- mapM var components
      -- function (unnamed) that pattern matches the constructor 
      -- mapping each component to a value.
      (clause [(conP con_name (map varP vars))]
            (normalB (func_body data_con (map varE vars))) [])
       -- create a unique name for each component. 
       where var (_, typ) 
                 = newName 
                   $ case typ of 
                     (ConT name) -> toL $ nameBase name
                     _           -> "parm"
               where toL (x:y) = (toLower x):y

unCapalize :: [Char] -> [Char]
unCapalize (x:y) = (toLower x):y

And some borrowed helper code taken from Syb III / replib 0.2.

typeInfo :: DecQ -> Q (Name, [Name], [(Name, Int)], [(Name, [(Maybe Name, Type)])])
typeInfo m =
     do d <- m
        case d of
           d@(DataD _ _ _ _ _) ->
            return $ (simpleName $ name d, paramsA d, consA d, termsA d)
           d@(NewtypeD _ _ _ _ _) ->
            return $ (simpleName $ name d, paramsA d, consA d, termsA d)
           _ -> error ("derive: not a data type declaration: " ++ show d)

     where
        consA (DataD _ _ _ cs _)    = map conA cs
        consA (NewtypeD _ _ _ c _)  = [ conA c ]

        {- This part no longer works on 7.6.3
        paramsA (DataD _ _ ps _ _) = ps
        paramsA (NewtypeD _ _ ps _ _) = ps
        -}

        -- Use this on more recent GHC rather than the above
        paramsA (DataD _ _ ps _ _) = map nameFromTyVar ps
        paramsA (NewtypeD _ _ ps _ _) = map nameFromTyVar ps

        nameFromTyVar (PlainTV a) = a
        nameFromTyVar (KindedTV a _) = a


        termsA (DataD _ _ _ cs _) = map termA cs
        termsA (NewtypeD _ _ _ c _) = [ termA c ]

        termA (NormalC c xs)        = (c, map (\x -> (Nothing, snd x)) xs)
        termA (RecC c xs)           = (c, map (\(n, _, t) -> (Just $ simpleName n, t)) xs)
        termA (InfixC t1 c t2)      = (c, [(Nothing, snd t1), (Nothing, snd t2)])

        conA (NormalC c xs)         = (simpleName c, length xs)
        conA (RecC c xs)            = (simpleName c, length xs)
        conA (InfixC _ c _)         = (simpleName c, 2)

        name (DataD _ n _ _ _)      = n
        name (NewtypeD _ n _ _ _)   = n
        name d                      = error $ show d

simpleName :: Name -> Name
simpleName nm =
   let s = nameBase nm
   in case dropWhile (/=':') s of
        []          -> mkName s
        _:[]        -> mkName s
        _:t         -> mkName t
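
For comparison, the same reify-then-generate idea can be sketched in a single module when the splice only calls functions imported from the template-haskell library, so the separate-module requirement does not come into play. The class ConName and the type Color are hypothetical names for this illustration, and the six-field DataD pattern assumes GHC 8.0 or later.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH

-- Hypothetical class: report the constructor a value was built with.
class ConName a where
  conName :: a -> String

data Color = Red | Green | Blue

-- Top-level splice: look up Color at compile time and emit
--   instance ConName Color where
--     conName Red   = "Red"
--     conName Green = "Green"
--     conName Blue  = "Blue"
$(do info <- reify ''Color
     cons <- case info of
       -- assumption: six-field DataD, as in template-haskell 2.11 and later
       TyConI (DataD _ _ _ _ cs _) -> return cs
       _                           -> fail "expected a data declaration"
     let clauseFor :: Con -> Q Clause
         -- one equation per constructor, returning its name as a string
         clauseFor (NormalC n []) =
           clause [conP n []] (normalB (litE (stringL (nameBase n)))) []
         clauseFor _ = fail "only constructors without fields are handled"
     inst <- instanceD (cxt [])
               (appT (conT ''ConName) (conT ''Color))
               [funD 'conName (map clauseFor cons)]
     return [inst])
```

Code placed after the splice can then call conName on any Color value. Unlike gen_render above, this sketch only handles constructors without fields.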