Automatic differentiation (AD) with respect to lis

2019-07-09 05:18发布

问题:

I am trying to understand how can I use Numeric.AD (automatic differentiation) in Haskell.

I defined a simple matrix type and a scalar function taking an array and two matrices as arguments. I want to use AD to get the gradient of the scoring function with respect to both matrices, but I'm running into compilation problems. Here is the code

{-# LANGUAGE DeriveTraversable, DeriveFunctor, DeriveFoldable #-}
import Numeric.AD.Mode.Reverse as R
import Data.Traversable as T
import Data.Foldable as F

--- Non-linear function on "vectors"
logistic x = 1.0 / (1.0 + exp(-x) )
phi v = map logistic v
phi' (x:xs) = x : (phi xs)

--- dot product
dot u v = foldr (+) 0 $ zipWith (*) u v

--- simple matrix type
data Matrix a = M [[a]] deriving (Eq,Show,Functor,F.Foldable,T.Traversable)

--- action of a matrix on a vector
mv _ [] = []
mv (M []) _ = []
mv ( M m ) v = ( dot (head m)  v ) :  (mv (M (tail m)) v )

--- two matrices
mbW1 = M $ [[1,0,0],[-1,5,1],[1,2,-3]]
mbW2 = M $ [[0,0,0],[1,3,-1],[-2,4,6]]

--- two different scoring functions
sc1 v m = foldr (+) 0 $ (phi' . (mv m) )  v  

sc2 :: Floating a => [a] -> [Matrix a] -> a
sc2 v [m1, m2] = foldr (+) 0 $ (phi' . (mv m2) . phi' . (mv m1) ) v

strToInt = read :: String -> Double
strLToIntL = map strToInt
--- testing
main = do
        putStrLn $ "mbW1:" ++ (show mbW1)
        putStrLn $ "mbW2:" ++ (show mbW2)
        rawInput <-  readFile "/dev/stdin"
        let xin= strLToIntL $ lines rawInput
        putStrLn "sc xin mbW1"
        print $ sc1 xin mbW1  --- ok. = 
        putStrLn "grad (sc1 xin) mbW1"
        print $ grad ( sc1 xin) mbW1   -- yields an error: expects xin [Reverse s Double] instead of [Double]
        putStrLn "grad (sc1 [3,5,7]) mbW1"
        print $ grad ( sc1 [3,5,7]) mbW1   --- ok. =
        putStrLn "sc2 xin [mbW1,mbW2]"
        print $ sc2 xin [mbW1, mbW2]
        putStrLn "grad (sc2 [3,5,7) [mbW1,mbW2]"
        print $ grad ( sc2 [3,5,7]) [mbW1, mbW2]  --- Error: see text

The last line (grad on sc2) gives the following error:

Couldn't match type ‘Reverse s (Matrix Double)’
               with ‘Matrix (Reverse s (Matrix Double))’
Expected type: [Reverse s (Matrix Double)]
               -> Reverse s (Matrix Double)
  Actual type: [Matrix (Reverse s (Matrix Double))]
               -> Reverse s (Matrix Double)
In the first argument of ‘grad’, namely ‘(sc2 [3, 5, 7])’
In the second argument of ‘($)’, namely
  ‘grad (sc2 [3, 5, 7]) [mbW1, mbW2]’

I don't understand where the "Matrix of Matrix" in the actual type seen comes from. I'm feeding the grad with a curried version of sc2, making it a function on a list of Matrix.

Commenting out the two offending lines runs without problem, i.e., the first gradient works and is correctly calculated (I'm feeding [1,2,3] as input to the program):

mbW1:M [[1.0,0.0,0.0],[-1.0,5.0,1.0],[1.0,2.0,-3.0]]
mbW2:M [[0.0,0.0,0.0],[1.0,3.0,-1.0],[-2.0,4.0,6.0]]
sc1 xin mbW1
1
2
3
2.0179800657874893
grad (sc1 [3,5,7]) mbW1
M [[3.0,5.0,7.0],[7.630996942126885e-13,1.2718328236878141e-12,1.7805659531629398e-12],[1.0057130122694228e-3,1.6761883537823711e-3,2.3466636952953197e-3]]
sc2 xin [mbW1,mbW2]
1.8733609463863194

Both errors are an issue. I want to take the gradient of any such sc2 scoring function, depending on an array of matrices, evaluated at any given "point" xin. Clearly, I'm not yet understanding the AD library well enough. Any help would be appreciated.