I'm trying to understand how to use Numeric.AD (automatic differentiation) in Haskell.

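I defined a simple matrix type and a scalar scoring function that takes an array and two matrices as arguments. I want to use AD to get the gradient of the scoring function with respect to the two matrices, but I'm running into compilation problems. Here is the code: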

{-# LANGUAGE DeriveTraversable, DeriveFunctor, DeriveFoldable #-}
import Numeric.AD.Mode.Reverse as R
import Data.Traversable as T
import Data.Foldable as F

--- Non-linear function on "vectors"
logistic x = 1.0 / (1.0 + exp(-x) )
phi v = map logistic v
phi' (x:xs) = x : (phi xs)

--- dot product
dot u v = foldr (+) 0 $ zipWith (*) u v

--- simple matrix type
data Matrix a = M [[a]] deriving (Eq,Show,Functor,F.Foldable,T.Traversable)

--- action of a matrix on a vector
mv _ [] = []
mv (M []) _ = []
mv ( M m ) v = ( dot (head m)  v ) :  (mv (M (tail m)) v )

--- two matrices
mbW1 = M $ [[1,0,0],[-1,5,1],[1,2,-3]]
mbW2 = M $ [[0,0,0],[1,3,-1],[-2,4,6]]

--- two different scoring functions
sc1 v m = foldr (+) 0 $ (phi' . (mv m) )  v  

sc2 :: Floating a => [a] -> [Matrix a] -> a
sc2 v [m1, m2] = foldr (+) 0 $ (phi' . (mv m2) . phi' . (mv m1) ) v

strToInt = read :: String -> Double
strLToIntL = map strToInt
--- testing
main = do
        putStrLn $ "mbW1:" ++ (show mbW1)
        putStrLn $ "mbW2:" ++ (show mbW2)
        rawInput <-  readFile "/dev/stdin"
        let xin= strLToIntL $ lines rawInput
        putStrLn "sc1 xin mbW1"
        print $ sc1 xin mbW1  --- ok. = 
        putStrLn "grad (sc1 xin) mbW1"
        print $ grad ( sc1 xin) mbW1   -- yields an error: expects xin [Reverse s Double] instead of [Double]
        putStrLn "grad (sc1 [3,5,7]) mbW1"
        print $ grad ( sc1 [3,5,7]) mbW1   --- ok. =
        putStrLn "sc2 xin [mbW1,mbW2]"
        print $ sc2 xin [mbW1, mbW2]
        putStrLn "grad (sc2 [3,5,7]) [mbW1,mbW2]"
        print $ grad ( sc2 [3,5,7]) [mbW1, mbW2]  --- Error: see text

The last line (grad on sc2) gives the following error:

Couldn't match type ‘Reverse s (Matrix Double)’
               with ‘Matrix (Reverse s (Matrix Double))’
Expected type: [Reverse s (Matrix Double)]
               -> Reverse s (Matrix Double)
  Actual type: [Matrix (Reverse s (Matrix Double))]
               -> Reverse s (Matrix Double)
In the first argument of ‘grad’, namely ‘(sc2 [3, 5, 7])’
In the second argument of ‘($)’, namely
  ‘grad (sc2 [3, 5, 7]) [mbW1, mbW2]’

I don't understand where the "Matrix of Matrix" in the actual type comes from. I'm passing grad a curried version of sc2, which makes it a function of a list of Matrix.

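With the two offending lines commented out, the program runs without problems, i.e. the first gradient works and is computed correctly (I fed [1,2,3] to the program as input):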

mbW1:M [[1.0,0.0,0.0],[-1.0,5.0,1.0],[1.0,2.0,-3.0]]
mbW2:M [[0.0,0.0,0.0],[1.0,3.0,-1.0],[-2.0,4.0,6.0]]
sc1 xin mbW1
1
2
3
2.0179800657874893
grad (sc1 [3,5,7]) mbW1
M [[3.0,5.0,7.0],[7.630996942126885e-13,1.2718328236878141e-12,1.7805659531629398e-12],[1.0057130122694228e-3,1.6761883537823711e-3,2.3466636952953197e-3]]
sc2 xin [mbW1,mbW2]
1.8733609463863194

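Both errors are a problem for me. I want to take the gradient of any sc2-style scoring function, which depends on an array of matrices, evaluated at any given "point" xin. Clearly I don't understand the AD library well enough yet. Any help would be appreciated.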
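
For reference, here is a minimal sketch of one direction that might address both errors. It rests on assumptions, not on anything confirmed above: that this is ad 4.x, where grad differentiates over a single Traversable container f a, and that auto is exported from Numeric.AD.Mode.Reverse for lifting constants. Under that reading, grad applied to [Matrix Double] would pick f = [] and a = Matrix Double, which may be where the nested Matrix in the error comes from. The Matrices wrapper is a hypothetical name introduced only for this sketch, and mv is simplified relative to the code above.

```haskell
{-# LANGUAGE DeriveTraversable, DeriveFunctor, DeriveFoldable #-}
import Numeric.AD.Mode.Reverse (grad, auto)

-- Same matrix type as in the question.
data Matrix a = M [[a]] deriving (Eq, Show, Functor, Foldable, Traversable)

-- Hypothetical wrapper (not in the original code): it makes "a list of
-- matrices of a" one Traversable container whose elements are the scalars.
newtype Matrices a = Matrices [Matrix a]
  deriving (Show, Functor, Foldable, Traversable)

logistic :: Floating a => a -> a
logistic x = 1 / (1 + exp (negate x))

-- Keep the first component, apply logistic to the rest.
phi' :: Floating a => [a] -> [a]
phi' []     = []
phi' (x:xs) = x : map logistic xs

-- Action of a matrix on a vector (simplified version of mv).
mv :: Num a => Matrix a -> [a] -> [a]
mv (M rows) v = [ sum (zipWith (*) row v) | row <- rows ]

sc2 :: Floating a => [a] -> Matrices a -> a
sc2 v (Matrices [m1, m2]) = sum (phi' (mv m2 (phi' (mv m1 v))))
sc2 _ _ = error "sc2 expects exactly two matrices"

main :: IO ()
main = do
  let xin = [3, 5, 7] :: [Double]   -- stands in for input read at runtime
      ws  = Matrices [ M [[1,0,0],[-1,5,1],[1,2,-3]]
                     , M [[0,0,0],[1,3,-1],[-2,4,6]] ]
  -- map auto lifts the concrete [Double] into the AD number type, so the
  -- function passed to grad stays polymorphic in the tape parameter s.
  print (grad (\w -> sc2 (map auto xin) w) ws)
```

If that assumption about auto holds, the same map auto xin trick should also apply to the sc1 case, where xin is a concrete [Double] read from stdin rather than a polymorphic literal like [3,5,7].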
