haskell - 我重写的 foldl 函数优化了吗？

Question

两天前我刚开始使用 Haskell，所以我还不确定如何优化我的代码。

作为练习，我重写了foldland foldr(我会foldl在这里给出，但是foldr是一样的，用lastandhead替换init) tail。

代码是：

module Main where

    myFoldl :: ( a -> ( b -> a ) ) -> a -> ( [b] -> a )

    myFoldl func = ( \x -> (\theList
        -> if (length theList == 0) then
            x
        else
            myFoldl func (func x (last theList) ) (init theList)
        ) )

我唯一担心的是我怀疑 Haskell 不能在这里应用尾调用优化，因为递归调用不是在函数末尾进行的。

我怎样才能优化这个尾调用？Haskell 的内置foldl实现与我的实现方式不同吗？

score 26 · Accepted Answer

您在代码示例中使用括号以及对尾递归的强调表明您是从 Lisp 或 Scheme 来到 Haskell。如果您从像 Scheme 这样的热切语言来到 Haskell，请注意：尾调用在 Haskell 中的性能预测不如热切语言中的预测。您可以拥有由于惰性而在线性空间中执行的尾递归函数，并且您可以拥有由于惰性而在恒定空间中执行的非尾递归函数。（已经糊涂了？）

您定义中的第一个缺陷是使用length theList == 0. 这会强制评估列表的整个脊椎，并且是 O(n) 时间。最好使用模式匹配，就像foldl在 Haskell 中这个朴素的定义中一样：

foldl :: (b -> a -> b) -> b -> [a] -> b
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs

然而，这在 Haskell 中表现得很糟糕，因为在调用者要求结果之前，我们实际上并不计算该f z x部分；foldl因此，这foldl会为每个列表元素在内存中累积未评估的 thunk，并且不会从尾递归中获得任何好处。事实上，尽管是尾递归的，这种foldl对长列表的幼稚可能会导致堆栈溢出！（该Data.List模块有一个foldl'没有这个问题的功能。）

与此相反，许多 Haskell 非尾递归函数执行得非常好。例如find，根据的非尾递归定义，采用的定义foldr：

find :: (a -> Boolean) -> [a] -> Maybe a
find pred xs = foldr find' Nothing xs
    where find' elem rest = if pred elem then Just elem else rest

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z [] = z
foldr f z (x:xs) = f x (subfold xs)
    where subfold = foldr f z

由于惰性，这实际上是在线性时间和恒定空间中执行的。为什么？

一旦找到满足谓词的元素，就无需遍历列表的其余部分来计算的值rest。
一旦您查看一个元素并确定它不匹配，就无需保留有关该元素的任何数据。

我现在要传授的教训是：不要将你对热切语言的性能假设引入 Haskell。你才两天；首先专注于理解语言的语法和语义，不要强迫自己编写优化版本的函数。一开始你会foldl时不时地受到 -style 堆栈溢出的打击，但你会及时掌握它。

编辑 [2012 年 9 月 5 日]：更简单的演示，find尽管不是尾递归，但惰性在恒定空间中运行。首先，简化定义：

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)

find :: (a -> Bool) -> [a] -> Maybe a
find p xs = let step x rest = if p x then Just x else rest
            in foldr step Nothing xs

现在，使用等式推理（即，根据上面的定义，用 equals 代替 equals），并以惰性顺序（最外层优先）进行评估，让我们计算find (==400) [1..]：

find (==400) [1..]
    -- Definition of `find`:
    => let step x rest = if x == 400 then Just x else rest
       in foldr step Nothing [1..]

    -- `[x, y, ...]` is the same as `x:[y, ...]`:
    => let step x rest = if x == 400 then Just x else rest
       in foldr step Nothing (1:[2..])

    -- Using the second equation in the definition of `foldr`:
    => let step x rest = if x == 400 then Just x else rest
       in step 1 (foldr step Nothing [2..])

    -- Applying `step`:
    => let step x rest = if x == 400 then Just x else rest
       in if 1 == 400 then Just 1 else foldr step Nothing [2..]

    -- `1 == 400` is `False`
    => let step x rest = if x == 400 then Just x else rest
       in if False then Just 1 else foldr step Nothing [2..]

    -- `if False then a else b` is the same as `b`
    => let step x rest = if x == 400 then Just x else rest
       in foldr step Nothing [2..]

    -- Repeat the same reasoning steps as above 
    => let step x rest = if x == 400 then Just x else rest
       in foldr step Nothing (2:[3..])
    => let step x rest = if x == 400 then Just x else rest
       in step 2 (foldr step Nothing [3..])
    => let step x rest = if x == 400 then Just x else rest
       in if 2 == 400 then Just 2 else foldr step Nothing [3..]
    => let step x rest = if x == 400 then Just x else rest
       in if False then Just 2 else foldr step Nothing [3..]
    => let step x rest = if x == 400 then Just x else rest
       in foldr step Nothing [3..]
        .
        .
        .

    => let step x rest = if x == 400 then Just x else rest
       in foldr step Nothing [400..]
    => let step x rest = if x == 400 then Just x else rest
       in foldr step Nothing (400:[401..])
    => let step x rest = if x == 400 then Just x else rest
       in step 400 (foldr step Nothing [401..])
    => let step x rest = if x == 400 then Just x else rest
       in if 400 == 400 then Just 400 else foldr step Nothing [401..]
    => let step x rest = if x == 400 then Just x else rest
       in if True then Just 400 else foldr step Nothing [401..]

    -- `if True then a else b` is the same as `a`
    => let step x rest = if x == 400 then Just x else rest
       in Just 400

    -- We can eliminate the `let ... in ...` here:
    => Just 400

请注意，随着我们继续遍历列表，连续评估步骤中的表达式不会变得越来越复杂或更长；第n步表达式的长度或深度与n不成比例，它基本上是固定的。这实际上演示了如何在恒定空间中懒惰地执行。find (==400) [1..]

score 14 · Accepted Answer

惯用的 Haskell 看起来与此非常不同，它避开了 if-then-else、嵌套的 lambda、括号和解构函数（头、尾）。相反，你会这样写：

foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f z0 xs0 = go z0 xs0
   where
       go z []     = z
       go z (x:xs) = go (f z x) xs

而是依赖于模式匹配、where 子句、尾递归、受保护的声明。

haskell - 我重写的 foldl 函数优化了吗？

2 回答 2

Related

Reference