loops - OCaml 中退出循环的惯用异常

Question

在 OCaml 中，可以通过引发异常提前退出命令式循环。

虽然在 OCaml 中使用命令式循环本身并不是惯用的，但我想知道用早期退出来模拟命令式循环的最惯用方法是什么（如果可能的话，考虑到性能等方面）。

例如，一个旧的 OCaml 常见问题解答提到异常Exit：

Exit: 用于跳出循环或函数。

它仍然是最新的吗？标准库只是将其作为通用异常提及：

任何库Exit函数都不会引发异常。它是为在您的程序中使用而提供的。

相关地，this answer to another question提到使用预先计算的let exit = Exit异常来避免循环内的分配。还需要吗？

此外，有时人们想以特定值退出循环，例如raise (Leave 42). 是否有惯用的例外或命名约定来执行此操作？在这种情况下我应该使用引用（例如let res = ref -1 in ... <loop body> ... res := 42; raise Exit）吗？

最后，使用Exitin 嵌套循环可以防止某些情况下想要退出多个循环，例如break <label>在 Java 中。这将需要定义具有不同名称的异常，或者至少使用一个整数来指示应该退出多少范围（例如Leave 2，指示应该退出 2 个级别）。同样，这里是否有一种惯用的方法/异常命名？

score 5 · Accepted Answer

正如最初在评论中发布的那样，在 OCaml 中提前退出的惯用方法是使用延续。在您希望提前返回的地方，您创建一个延续，并将其传递给可能提前返回的代码。这比循环的标签更通用，因为您几乎可以从任何可以访问延续的东西中退出。

此外，正如评论中所发布的，请注意raise_notrace您永远不希望运行时生成其跟踪的异常的用法。

“天真”的第一次尝试：

module Continuation :
sig
  (* This is the flaw with this approach: there is no good choice for
     the result type. *)
  type 'a cont = 'a -> unit

  (* with_early_exit f passes a function "k" to f. If f calls k,
     execution resumes as if with_early_exit completed
     immediately. *)
  val with_early_exit : ('a cont -> 'a) -> 'a
end =

struct
  type 'a cont = 'a -> unit

  (* Early return is implemented by throwing an exception. The ref
     cell is used to store the value with which the continuation is
     called - this is a way to avoid having to generate an exception
     type that can store 'a for each 'a this module is used with. The
     integer is supposed to be a unique identifier for distinguishing
     returns to different nested contexts. *)
  type 'a context = 'a option ref * int64
  exception Unwind of int64

  let make_cont ((cell, id) : 'a context) =
    fun result -> cell := Some result; raise_notrace (Unwind id)

  let generate_id =
    let last_id = ref 0L in
    fun () -> last_id := Int64.add !last_id 1L; !last_id

  let with_early_exit f =
    let id = generate_id () in
    let cell = ref None in
    let cont : 'a cont = make_cont (cell, id) in
    try
      f cont
    with Unwind i when i = id ->
      match !cell with
      | Some result -> result
        (* This should never happen... *)
      | None        -> failwith "with_early_exit"
end



let _ =
  let nested_function i k = k 15; i in

  Continuation.with_early_exit (nested_function 42)
  |> string_of_int
  |> print_endline

可以看到，上面通过隐藏异常实现了提前退出。延续实际上是一个部分应用的函数，它知道创建它的上下文的唯一 id，并且有一个引用单元格来存储结果值，同时将异常抛出到该上下文。上面的代码打印 15。您可以根据需要传递延续k。您还可以f在传递给的位置立即定义函数with_early_exit，从而产生类似于在循环上添加标签的效果。我经常使用这个。

'a cont上面的问题是我任意设置的结果类型unit。实际上，一个类型的函数'a cont永远不会返回，所以我们希望它表现得像raise——在任何需要类型的地方都可以使用。但是，这不会立即起作用。如果您执行类似的操作type ('a, 'b) cont = 'a -> 'b，并将其传递给您的嵌套函数，类型检查器将'b在一个上下文中推断出一个类型，然后强制您仅在具有相同类型的上下文中调用延续，即您将无法做类似的事情

(if ... then 3 else k 15)
...
(if ... then "s" else k 16)

因为第一个表达式强制'b存在int，但第二个表达式需要'b存在string。

为了解决这个问题，我们需要提供一个类似于raise提前返回的函数，即

(if ... then 3 else throw k 15)
...
(if ... then "s" else throw k 16)

这意味着远离纯粹的延续。我们必须在make_cont上面取消部分应用（我将其重命名为throw），并改为传递裸上下文：

module BetterContinuation :
sig
  type 'a context

  val throw : 'a context -> 'a -> _
  val with_early_exit : ('a context -> 'a) -> 'a
end =

struct
  type 'a context = 'a option ref * int64
  exception Unwind of int64

  let throw ((cell, id) : 'a context) =
    fun result -> cell := Some result; raise_notrace (Unwind id)

  let generate_id = (* Same *)

  let with_early_exit f =
    let id = generate_id () in
    let cell = ref None in
    let context = (cell, id) in
    try
      f context
    with Unwind i when i = id ->
      match !cell with
      | Some result -> result
      | None        -> failwith "with_early_exit"
end



let _ =
  let nested_function i k = ignore (BetterContinuation.throw k 15); i in

  BetterContinuation.with_early_exit (nested_function 42)
  |> string_of_int
  |> print_endline

该表达式throw k v可用于需要不同类型的上下文中。

我在我从事的一些大型应用程序中普遍使用这种方法。我什至更喜欢它而不是常规例外。我有一个更复杂的变体，其中with_early_exit的签名大致如下：

val with_early_exit : ('a context -> 'b) -> ('a -> 'b) -> 'b

其中第一个函数表示尝试做某事，第二个函数'a表示可能导致的类型错误的处理程序。与变体和多态变体一起，这为异常处理提供了更明确的类型。它对多态变体特别强大，因为编译器可以推断出错误变体集。

Jane Street 方法的效果与此处描述的相同，事实上，我之前有一个使用一流模块生成异常类型的实现。我不知道为什么我最终选择了这个——可能会有细微的差别:)

score 3 · Accepted Answer

只是为了回答我的问题的特定部分，而其他答案中没有提到：

...使用预先计算的 let exit = Exit 异常来避免循环内的分配。还需要吗？

Core_bench我使用on做了一些微基准测试4.02.1+fp，结果表明没有显着差异：当比较两个相同的循环时，一个包含exit在循环之前声明的本地，另一个没有它，时间差异很小。

raise Exit在这个例子中，和之间的差异raise_notrace Exit也很小，在某些运行中约为 2%，在其他运行中高达 7%，但它很可能在如此短的实验的误差范围内。

总体而言，我无法衡量任何明显的差异，因此除非有人有Exit/exit显着影响性能的示例，否则我更喜欢前者，因为它更清晰并避免创建几乎无用的变量。

最后，我还比较了两种习惯用法的区别：在退出循环之前使用对值的引用，或者创建包含返回值的特定异常类型。

参考结果值 + Exit：

 let res = ref 0 in
 let r =
   try
     for i = 0 to n-1 do
       if a.(i) = v then
        (res := v; raise_notrace Exit)
     done;
     assert false
   with Exit -> !res
 in ...

具有特定的异常类型：

 exception Res of int
 let r =
   try
     for i = 0 to n-1 do
       if a.(i) = v then
         raise_notrace (Res v)
     done;
     assert false
   with Res v -> v
 in ...

再一次，差异很小，并且在运行之间变化很大。总体而言，第一个版本（参考 + Exit）似乎有一点优势（快 0% 到 10%），但差异不足以推荐一个版本而不是另一个版本。

由于前者需要定义一个初始值（可能不存在）或使用选项类型来初始化引用，而后者需要为循环返回的每种类型的值定义一个新的异常，所以这里没有理想的解决方案。

score 1 · Accepted Answer

Exit没关系（我不确定我是否可以说它是惯用的）。raise_notrace但是，如果您使用的是最新的编译器（自 4.02 起），请确保您使用的是 .

更好的解决方案是使用with_returnOCaml Core library。它不会有任何范围问题，因为它会为每个嵌套创建一个全新的异常类型。

当然，你也可以达到同样的效果，或者只取Core的实现源码。

更惯用的是，不要使用异常来缩短迭代，并考虑使用现有算法（find、find_map、exists等），或者如果没有适合您的算法，则只编写递归函数。

score 1 · Accepted Answer

关于点

使用预先计算的 let exit = Exit 异常来避免循环内的分配。还需要吗？

对于最新版本的 OCaml ，答案是否定的。这是 OCaml 4.02.0 的变更日志的相关摘录。

PR#6203：常量异常构造函数不再分配 (Alain Frisch)

这是 PR6203：http ://caml.inria.fr/mantis/view.php?id=6203

loops - OCaml 中退出循环的惯用异常

4 回答 4

Related

Reference