c# - What is the difference between these two LINQ implementations?

Question

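I am reading Jon Skeet's Reimplementing LINQ to Objects series. In the implementation from the Where article I found the following snippets, but I don't understand what is gained by splitting the original method into two.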

Original method:

// Naive validation - broken! 
public static IEnumerable<TSource> Where<TSource>( 
    this IEnumerable<TSource> source, 
    Func<TSource, bool> predicate) 
{ 
    if (source == null) 
    { 
        throw new ArgumentNullException("source"); 
    } 
    if (predicate == null) 
    { 
        throw new ArgumentNullException("predicate"); 
    } 
    foreach (TSource item in source) 
    { 
        if (predicate(item)) 
        { 
            yield return item; 
        } 
    } 
}

Refactored method:

public static IEnumerable<TSource> Where<TSource>( 
    this IEnumerable<TSource> source, 
    Func<TSource, bool> predicate) 
{ 
    if (source == null) 
    { 
        throw new ArgumentNullException("source"); 
    } 
    if (predicate == null) 
    { 
        throw new ArgumentNullException("predicate"); 
    } 
    return WhereImpl(source, predicate); 
} 

private static IEnumerable<TSource> WhereImpl<TSource>( 
    this IEnumerable<TSource> source, 
    Func<TSource, bool> predicate) 
{ 
    foreach (TSource item in source) 
    { 
        if (predicate(item)) 
        { 
            yield return item; 
        } 
    } 
}

Jon says the split is there to validate eagerly and then defer the rest. But I don't get it.

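Could someone explain in more detail what the difference between these two functions is, and why the validation is performed eagerly in one but not in the other?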

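Conclusion/Solution:

I was confused because I did not understand what makes a method an iterator generator. I had assumed it was determined by the method's signature, such as a return type of IEnumerable<T>. From the answers, I now understand that a method is an iterator generator if it uses a yield statement.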

score 5 · Accepted Answer

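The broken code is a single method that is actually an iterator generator. This means that initially it just returns a state machine without doing anything. Only when the calling code calls MoveNext (probably as part of a for-each loop) does it execute everything from the start of the method up to the first yield return.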
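A minimal sketch of that behavior, assuming a hypothetical iterator method named Numbers (it is not part of the question's code). The Console.WriteLine calls only make it visible when the method body actually runs:

```csharp
using System;
using System.Collections.Generic;

static class IteratorDemo
{
    // Hypothetical iterator method: it contains yield return, so the compiler
    // turns its body into a state machine.
    static IEnumerable<int> Numbers()
    {
        Console.WriteLine("body started");
        yield return 1;
        Console.WriteLine("after first yield");
        yield return 2;
    }

    static void Main()
    {
        IEnumerable<int> seq = Numbers();            // the body has not run yet
        Console.WriteLine("got sequence");

        using (IEnumerator<int> e = seq.GetEnumerator())
        {
            Console.WriteLine("got enumerator");     // the body still has not run
            e.MoveNext();                            // prints "body started"
            Console.WriteLine(e.Current);            // prints 1
            e.MoveNext();                            // prints "after first yield"
            Console.WriteLine(e.Current);            // prints 2
        }
    }
}
```

Nothing inside Numbers executes until the first MoveNext call, which is why argument checks placed in an iterator method are deferred as well.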

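With the correct code, Where is not an iterator generator, so it executes everything immediately, like any ordinary method. Only WhereImpl is an iterator generator. The validation therefore runs immediately, while the code in WhereImpl, up to and including the first yield return, is deferred.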

So if you have something like:

IEnumerable<int> evens = list.Where(null); // Correct code gives error here.
foreach(int i in evens) // Broken code gives it here.

The broken version won't give you an error until you start iterating.
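The difference can be reproduced in a small console program. This is only a sketch: WhereBroken, WhereFixed and WhereFixedImpl are hypothetical names, written as ordinary int-only static methods rather than generic extension methods so the example stays short and does not clash with System.Linq.

```csharp
using System;
using System.Collections.Generic;

static class ValidationDemo
{
    // Iterator method: the null check is part of the deferred body.
    static IEnumerable<int> WhereBroken(IEnumerable<int> source, Func<int, bool> predicate)
    {
        if (predicate == null) throw new ArgumentNullException("predicate");
        foreach (int item in source)
        {
            if (predicate(item)) yield return item;
        }
    }

    // Ordinary method: the null check runs as soon as it is called.
    static IEnumerable<int> WhereFixed(IEnumerable<int> source, Func<int, bool> predicate)
    {
        if (predicate == null) throw new ArgumentNullException("predicate");
        return WhereFixedImpl(source, predicate);
    }

    // Only this part is an iterator, so only this part is deferred.
    static IEnumerable<int> WhereFixedImpl(IEnumerable<int> source, Func<int, bool> predicate)
    {
        foreach (int item in source)
        {
            if (predicate(item)) yield return item;
        }
    }

    static void Main()
    {
        var list = new List<int> { 1, 2, 3, 4 };

        IEnumerable<int> broken = WhereBroken(list, null);   // no exception here
        Console.WriteLine("Broken version returned without throwing");
        try
        {
            foreach (int i in broken) { }                     // exception surfaces here
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("Broken version threw during iteration");
        }

        try
        {
            WhereFixed(list, null);                           // exception surfaces here
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("Fixed version threw at the call site");
        }
    }
}
```

With the broken version the exception can surface far from the faulty call, for example in code that merely enumerates a sequence it received from somewhere else, which makes the bug harder to trace.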

score 2 · Accepted Answer

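I think Jon explains it well in his article, but the explanation relies on you understanding how the compiler generates code when a method contains a yield statement. Basically, the compiler generates an iterator that is not invoked until an item from the iteration is needed (deferred execution). The initial method contains both the argument-checking code and the iteration code. The compiler bundles all of it into the iterator and, remember, the iterator is not invoked until the first item is needed. This means the validation does not happen until you try to access an item in the enumerable.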

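Splitting it into two methods, one containing the validation and the other containing the iterator block, ensures that the validation code runs when the iterator is constructed rather than when it is executed. That is because the only code bundled into the iterator is the code in the second method; it is the only code whose execution is deferred. The validation code executes when you create the iterator.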
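To make the "bundled into the iterator" idea concrete, here is a rough hand-written stand-in for what the single-method version turns into. It is an illustration only: BrokenWhereIterator is a hypothetical class, and the real compiler-generated state machine is structured differently (for example, it can hand out a fresh enumerator for each GetEnumerator call). The point is simply that the constructor stores the arguments while the checks live inside MoveNext.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for a compiler-generated iterator.
sealed class BrokenWhereIterator<T> : IEnumerable<T>, IEnumerator<T>
{
    private readonly IEnumerable<T> source;
    private readonly Func<T, bool> predicate;
    private IEnumerator<T> inner;
    private bool started;

    public BrokenWhereIterator(IEnumerable<T> source, Func<T, bool> predicate)
    {
        // Construction only captures the arguments; nothing is validated here.
        this.source = source;
        this.predicate = predicate;
    }

    public T Current { get; private set; }
    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        if (!started)
        {
            // The original method body starts running only now.
            if (source == null) throw new ArgumentNullException("source");
            if (predicate == null) throw new ArgumentNullException("predicate");
            inner = source.GetEnumerator();
            started = true;
        }
        while (inner.MoveNext())
        {
            if (predicate(inner.Current))
            {
                Current = inner.Current;
                return true;
            }
        }
        return false;
    }

    // Simplification: this sketch supports a single enumeration only.
    public IEnumerator<T> GetEnumerator() => this;
    IEnumerator IEnumerable.GetEnumerator() => this;
    public void Reset() => throw new NotSupportedException();
    public void Dispose() => inner?.Dispose();
}

static class Program
{
    static void Main()
    {
        // No exception: constructing the iterator runs no validation.
        IEnumerable<int> seq = new BrokenWhereIterator<int>(new[] { 1, 2, 3 }, null);
        try
        {
            foreach (int i in seq) { }
        }
        catch (ArgumentNullException e)
        {
            Console.WriteLine("thrown during iteration: " + e.ParamName);
        }
    }
}
```

In the two-method version, the validation sits in an ordinary method that runs before any such object is handed back to the caller, so a null argument fails immediately.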
