c# - Performance of Skip (and similar functions, like Take)

Question

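I just had a look at the source code of the Skip/Take extension methods of the .NET Framework (on the IEnumerable<T> type) and found that the internal implementation works with the GetEnumerator method: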

// .NET framework
    public static IEnumerable<TSource> Skip<TSource>(this IEnumerable<TSource> source, int count)  
    {
        if (source == null) throw Error.ArgumentNull("source"); 
        return SkipIterator<TSource>(source, count); 
    }

    static IEnumerable<TSource> SkipIterator<TSource>(IEnumerable<TSource> source, int count) 
    {
        using (IEnumerator<TSource> e = source.GetEnumerator()) 
        {
            while (count > 0 && e.MoveNext()) count--;
            if (count <= 0) 
            { 
                while (e.MoveNext()) yield return e.Current;
            } 
        } 
    }

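Suppose I have an IEnumerable<T> with 1000 elements (the underlying type is List<T>). What happens if I do list.Skip(990).Take(10)? Will it iterate through the first 990 elements before taking the last 10? (That is how I understand it.) If so, I don't understand why Microsoft didn't implement the Skip method like this: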

    // Not tested... just to show the idea
    public static IEnumerable<T> Skip<T>(this IEnumerable<T> source, int count)
    {
        if (source is IList<T>)
        {
            IList<T> list = (IList<T>)source;
            // Clamp at 0 so a negative count skips nothing, like the framework version
            for (int i = Math.Max(count, 0); i < list.Count; i++)
            {
                yield return list[i];
            }
        }
        else if (source is IList)
        {
            IList list = (IList)source;
            // Clamp at 0 so a negative count skips nothing, like the framework version
            for (int i = Math.Max(count, 0); i < list.Count; i++)
            {
                yield return (T)list[i];
            }
        }
        else
        {
            // .NET framework
            using (IEnumerator<T> e = source.GetEnumerator())
            {
                while (count > 0 && e.MoveNext()) count--;
                if (count <= 0)
                {
                    while (e.MoveNext()) yield return e.Current;
                }
            }
        }
    }

In fact, they do exactly that for the Count method, for example:

    // .NET Framework...
    public static int Count<TSource>(this IEnumerable<TSource> source) 
    {
        if (source == null) throw Error.ArgumentNull("source");

        ICollection<TSource> collectionoft = source as ICollection<TSource>; 
        if (collectionoft != null) return collectionoft.Count;

        ICollection collection = source as ICollection; 
        if (collection != null) return collection.Count; 

        int count = 0;
        using (IEnumerator<TSource> e = source.GetEnumerator())
        { 
            checked 
            {
                while (e.MoveNext()) count++;
            }
        } 
        return count;
    }

So what's the reason?
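One way to check the assumption in the question is to count how many elements the source is asked to produce. The following is a minimal sketch, not part of the original post: `SkipCostDemo`, `CountedRange`, and `ProducedForLastTen` are hypothetical names, and the source is deliberately a lazily generated sequence rather than a List<T>, so the count reflects plain enumeration through GetEnumerator/MoveNext.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipCostDemo
{
    // Hypothetical helper: yields 0..total-1 and invokes the callback
    // each time an element is produced.
    public static IEnumerable<int> CountedRange(int total, Action onProduced)
    {
        for (int i = 0; i < total; i++)
        {
            onProduced();
            yield return i;
        }
    }

    // Returns how many source elements were produced in order to
    // satisfy Skip(990).Take(10), or -1 if the page is not 10 items long.
    public static int ProducedForLastTen()
    {
        int produced = 0;
        List<int> page = CountedRange(1000, () => produced++)
            .Skip(990)
            .Take(10)
            .ToList();
        return page.Count == 10 ? produced : -1;
    }
}
```

For a plain enumerable like this one, the 990 skipped elements are still produced by the source before the ten results appear, which matches the reading of SkipIterator given in the question.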

score 17 · Accepted Answer

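In Jon Skeet's excellent tutorial re-implementing LINQ, he discusses (briefly) that very question:

"Although most of these operations can't be sensibly optimized, it would make sense to optimize Skip when the source implements IList. We can skip the skipping, so to speak, and go straight to the appropriate index. This wouldn't spot the case where the source was modified between iterations, which may be one reason it's not implemented in the framework as far as I'm aware."

That seems like a reasonable reason to hold off on the optimization, but I agree that in specific cases, if you can guarantee that your source can't/won't be modified, the optimization may be worthwhile.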
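To make "wouldn't spot the case where the source was modified between iterations" concrete, here is a small sketch (the class and method names are hypothetical, chosen only for illustration). It contrasts walking a List<T> through its enumerator with walking it by index while the list is modified during the loop.

```csharp
using System;
using System.Collections.Generic;

public static class ModificationDemo
{
    // Enumerator-based walk: List<T>'s enumerator notices that the list
    // changed and throws InvalidOperationException on the next MoveNext.
    public static bool EnumeratorDetectsChange()
    {
        var list = new List<int> { 1, 2, 3 };
        try
        {
            foreach (int item in list)
            {
                if (item == 1) list.Add(4); // modify during enumeration
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Index-based walk: the same modification goes unnoticed and the
    // loop simply carries on over the changed list.
    public static int IndexedWalkCount()
    {
        var list = new List<int> { 1, 2, 3 };
        int visited = 0;
        for (int i = 0; i < list.Count; i++)
        {
            if (list[i] == 1) list.Add(4); // modify during the loop
            visited++;
        }
        return visited;
    }
}
```

The enumerator-based walk fails fast, while the indexed walk silently visits four elements instead of three. An index-based Skip would behave like the second method.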

score 3 · Accepted Answer

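As ledbutter mentioned, when Jon Skeet reimplemented LINQ, he noted that an optimization like your Skip "wouldn't spot the case where the source was modified between iterations". You can change your code to the following to have it check for that case. It does so by calling MoveNext() on the collection's enumerator, even though it never uses e.Current, so that the method will throw if the collection changes.

Granted, this gives up a significant part of the optimization: the enumerator still has to be created, partially stepped through, and disposed. But you keep the benefit of not pointlessly stepping through the first count objects. It may also be confusing to have an e.Current that is of no use, since it points to list[i - count] rather than list[i].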

public static IEnumerable<T> Skip<T>(this IEnumerable<T> source, int count)
{
    using (IEnumerator<T> e = source.GetEnumerator())
    {
        if (source is IList<T>)
        {
            IList<T> list = (IList<T>)source;
            // Clamp at 0 so a negative count skips nothing, like the framework version
            for (int i = Math.Max(count, 0); i < list.Count; i++)
            {
                e.MoveNext();
                yield return list[i];
            }
        }
        else if (source is IList)
        {
            IList list = (IList)source;
            // Clamp at 0 so a negative count skips nothing, like the framework version
            for (int i = Math.Max(count, 0); i < list.Count; i++)
            {
                e.MoveNext();
                yield return (T)list[i];
            }
        }
        else
        {
            // .NET framework
            while (count > 0 && e.MoveNext()) count--;
            if (count <= 0)
            {
                while (e.MoveNext()) yield return e.Current;
            }
        }
    }
}
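As a usage sketch, the following condenses the method above to its generic IList<T> branch plus the fallback, and shows the check firing. The names `CheckedSkipDemo`, `SkipChecked`, and `ThrowsWhenModified` are hypothetical (the rename avoids clashing with Enumerable.Skip), and the non-generic IList branch is left out for brevity.

```csharp
using System;
using System.Collections.Generic;

public static class CheckedSkipDemo
{
    // Hypothetical name; condensed from the method above.
    public static IEnumerable<T> SkipChecked<T>(this IEnumerable<T> source, int count)
    {
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            if (source is IList<T> list)
            {
                for (int i = Math.Max(count, 0); i < list.Count; i++)
                {
                    // Called only so the enumerator can detect a modified source;
                    // e.Current is never read.
                    e.MoveNext();
                    yield return list[i];
                }
            }
            else
            {
                while (count > 0 && e.MoveNext()) count--;
                if (count <= 0)
                {
                    while (e.MoveNext()) yield return e.Current;
                }
            }
        }
    }

    // Modifies the list while iterating over the skipped view and
    // reports whether InvalidOperationException was thrown.
    public static bool ThrowsWhenModified()
    {
        var list = new List<int> { 1, 2, 3, 4, 5 };
        try
        {
            foreach (int item in list.SkipChecked(2))
            {
                list.Add(item); // change the source between iterations
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

With a List<T> source, the first element is returned by index, and the Add call then makes the next e.MoveNext() throw, so the modification is reported instead of being silently iterated over.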

score 0 · Accepted Answer

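I assume they wanted to throw InvalidOperationException ("Collection was modified...") when the underlying collection is modified in the meantime by another thread. Your version doesn't do that, and it would produce terrible results.

This is standard practice that MSFT follows throughout the .NET Framework for all collections that are not thread-safe (though there are some exceptions).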
