141

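Can someone suggest a way to create batches of a certain size in LINQ?

Ideally, I want to be able to perform operations in chunks of some configurable size.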

21 Answers

145

You don't need to write any code. Use the MoreLINQ Batch method, which batches the source sequence into sized buckets (MoreLINQ is available as a NuGet package you can install):

int size = 10;
var batches = sequence.Batch(size);

It is implemented as:

public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(
                  this IEnumerable<TSource> source, int size)
{
    TSource[] bucket = null;
    var count = 0;

    foreach (var item in source)
    {
        if (bucket == null)
            bucket = new TSource[size];

        bucket[count++] = item;
        if (count != size)
            continue;

        yield return bucket;

        bucket = null;
        count = 0;
    }

    if (bucket != null && count > 0)
        yield return bucket.Take(count).ToArray();
}
Answered 2012-12-05T20:29:25.560
105
public static class MyExtensions
{
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> items,
                                                       int maxItems)
    {
        return items.Select((item, inx) => new { item, inx })
                    .GroupBy(x => x.inx / maxItems)
                    .Select(g => g.Select(x => x.item));
    }
}

Usage:

List<int> list = new List<int>() { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

foreach(var batch in list.Batch(3))
{
    Console.WriteLine(String.Join(",",batch));
}

Output:

0,1,2
3,4,5
6,7,8
9
Answered 2012-12-05T20:31:03.767
44

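If you start with sequence defined as an IEnumerable<T>, and you know that it can safely be enumerated multiple times (for example, because it is an array or a list), you can use this simple pattern to process the elements in batches: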

while (sequence.Any())
{
    var batch = sequence.Take(10);
    sequence = sequence.Skip(10);

    // do whatever you need to do with each batch here
}
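For reference, here is a minimal, self-contained sketch of the pattern above applied to a list. The class name SkipTakeBatchDemo, the helper BatchSizes and the sample data are assumptions made for illustration; they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipTakeBatchDemo
{
    // Hypothetical helper: walks `sequence` in batches of `size`
    // and returns the number of elements seen in each batch.
    public static List<int> BatchSizes(IEnumerable<int> sequence, int size)
    {
        var sizes = new List<int>();
        while (sequence.Any())
        {
            // Materialize the batch so it is enumerated exactly once here.
            List<int> batch = sequence.Take(size).ToList();
            sequence = sequence.Skip(size);

            // "Do whatever you need to do with each batch": here we only record its size.
            sizes.Add(batch.Count);
        }
        return sizes;
    }
}
```

Calling SkipTakeBatchDemo.BatchSizes(Enumerable.Range(1, 23).ToList(), 10) returns 10, 10, 3. Note that every pass stacks another Skip on top of the previous ones, so this is only reasonable for in-memory collections.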
Answered 2016-12-21T21:23:00.363
31

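Here is a fully lazy, low-overhead, single-function implementation of Batch that does no accumulation. It is based on Nick Whaley's solution (and fixes issues in it), with help from EricRoller.

Iteration comes directly from the underlying IEnumerable, so elements must be enumerated in strict order and accessed no more than once. If some elements are not consumed in the inner loop, they are discarded (and trying to access them again through a saved iterator throws InvalidOperationException: Enumeration already finished.).

You can test a complete sample at .NET Fiddle.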

public static class BatchLinq
{
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size)
    {
        if (size <= 0)
            throw new ArgumentOutOfRangeException("size", "Must be greater than zero.");
        using (var enumerator = source.GetEnumerator())
            while (enumerator.MoveNext())
            {
                int i = 0;
                // Batch is a local function closing over `i` and `enumerator` that
                // executes the inner batch enumeration
                IEnumerable<T> Batch()
                {
                    do yield return enumerator.Current;
                    while (++i < size && enumerator.MoveNext());
                }

                yield return Batch();
                while (++i < size && enumerator.MoveNext()); // discard skipped items
            }
    }
}
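The discard behaviour described above can be seen in a small sketch. It repeats the same logic under the hypothetical names LazyBatchSketch and LazyBatch (renamed only so it does not clash with the method above), and the FirstOfEachBatch helper is an assumption added for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyBatchSketch
{
    // Same shape as the Batch method above, under a different name.
    public static IEnumerable<IEnumerable<T>> LazyBatch<T>(this IEnumerable<T> source, int size)
    {
        if (size <= 0)
            throw new ArgumentOutOfRangeException(nameof(size), "Must be greater than zero.");
        using (var enumerator = source.GetEnumerator())
            while (enumerator.MoveNext())
            {
                int i = 0;
                IEnumerable<T> Inner()
                {
                    do yield return enumerator.Current;
                    while (++i < size && enumerator.MoveNext());
                }

                yield return Inner();
                // Skip whatever the caller did not consume from this batch.
                while (++i < size && enumerator.MoveNext()) { }
            }
    }

    // Hypothetical demo: reads only the first element of every batch.
    public static List<int> FirstOfEachBatch(IEnumerable<int> source, int size)
    {
        var firsts = new List<int>();
        foreach (var batch in source.LazyBatch(size))
            firsts.Add(batch.First()); // the rest of the batch is never read
        return firsts;
    }
}
```

LazyBatchSketch.FirstOfEachBatch(Enumerable.Range(0, 9), 3) returns 0, 3, 6: the unread elements 1, 2, 4, 5, 7 and 8 are skipped by the outer loop rather than leaking into the next batch.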
Answered 2017-06-12T17:26:34.270
30

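All of the above perform terribly with large batches or low memory space. I had to write my own that pipelines (notice that there is no item accumulation anywhere):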

public static class BatchLinq {
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size) {
        if (size <= 0)
            throw new ArgumentOutOfRangeException("size", "Must be greater than zero.");

        using (IEnumerator<T> enumerator = source.GetEnumerator())
            while (enumerator.MoveNext())
                yield return TakeIEnumerator(enumerator, size);
    }

    private static IEnumerable<T> TakeIEnumerator<T>(IEnumerator<T> source, int size) {
        int i = 0;
        do
            yield return source.Current;
        while (++i < size && source.MoveNext());
    }
}

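Edit: A known issue with this approach is that each batch must be enumerated, and enumerated fully, before moving on to the next batch. For example, this doesn't work: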

//Select first item of every 100 items
Batch(list, 100).Select(b => b.First())
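To make the failure mode concrete, the following sketch repeats the same logic under the hypothetical names PipelinedBatchSketch, PipelinedBatch and TakeFromEnumerator, with a smaller batch size so the result is easy to read. The two demo helpers are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PipelinedBatchSketch
{
    // Same logic as the Batch/TakeIEnumerator pair above, renamed for the demo.
    public static IEnumerable<IEnumerable<T>> PipelinedBatch<T>(this IEnumerable<T> source, int size)
    {
        if (size <= 0)
            throw new ArgumentOutOfRangeException(nameof(size), "Must be greater than zero.");

        using (IEnumerator<T> enumerator = source.GetEnumerator())
            while (enumerator.MoveNext())
                yield return TakeFromEnumerator(enumerator, size);
    }

    private static IEnumerable<T> TakeFromEnumerator<T>(IEnumerator<T> source, int size)
    {
        int i = 0;
        do
            yield return source.Current;
        while (++i < size && source.MoveNext());
    }

    // Hypothetical demo: reads only the first item of each batch.
    public static List<int> FirstOfEachBatch(IEnumerable<int> source, int size)
    {
        return source.PipelinedBatch(size).Select(b => b.First()).ToList();
    }

    // Hypothetical demo: fully consumes each batch before asking for the next.
    public static List<int> SumOfEachBatch(IEnumerable<int> source, int size)
    {
        return source.PipelinedBatch(size).Select(b => b.Sum()).ToList();
    }
}
```

With Enumerable.Range(0, 9) and a size of 3, SumOfEachBatch returns 3, 12, 21 as expected, but FirstOfEachBatch returns all nine values 0 through 8 instead of 0, 3, 6, because nothing advances the shared enumerator past the unread items of a batch.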
Answered 2013-07-11T16:36:54.253
25

.NET 6.0 added an Enumerable.Chunk() extension method.

Example:

var list = new List<int> { 1, 2, 3, 4, 5, 6, 7 };

var chunks = list.Chunk(3);
// returns { { 1, 2, 3 }, { 4, 5, 6 }, { 7 } }

For those who cannot upgrade, the source is available on GitHub.

Answered 2021-06-21T00:34:08.747
13

I wonder why nobody has ever posted an old-school for-loop solution. Here is one:

List<int> source = Enumerable.Range(1,23).ToList();
int batchsize = 10;
for (int i = 0; i < source.Count; i+= batchsize)
{
    var batch = source.Skip(i).Take(batchsize);
}

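This simplicity is possible because of the Take method:

... enumerates source and yields elements until count elements have been yielded or source contains no more elements. If count exceeds the number of elements in source, all elements of source are returned.

Disclaimer:

Using Skip and Take inside the loop means that the enumerable will be enumerated multiple times. This is dangerous if the enumerable is deferred: it may result in multiple executions of a database query, a web request, or a file read. This example is explicitly for use with a List, which is not deferred, so it is less of a problem. It is still a slow solution, since Skip enumerates the collection each time it is called.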
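The repeated enumeration can be observed directly. The sketch below is only an illustration: DeferredSourceDemo, CountEnumerations and the local Source function are hypothetical names, and the local iterator stands in for a deferred source such as a database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSourceDemo
{
    // Returns how many times a deferred source is enumerated from the start
    // when it is batched with Skip/Take inside a loop.
    public static int CountEnumerations(int totalItems, int batchSize)
    {
        int enumerations = 0;

        // Stand-in for a deferred source such as a query or a file read.
        IEnumerable<int> Source()
        {
            enumerations++; // runs each time an enumeration starts
            for (int i = 0; i < totalItems; i++)
                yield return i;
        }

        IEnumerable<int> source = Source();
        for (int i = 0; i < totalItems; i += batchSize)
        {
            // Each batch restarts the source and walks past the first i items.
            List<int> batch = source.Skip(i).Take(batchSize).ToList();
        }
        return enumerations;
    }
}
```

DeferredSourceDemo.CountEnumerations(23, 10) returns 3: the source is started once per batch, which is exactly the cost the disclaimer warns about.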

This can also be solved with the GetRange method, but it requires an extra calculation to extract a possible remaining batch:

for (int i = 0; i < source.Count; i += batchsize)
{
    int remaining = source.Count - i;
    var batch = remaining > batchsize  ? source.GetRange(i, batchsize) : source.GetRange(i, remaining);
}

Here is a third way to handle this, which works with 2 loops. This ensures that the collection is enumerated only once:

int batchsize = 10;
List<int> batch = new List<int>(batchsize);

for (int i = 0; i < source.Count; i += batchsize)
{
    // Size of this batch; the last one may be smaller, which avoids
    // an ArgumentOutOfRangeException when indexing past the end.
    int currentSize = Math.Min(batchsize, source.Count - i);
    for (int j = i; j < i + currentSize; j++)
    {
        batch.Add(source[j]);
    }
    // process the batch here, before it is cleared for reuse
    batch.Clear();
}
Answered 2019-06-27T08:37:49.870
4

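Here is an attempted improvement of Nick Whaley's ( link ) and infogulch's ( link ) lazy Batch implementations. This one is strict: you either enumerate the batches in the correct order, or you get an exception.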

public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(
    this IEnumerable<TSource> source, int size)
{
    if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
    using (var enumerator = source.GetEnumerator())
    {
        int i = 0;
        while (enumerator.MoveNext())
        {
            if (i % size != 0) throw new InvalidOperationException(
                "The enumeration is out of order.");
            i++;
            yield return GetBatch();
        }
        IEnumerable<TSource> GetBatch()
        {
            while (true)
            {
                yield return enumerator.Current;
                if (i % size == 0 || !enumerator.MoveNext()) break;
                i++;
            }
        }
    }
}

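And here is a lazy Batch implementation for sources of type IList<T>. This one imposes no restrictions on the enumeration: the batches can be enumerated partially, in any order, and more than once. The restriction of not modifying the collection during the enumeration still applies, though. This is achieved by making a dummy call to enumerator.MoveNext() before yielding any chunk or element. The downside is that the enumerator is left undisposed, since it is unknown when the enumeration is going to finish.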

public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(
    this IList<TSource> source, int size)
{
    if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
    var enumerator = source.GetEnumerator();
    for (int i = 0; i < source.Count; i += size)
    {
        enumerator.MoveNext();
        yield return GetChunk(i, Math.Min(i + size, source.Count));
    }
    IEnumerable<TSource> GetChunk(int from, int toExclusive)
    {
        for (int j = from; j < toExclusive; j++)
        {
            enumerator.MoveNext();
            yield return source[j];
        }
    }
}
Answered 2019-07-26T16:48:56.133
3

Same approach as MoreLINQ, but using List instead of Array. I haven't done any benchmarking, but readability matters more to some people:

    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size)
    {
        List<T> batch = new List<T>();

        foreach (var item in source)
        {
            batch.Add(item);

            if (batch.Count >= size)
            {
                yield return batch;
                batch = new List<T>(size); // start a fresh list; Clear() would also empty the batch already handed to the caller
            }
        }

        if (batch.Count > 0)
        {
            yield return batch;
        }
    }
Answered 2016-05-04T19:04:05.333
3

Here is the cleanest version of Batch that I can come up with:

public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int count)
{
    if (source == null) throw new System.ArgumentNullException("source");
    if (count <= 0) throw new System.ArgumentOutOfRangeException("count");
    using (var enumerator = source.GetEnumerator())
    {
        IEnumerable<T> BatchInner()
        {
            int counter = 0;
            do
                yield return enumerator.Current;
            while (++counter < count && enumerator.MoveNext());
        }
        while (enumerator.MoveNext())
            yield return BatchInner().ToArray();
    }
}

With this code:

Console.WriteLine(String.Join(Environment.NewLine,
    Enumerable.Range(0, 20).Batch(8).Select(xs => String.Join(",", xs))));

I get:

0,1,2,3,4,5,6,7
8,9,10,11,12,13,14,15
16,17,18,19

It is important to note that with the answers from “” and “”, this code fails:

var e = Enumerable.Range(0, 20).Batch(8).ToArray();

Console.WriteLine(String.Join(Environment.NewLine, e.Select(xs => String.Join(",", xs))));
Console.WriteLine();
Console.WriteLine(String.Join(Environment.NewLine, e.Select(xs => String.Join(",", xs))));

With their answers it gives:

19
19
19

19
19
19

This is because the inner enumerable is not materialized as an array.

Answered 2021-08-24T06:42:36.323
2

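So with a functional hat on, this appears trivial... but in C# there are some significant downsides.

You would probably view this as an unfold of IEnumerable (google it and you will probably end up in some Haskell docs, but there may be some F# material using unfold; if you know F#, squint at the Haskell docs and it will make sense).

Unfold is related to fold ("aggregate"), except that rather than iterating through the input IEnumerable, it iterates through the output data structure (it is a similar relationship to the one between IEnumerable and IObservable; in fact I think IObservable does implement an "unfold" called generate...).

Anyway, first you need an unfold method. I think this works (unfortunately it will eventually blow the stack for large "lists"... you can write this safely in F# using yield! rather than concat):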

    static IEnumerable<T> Unfold<T, U>(Func<U, IEnumerable<Tuple<U, T>>> f, U seed)
    {
        var maybeNewSeedAndElement = f(seed);

        return maybeNewSeedAndElement.SelectMany(x => new[] { x.Item2 }.Concat(Unfold(f, x.Item1)));
    }

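This is a bit obtuse because C# doesn't implement some of the things functional languages take for granted... but it basically takes a seed and then generates a "Maybe" answer of the next element in the IEnumerable and the next seed (Maybe doesn't exist in C#, so we use IEnumerable to fake it), and concatenates the rest of the answer (I can't vouch for the "O(n?)" complexity of this).

Once you have done that: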

    static IEnumerable<IEnumerable<T>> Batch<T>(IEnumerable<T> xs, int n)
    {
        return Unfold(ys =>
            {
                var head = ys.Take(n);
                var tail = ys.Skip(n);
                return head.Take(1).Select(_ => Tuple.Create(tail, head));
            },
            xs);
    }

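It all looks quite clean... you take "n" elements as the "next" element of the IEnumerable, and the "tail" is the rest of the unprocessed list.

If there is nothing in the head, you are done: you return "Nothing" (faked as an empty IEnumerable). Otherwise you return the head element and the tail to process.

You can probably do this with IObservable; there is probably a "Batch"-like method already there that you could use.

If the risk of stack overflows worries you (it probably should), then you should implement this in F# (and there is probably some F# library (FSharpX?) that already has it).

(I have only done some rudimentary tests of this, so there may be the odd bug in there.)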

Answered 2018-04-16T10:12:10.210
1

I am joining this very late, but I found something more interesting.

We can use Skip and Take here for better performance.

public static class MyExtensions
    {
        public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> items, int maxItems)
        {
            return items.Select((item, index) => new { item, index })
                        .GroupBy(x => x.index / maxItems)
                        .Select(g => g.Select(x => x.item));
        }

        public static IEnumerable<T> Batch2<T>(this IEnumerable<T> items, int skip, int take)
        {
            return items.Skip(skip).Take(take);
        }

    }

Next I checked this with 100000 records. Only the looping takes more time in the case of Batch.

Code of the console application:

static void Main(string[] args)
{
    List<string> Ids = GetData("First");
    List<string> Ids2 = GetData("tsriF");

    Stopwatch FirstWatch = new Stopwatch();
    FirstWatch.Start();
    foreach (var batch in Ids2.Batch(5000))
    {
        // Console.WriteLine("Batch Output:= " + string.Join(",", batch));
    }
    FirstWatch.Stop();
    Console.WriteLine("Done Processing time taken:= "+ FirstWatch.Elapsed.ToString());


    Stopwatch Second = new Stopwatch();

    Second.Start();
    int Length = Ids2.Count;
    int StartIndex = 0;
    int BatchSize = 5000;
    while (Length > 0)
    {
        var SecBatch = Ids2.Batch2(StartIndex, BatchSize);
        // Console.WriteLine("Second Batch Output:= " + string.Join(",", SecBatch));
        Length = Length - BatchSize;
        StartIndex += BatchSize;
    }

    Second.Stop();
    Console.WriteLine("Done Processing time taken Second:= " + Second.Elapsed.ToString());
    Console.ReadKey();
}

static List<string> GetData(string name)
{
    List<string> Data = new List<string>();
    for (int i = 0; i < 100000; i++)
    {
        Data.Add(string.Format("{0} {1}", name, i.ToString()));
    }

    return Data;
}

The time taken is like this.

First - 00:00:00.0708, 00:00:00.0660

Second (the Take and Skip one) - 00:00:00.0008, 00:00:00.0008
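When reading these timings, it may help to keep in mind that Skip and Take are deferred. In the console application above the result of Batch2 is never enumerated (the line that would print it is commented out), whereas the foreach over Batch makes GroupBy read the whole list. The sketch below illustrates deferred execution with a hypothetical counting source; DeferredTimingDemo, Measure and Counting are names assumed for the example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredTimingDemo
{
    // Returns how many items were read from the source before and after
    // the Skip/Take query was enumerated, plus the size of the result.
    public static (int Before, int After, int PageSize) Measure()
    {
        int itemsRead = 0;

        // Hypothetical source that counts every item it hands out.
        IEnumerable<int> Counting()
        {
            for (int i = 0; i < 100; i++)
            {
                itemsRead++;
                yield return i;
            }
        }

        // Building the query does no work yet: Skip and Take are deferred.
        IEnumerable<int> page = Counting().Skip(10).Take(5);
        int before = itemsRead;

        // Only enumerating the result actually reads from the source.
        List<int> materialized = page.ToList();
        int after = itemsRead;

        return (before, after, materialized.Count);
    }
}
```

Measure() reports Before as 0 and PageSize as 5, with After at least 15. A fairer comparison would therefore enumerate each batch in both loops, for example by calling ToList() on it.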

Answered 2016-04-12T06:57:20.003
1

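I wrote a custom IEnumerable implementation that works without LINQ and guarantees a single enumeration over the data. It also accomplishes all this without requiring backing lists or arrays, which cause memory explosions on large data sets.

Here are some basic tests: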

    [Fact]
    public void ShouldPartition()
    {
        var ints = new List<int> {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        var data = ints.PartitionByMaxGroupSize(3);
        data.Count().Should().Be(4);

        data.Skip(0).First().Count().Should().Be(3);
        data.Skip(0).First().ToList()[0].Should().Be(0);
        data.Skip(0).First().ToList()[1].Should().Be(1);
        data.Skip(0).First().ToList()[2].Should().Be(2);

        data.Skip(1).First().Count().Should().Be(3);
        data.Skip(1).First().ToList()[0].Should().Be(3);
        data.Skip(1).First().ToList()[1].Should().Be(4);
        data.Skip(1).First().ToList()[2].Should().Be(5);

        data.Skip(2).First().Count().Should().Be(3);
        data.Skip(2).First().ToList()[0].Should().Be(6);
        data.Skip(2).First().ToList()[1].Should().Be(7);
        data.Skip(2).First().ToList()[2].Should().Be(8);

        data.Skip(3).First().Count().Should().Be(1);
        data.Skip(3).First().ToList()[0].Should().Be(9);
    }

The extension method that partitions the data:

/// <summary>
/// A set of extension methods for <see cref="IEnumerable{T}"/>. 
/// </summary>
public static class EnumerableExtender
{
    /// <summary>
    /// Splits an enumerable into chunks, by a maximum group size.
    /// </summary>
    /// <param name="source">The source to split</param>
    /// <param name="maxSize">The maximum number of items per group.</param>
    /// <typeparam name="T">The type of item to split</typeparam>
    /// <returns>A list of lists of the original items.</returns>
    public static IEnumerable<IEnumerable<T>> PartitionByMaxGroupSize<T>(this IEnumerable<T> source, int maxSize)
    {
        return new SplittingEnumerable<T>(source, maxSize);
    }
}

This is the implementing class:

    using System.Collections;
    using System.Collections.Generic;

    internal class SplittingEnumerable<T> : IEnumerable<IEnumerable<T>>
    {
        private readonly IEnumerable<T> backing;
        private readonly int maxSize;
        private bool hasCurrent;
        private T lastItem;

        public SplittingEnumerable(IEnumerable<T> backing, int maxSize)
        {
            this.backing = backing;
            this.maxSize = maxSize;
        }

        public IEnumerator<IEnumerable<T>> GetEnumerator()
        {
            return new Enumerator(this, this.backing.GetEnumerator());
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return this.GetEnumerator();
        }

        private class Enumerator : IEnumerator<IEnumerable<T>>
        {
            private readonly SplittingEnumerable<T> parent;
            private readonly IEnumerator<T> backingEnumerator;
            private NextEnumerable current;

            public Enumerator(SplittingEnumerable<T> parent, IEnumerator<T> backingEnumerator)
            {
                this.parent = parent;
                this.backingEnumerator = backingEnumerator;
                this.parent.hasCurrent = this.backingEnumerator.MoveNext();
                if (this.parent.hasCurrent)
                {
                    this.parent.lastItem = this.backingEnumerator.Current;
                }
            }

            public bool MoveNext()
            {
                if (this.current == null)
                {
                    this.current = new NextEnumerable(this.parent, this.backingEnumerator);
                    return true;
                }
                else
                {
                    if (!this.current.IsComplete)
                    {
                        using (var enumerator = this.current.GetEnumerator())
                        {
                            while (enumerator.MoveNext())
                            {
                            }
                        }
                    }
                }

                if (!this.parent.hasCurrent)
                {
                    return false;
                }

                this.current = new NextEnumerable(this.parent, this.backingEnumerator);
                return true;
            }

            public void Reset()
            {
                throw new System.NotImplementedException();
            }

            public IEnumerable<T> Current
            {
                get { return this.current; }
            }

            object IEnumerator.Current
            {
                get { return this.Current; }
            }

            public void Dispose()
            {
            }
        }

        private class NextEnumerable : IEnumerable<T>
        {
            private readonly SplittingEnumerable<T> splitter;
            private readonly IEnumerator<T> backingEnumerator;
            private int currentSize;

            public NextEnumerable(SplittingEnumerable<T> splitter, IEnumerator<T> backingEnumerator)
            {
                this.splitter = splitter;
                this.backingEnumerator = backingEnumerator;
            }

            public bool IsComplete { get; private set; }

            public IEnumerator<T> GetEnumerator()
            {
                return new NextEnumerator(this.splitter, this, this.backingEnumerator);
            }

            IEnumerator IEnumerable.GetEnumerator()
            {
                return this.GetEnumerator();
            }

            private class NextEnumerator : IEnumerator<T>
            {
                private readonly SplittingEnumerable<T> splitter;
                private readonly NextEnumerable parent;
                private readonly IEnumerator<T> enumerator;
                private T currentItem;

                public NextEnumerator(SplittingEnumerable<T> splitter, NextEnumerable parent, IEnumerator<T> enumerator)
                {
                    this.splitter = splitter;
                    this.parent = parent;
                    this.enumerator = enumerator;
                }

                public bool MoveNext()
                {
                    this.parent.currentSize += 1;
                    this.currentItem = this.splitter.lastItem;
                    var hasCcurent = this.splitter.hasCurrent;

                    this.parent.IsComplete = this.parent.currentSize > this.splitter.maxSize;

                    if (this.parent.IsComplete)
                    {
                        return false;
                    }

                    if (hasCcurent)
                    {
                        var result = this.enumerator.MoveNext();

                        this.splitter.lastItem = this.enumerator.Current;
                        this.splitter.hasCurrent = result;
                    }

                    return hasCcurent;
                }

                public void Reset()
                {
                    throw new System.NotImplementedException();
                }

                public T Current
                {
                    get { return this.currentItem; }
                }

                object IEnumerator.Current
                {
                    get { return this.Current; }
                }

                public void Dispose()
                {
                }
            }
        }
    }
Answered 2017-12-02T23:10:34.127
1

Another way is to use the Rx Buffer operator:

//using System.Linq;
//using System.Reactive.Linq;
//using System.Reactive.Threading.Tasks;

var observableBatches = anEnumerable.ToObservable().Buffer(size);

var batches = aList.ToObservable().Buffer(size).ToList().ToTask().GetAwaiter().GetResult();
Answered 2018-11-28T07:22:56.990
1

Just another one-line implementation. It works even with an empty list; in that case you get a collection containing zero batches.

var aList = Enumerable.Range(1, 100).ToList(); //a given list
var size = 9; //the wanted batch size
//number of batches are: (aList.Count() + size - 1) / size;

var batches = Enumerable.Range(0, (aList.Count() + size - 1) / size).Select(i => aList.GetRange( i * size, Math.Min(size, aList.Count() - i * size)));

Assert.True(batches.Count() == 12);
Assert.AreEqual(batches.ToList().ElementAt(0), new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9 });
Assert.AreEqual(batches.ToList().ElementAt(1), new List<int>() { 10, 11, 12, 13, 14, 15, 16, 17, 18 });
Assert.AreEqual(batches.ToList().ElementAt(11), new List<int>() { 100 });
Answered 2018-11-28T09:18:16.157
1

A version that is easy to use and understand:

    public static List<List<T>> chunkList<T>(List<T> listToChunk, int batchSize)
    {
        List<List<T>> batches = new List<List<T>>();

        if (listToChunk.Count == 0) return batches;

        bool moreRecords = true;
        int fromRecord = 0;
        int countRange = 0;
        if (listToChunk.Count >= batchSize)
        {
            countRange = batchSize;
        }
        else
        {
            countRange = listToChunk.Count;
        }

        while (moreRecords)
        {
            List<T> batch = listToChunk.GetRange(fromRecord, countRange);
            batches.Add(batch);

            if ((fromRecord + batchSize) >= listToChunk.Count)
            {
                moreRecords = false;
            }

            fromRecord = fromRecord + batch.Count;

            if ((fromRecord + batchSize) > listToChunk.Count)
            {
                countRange = listToChunk.Count - fromRecord;
            }
            else
            {
                countRange = batchSize;
            }
        }
        return batches;
    }
Answered 2020-10-28T06:19:20.603
1

With the new LINQ helper method in .NET 6, you can chunk any IEnumerable into batches:

int chunkNumber = 1;
foreach (int[] chunk in Enumerable.Range(0, 9).Chunk(3))
{
    Console.WriteLine($"Chunk {chunkNumber++}");
    foreach (var item in chunk)
    {
        Console.WriteLine(item);
    }
}
Answered 2021-08-24T04:48:16.790
1

Here is an implementation that uses async iteration in C# via IAsyncEnumerable - https://docs.microsoft.com/en-us/dotnet/csharp/whats-new/tutorials/generate-consume-asynchronous-stream

public static class EnumerableExtensions
{
    /// <summary>
    /// Chunks a sequence into a sub-sequences each containing maxItemsPerChunk, except for the last
    /// which will contain any items left over.
    ///
    /// NOTE: this implements a streaming implementation via <seealso cref="IAsyncEnumerable{T}"/>.
    /// </summary>
    public static async IAsyncEnumerable<IEnumerable<T>> ChunkAsync<T>(this IAsyncEnumerable<T> sequence, int maxItemsPerChunk)
    {
        if (sequence == null) throw new ArgumentNullException(nameof(sequence));
        if (maxItemsPerChunk <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxItemsPerChunk), $"{nameof(maxItemsPerChunk)} must be greater than 0");
        }

        var chunk = new List<T>(maxItemsPerChunk);
        await foreach (var item in sequence)
        {
            chunk.Add(item);

            if (chunk.Count == maxItemsPerChunk)
            {
                yield return chunk.ToArray();
                chunk.Clear();
            }
        }

        // return the "crumbs" that 
        // didn't make it into a full chunk
        if (chunk.Count > 0)
        {
            yield return chunk.ToArray();
        }
    }

    /// <summary>
    /// Chunks a sequence into a sub-sequences each containing maxItemsPerChunk, except for the last
    /// which will contain any items left over.
    /// </summary>
    public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> sequence, int maxItemsPerChunk)
    {
        if (sequence == null) throw new ArgumentNullException(nameof(sequence));
        if (maxItemsPerChunk <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxItemsPerChunk), $"{nameof(maxItemsPerChunk)} must be greater than 0");
        }

        var chunk = new List<T>(maxItemsPerChunk);
        foreach (var item in sequence)
        {
            chunk.Add(item);

            if (chunk.Count == maxItemsPerChunk)
            {
                yield return chunk.ToArray();
                chunk.Clear();
            }
        }

        // return the "crumbs" that 
        // didn't make it into a full chunk
        if (chunk.Count > 0)
        {
            yield return chunk.ToArray();
        }
    }
}
Answered 2021-10-20T18:26:03.427
0

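I know everybody has used complex systems to do this work, and I really don't get why. Take and Skip allow all of those operations using the common Select with a Func<TSource,Int32,TResult> transform function. Like: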

public IEnumerable<IEnumerable<T>> Buffer<T>(IEnumerable<T> source, int size)=>
    source.Select((item, index) => source.Skip(size * index).Take(size)).TakeWhile(bucket => bucket.Any());
Answered 2018-10-03T04:03:48.093
0

Another way to perform batching:

public static class Extensions
{
    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;

                yield return func(v0, v1);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;

                yield return func(v0, v1, v2);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;

                yield return func(v0, v1, v2, v3);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v11 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v11 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v12 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v11 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v12 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v13 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v11 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v12 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v13 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v14 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14);
            }
        }
    }

    public static IEnumerable<TOut> Batch<T, TOut>(this IEnumerable<T> source, Func<T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                bool state;

                state = enumerator.MoveNext(); if (!state) break; var v0 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v1 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v2 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v3 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v4 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v5 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v6 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v7 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v8 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v9 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v10 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v11 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v12 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v13 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v14 = enumerator.Current;
                state = enumerator.MoveNext(); if (!state) break; var v15 = enumerator.Current;

                yield return func(v0, v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13, v14, v15);
            }
        }
    }
}

Here is a sample usage:

using System;
using System.Linq;


namespace TestProgram
{
    class Program
    {
        static void Main(string[] args)
        {
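            // 12 elements taken 5 at a time: two full groups are printed, and the
            // incomplete trailing group (10, 11) is dropped by the overloads above.
            foreach (var item in Enumerable.Range(0, 12).ToArray().Batch((R, X1, Y1, X2, Y2) => (R, X1, Y1, X2, Y2)))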
            {
                Console.WriteLine($"{item.R}, {item.X1}, {item.Y1}, {item.X2}, {item.Y2}");
            }
        }
    }
}
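
One thing worth knowing about the overloads above: each loop breaks as soon as MoveNext() returns false, so a trailing group with fewer elements than the delegate expects is silently discarded. The following is only a minimal sketch of that behaviour, not part of the original answer. The class name BatchSketch and the 3-argument method Batch3 are hypothetical names made up for illustration, written in the same shape as the overloads above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchSketch
{
    // Hypothetical 3-argument variant following the same pattern as the
    // fixed-arity overloads: pull three items, then call func on them.
    public static IEnumerable<TOut> Batch3<T, TOut>(this IEnumerable<T> source, Func<T, T, T, TOut> func)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (true)
            {
                // If the source runs out part-way through a group,
                // the partial group is dropped without being yielded.
                if (!enumerator.MoveNext()) yield break;
                var v0 = enumerator.Current;
                if (!enumerator.MoveNext()) yield break;
                var v1 = enumerator.Current;
                if (!enumerator.MoveNext()) yield break;
                var v2 = enumerator.Current;

                yield return func(v0, v1, v2);
            }
        }
    }

    // Batches the numbers 0..7 in threes and formats each group as text.
    public static List<string> Demo()
    {
        return Enumerable.Range(0, 8)
                         .Batch3((a, b, c) => $"{a},{b},{c}")
                         .ToList();
    }
}
```

Calling BatchSketch.Demo() gives two strings, "0,1,2" and "3,4,5"; the leftover elements 6 and 7 never reach the delegate. If the remainder matters, a size-based Batch that returns a shorter final batch is probably a better fit.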
answered 2021-10-05T22:30:32.080
-3
static IEnumerable<IEnumerable<T>> TakeBatch<T>(IEnumerable<T> ts, int batchSize)
{
    return from @group in ts.Select((x, i) => new { x, i }).ToLookup(xi => xi.i / batchSize)
           select @group.Select(xi => xi.x);
}
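
A usage sketch for the method above, under the assumption that it lives in a static class (the name TakeBatchDemo and the Demo helper are hypothetical, added only so the snippet compiles on its own). Note that ToLookup is not deferred: it reads the whole source into memory at the moment TakeBatch is called, so this approach suits sequences that fit in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TakeBatchDemo
{
    // Same query as in the answer: pair each element with its index,
    // group by index / batchSize, then strip the index again.
    public static IEnumerable<IEnumerable<T>> TakeBatch<T>(IEnumerable<T> ts, int batchSize)
    {
        return from @group in ts.Select((x, i) => new { x, i }).ToLookup(xi => xi.i / batchSize)
               select @group.Select(xi => xi.x);
    }

    // Splits 0..9 into batches of three and joins each batch into a string.
    public static List<string> Demo()
    {
        return TakeBatch(Enumerable.Range(0, 10), 3)
            .Select(batch => string.Join(",", batch))
            .ToList();
    }
}
```

TakeBatchDemo.Demo() yields "0,1,2", "3,4,5", "6,7,8" and "9", so unlike the fixed-arity overloads the final short batch is kept.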
answered 2015-07-02T16:33:10.067