
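One of the advantages of LINQ is that it can lazily process an infinite data source on demand. I tried to parallelize my query and found that lazy loading stopped working. For example...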

using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var source = Generator();
        var next = source.AsParallel().Select(i => ExpensiveCall(i));
        foreach (var i in next)
        {
            System.Console.WriteLine(i);
        }
    }

    public static IEnumerable<int> Generator()
    {
        int i = 0;
        while (true)
        {
            yield return i;
            i++;
        }
    }

    public static int ExpensiveCall(int arg)
    {
        System.Threading.Thread.Sleep(5000);
        return arg*arg;
    }
}

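This program fails to produce any results, presumably because at each step it waits for all calls to the generator to dry up, which of course never happens. If I take out the "AsParallel" call, it works just fine. So how do I keep lazy loading while using PLINQ to improve the performance of my application?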


2 Answers


Take a look at MergeOptions:

 var next = source.AsParallel()
              .WithMergeOptions(ParallelMergeOptions.NotBuffered)
              .Select(i => ExpensiveCall(i));
answered 2013-02-04T04:28:44.013

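I think you're confusing two different things. The problem here is not lazy loading (i.e. loading only as much as is necessary); the problem here is output buffering (i.e. not returning results immediately).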

In your case, you will get your results eventually, although it might take a while (for me, it requires something like 500 results for it to return the first batch). The buffering is done for performance reasons, but in your case, that doesn't make sense. As Ian correctly pointed out, you should use .WithMergeOptions(ParallelMergeOptions.NotBuffered) to disable output buffering.

But, as far as I know, PLINQ doesn't do lazy loading and there is no way to change that. What that means is that if your consumer (in your case, the foreach loop) is too slow, PLINQ will generate results faster than necessary and it will stop only when you finish iterating the results. This means PLINQ can be wasting CPU time and memory.
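To make the point about stopping concrete, here is a minimal sketch, not a definitive implementation. It keeps the question's infinite Generator, turns off output buffering, and leaves the foreach once enough results have arrived. The class name LazyPlinqSketch, the helper TakeFirstResults, and the 50 ms delay are hypothetical choices made for illustration. It also assumes that leaving the loop disposes the query's enumerator and that PLINQ then cancels the remaining work, which is my understanding of its behavior rather than something stated above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class LazyPlinqSketch
{
    // Infinite source, as in the question.
    public static IEnumerable<int> Generator()
    {
        int i = 0;
        while (true)
        {
            yield return i;
            i++;
        }
    }

    // Stand-in for the expensive call; the 50 ms delay is an arbitrary choice.
    public static int ExpensiveCall(int arg)
    {
        Thread.Sleep(50);
        return arg * arg;
    }

    // Hypothetical helper: collects the first `wanted` results that arrive.
    // The query is unordered, so these are not necessarily the squares of 0..wanted-1.
    public static List<int> TakeFirstResults(int wanted)
    {
        var results = new List<int>();
        if (wanted <= 0)
        {
            return results;
        }

        var query = Generator()
            .AsParallel()
            .WithMergeOptions(ParallelMergeOptions.NotBuffered)
            .Select(i => ExpensiveCall(i));

        foreach (var value in query)
        {
            results.Add(value);
            if (results.Count >= wanted)
            {
                // Leaving the loop disposes the enumerator; the assumption here
                // is that PLINQ cancels the outstanding work at that point.
                break;
            }
        }

        return results;
    }
}
```

Calling LazyPlinqSketch.TakeFirstResults(5) from Main and printing the list should show five squares shortly after the first ExpensiveCall calls finish. The workers may still have pulled and processed more than five items from the generator by then, which is the wasted work described above.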

answered 2013-02-04T12:59:18.127