I have a question about LINQ and the Where statement. I have the following code sample (a simplified version of the code I use in my application):

// Get the images from a datasource.
var images = GetImages(); // returns IEnumerable<Image>

// I continue processing images until everything has been processed.
while (images.Any())
{
    // I'm going to determine what kind of image it is and do some actions with it.
    var image = images.First();

    // At some point in my process I apply a Where to the images collection to fetch all images that match the specified criteria.
    // As long as the images collection is not empty, the same Where may be applied to it again on a later iteration.
    // This is where the problem is: without the ToList() extension method, my LINQ statement becomes slow, really slow.
    // When I add the ToList() extension method, why does my LINQ statement run fast again?
    var saveImages = images.Where(<criteria>); //.ToList() is needed to make my LINQ query performant again.

    // I do something with these saved images, then remove them from the current images collection with the following statement, because they no longer need processing.
    images = images.Except(saveImages);
}

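As the code sample explains: why does my LINQ statement become fast again when I add the ToList() extension method? Why can't I just use the Where statement on its own, given that it returns an IEnumerable collection?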

I'm really confused and I hope someone can explain it to me :).

2 Answers

As you go through the loop, your images first becomes this:

images.Except(firstSetOfExclusions)

then this:

images.Except(firstSetOfExclusions).Except(secondSetOfExclusions)

then this:

images.Except(firstSetOfExclusions).Except(secondSetOfExclusions).Except(thirdSetOfExclusions)

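and so on. The reason it is slow is that, unless you call ToList, each set of exclusions has to run a new query every time it is needed. This gets slower with every iteration of the loop, because essentially the same queries are executed over and over again. ToList fixes this by "materializing" the query in memory.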
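To make the deferred execution concrete, here is a minimal sketch (not from the original post; the class name DeferredDemo and the calls counter are made up for illustration). It counts how often a Where predicate runs when the same query is enumerated twice, with and without ToList:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns how many times the predicate ran after enumerating the query twice.
    public static int CountPredicateCalls(bool materialize)
    {
        int calls = 0;
        int[] source = { 1, 2, 3, 4, 5 };

        // Nothing runs here: Where only describes the query.
        IEnumerable<int> query = source.Where(i => { calls++; return i > 2; });

        if (materialize)
        {
            // ToList runs the predicate once per element and stores the results.
            query = query.ToList();
        }

        foreach (var item in query) { } // first enumeration
        foreach (var item in query) { } // second enumeration

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountPredicateCalls(materialize: false)); // prints 10
        Console.WriteLine(CountPredicateCalls(materialize: true));  // prints 5
    }
}
```

Without ToList the predicate runs again for every enumeration; with ToList the second foreach only walks the stored list.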

Note that another solution to this problem is to "materialize" the new subset of images instead, like this:

images = images.Except(saveImages).ToList();

This avoids chaining the Except calls, so you would not have to call ToList on saveImages.
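As a rough check of that claim, the sketch below runs the question's loop over an iterator-based source and counts how often the source is enumerated from the start. The names MaterializeDemo, GetData and _sourceRuns are assumptions for illustration, and the i == image predicate is only a stand-in for the question's criteria:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeDemo
{
    private static int _sourceRuns;

    // Hypothetical data source; the counter increments each time enumeration starts.
    private static IEnumerable<int> GetData()
    {
        _sourceRuns++;
        for (int i = 1; i <= 5; i++) yield return i;
    }

    // Runs the loop from the question and returns how often GetData was enumerated.
    public static int Run(bool materialize)
    {
        _sourceRuns = 0;
        IEnumerable<int> images = GetData();

        while (images.Any())
        {
            int image = images.First();

            // Stand-in for the question's criteria.
            IEnumerable<int> saveImages = images.Where(i => i == image);

            images = images.Except(saveImages);
            if (materialize)
            {
                // Collapse the chain into a plain list before the next iteration.
                images = images.ToList();
            }
        }

        return _sourceRuns;
    }

    public static void Main()
    {
        Console.WriteLine(Run(materialize: false)); // prints 94
        Console.WriteLine(Run(materialize: true));  // prints 4
    }
}
```

In the materialized run the source is only touched during the first iteration (Any, First, and the two inputs of Except); every later iteration works on an in-memory list.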

answered 2013-05-08T10:29:36.207

Maybe it will make more sense if we re-implement LINQ-to-Objects so that the methods log when they run; here is our Main:

static void Main()
{
    Log();
    IEnumerable<int> data = GetData();

    while (data.Any())
    {
        var value = data.First();
        Console.WriteLine("\t\tFound:{0}", value);
        var found = data.Where(i => i == value);
        data = data.Except(found);
    }
}
static IEnumerable<int> GetData()
{
    Log();
    return new[] { 1, 2, 3, 4, 5 };
}

Looks innocent, yes? Now run it and log the output (the LINQ methods are shown at the bottom) - we get:

Main
GetData
Any
First
                Found:1
Any
Except
Where
First
Except
Where
                Found:2
Any
Except
Where
Except
Where
Except
Where
First
Except
Where
Except
Where
Except
Where
                Found:3
Any
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
First
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
                Found:4
Any
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
First
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
                Found:5
Any
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where
Except
Where

Notice how the complexity grows between each item?

For bonus points, make GetData an iterator block and see how many times GetData gets executed:

static IEnumerable<int> GetData()
{
    Log();
    yield return 1;
    yield return 2;
    yield return 3;
    yield return 4;
    yield return 5;
}

I make it 94 times (rather than once in the original version). Fun, eh?

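This isn't LINQ's fault - it is because you are using LINQ in a really odd way. For what you are doing, you would be better off with a flat collection (List<T>), adding and removing items as needed.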
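A minimal sketch of that flat-collection approach might look like the following. FlatListDemo and ProcessAll are hypothetical names, and the i == value predicate again stands in for whatever criteria the real application uses:

```csharp
using System;
using System.Collections.Generic;

public static class FlatListDemo
{
    // Processes the items group by group, removing each group from the list in place.
    // Returns the first item of each group, in processing order.
    public static List<int> ProcessAll(IEnumerable<int> source)
    {
        // Copy into a List<T> up front so the source is enumerated exactly once.
        var remaining = new List<int>(source);
        var processed = new List<int>();

        while (remaining.Count > 0)
        {
            int value = remaining[0];
            processed.Add(value);

            // Remove everything that matches, directly in the list;
            // no query chain builds up between iterations.
            remaining.RemoveAll(i => i == value);
        }

        return processed;
    }

    public static void Main()
    {
        var order = ProcessAll(new[] { 1, 2, 2, 3, 1 });
        Console.WriteLine(string.Join(",", order)); // prints 1,2,3
    }
}
```

Each pass costs one scan of the remaining items, so the work per iteration shrinks as the list shrinks instead of doubling.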

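Here is the LINQ (the extension methods have to sit in a static class, and the code needs using directives for System, System.Collections.Generic and System.Runtime.CompilerServices):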

public static bool Any<T>(this IEnumerable<T> data)
{
    Log();
    using (var iter = data.GetEnumerator())
    {
        return iter.MoveNext();
    }
}
static void Log([CallerMemberName] string name = null)
{
    Console.WriteLine(name);
}
public static T First<T>(this IEnumerable<T> data)
{
    Log();
    using (var iter = data.GetEnumerator())
    {
        if (iter.MoveNext()) return iter.Current;
        throw new InvalidOperationException();
    }
}
public static IEnumerable<T> Where<T>(this IEnumerable<T> data, Func<T,bool> predicate)
{
    Log();
    foreach (var item in data) if (predicate(item)) yield return item;
}
public static IEnumerable<T> Except<T>(this IEnumerable<T> data, IEnumerable<T> except)
{
    Log();
    var exclude = new HashSet<T>(except);
    foreach (var item in data)
    {
        if (!exclude.Contains(item)) yield return item;
    }
}
answered 2013-05-08T11:33:57.923