5

问题:我有 2 种对象,我们称它们BuildingImprovement. 大约有 30 个Improvement实例,而可能有 1-1000Building秒。对于 和 的每种组合BuildingImprovement我必须执行一些繁重的计算,并将结果存储在一个Result对象中。

Buildings 和s都Improvement可以用整数 ID 表示。

然后我需要能够:

  • 访问Result给定BuildingImprovement有效的(编辑:请参阅下面的评论)
  • Result对给定s 的所有Improvements执行聚合Building,例如 .Sum() 和 .Average()
  • 对给定Result的所有s在 s 上执行相同的聚合BuildingImprovement

这将发生在 Web 服务器后端,因此内存可能是一个问题,但速度是最重要的。

思念至今:

  1. 使用Dictionary<Tuple<int, int>, Result>with<BuildingID, ImprovementID>作为键。这应该给我快速的插入和单一的查找,但我担心.Where().Sum()性能。
  2. 使用二维数组,一维代表BuildingIDs,一维代表ImprovementIDs,以及Resultas 值。此外,构建两个Dictionary<int, int>BuildingIDs 和ImprovementIDs 映射到它们各自的数组行/列索引。这可能意味着最大 1000+Dictionary秒,这会是个问题吗?
  3. 使用List<Tuple<int, int, Result>>. 我认为这可能是效率最低的,使用 O(n) 插入,尽管我可能是错的。

我在这里错过了一个明显更好的选择吗?

编辑:原来它只是我感兴趣的聚合值(每Building和每);Improvement看我的回答。

4

3 回答 3

3

一般来说,字典是最有效的查找。当通过键访问时,查找效率和操作效率都是常数 O(1)。这将有助于访问,第一点。

在第二个和第三个中,您需要遍历所有项目 O(n),因此没有办法加快它,除非您想以指定的顺序遍历它们 O(n*n) - 然后您可以使用 SortedDictionray O (n),但您会损害查找和操作效率 (O(log n))。

所以我会选择你发布的第一个解决方案。

于 2013-06-03T07:39:01.377 回答
2

您可以使用“字典字典”来保存结果数据,例如:

//             Building ID ↓               ↓ Improvement ID
var data = new Dictionary<int, Dictionary<int, Result>>();

这将使您快速找到特定建筑物的改进。

但是,要找到包含特定改进的建筑物将需要遍历所有建筑物。这是一些示例代码:

using System;
using System.Linq;
using System.Collections.Generic;

namespace Demo
{
    sealed class Result
    {
        public double Data;
    }

    sealed class Building
    {
        public int Id;
        public int Value;
    }

    sealed class Improvement
    {
        public int Id;
        public int Value;
    }

    class Program
    {
        void run()
        {
            //             Building ID ↓               ↓ Improvement ID
            var data = new Dictionary<int, Dictionary<int, Result>>();

            for (int buildingKey = 1000; buildingKey < 2000; ++buildingKey)
            {
                var improvements = new Dictionary<int, Result>();

                for (int improvementKey = 5000; improvementKey < 5030; ++improvementKey)
                    improvements.Add(improvementKey, new Result{ Data = buildingKey + improvementKey/1000.0 });

                data.Add(buildingKey, improvements);
            }

            // Aggregate data for all improvements for building with ID == 1500:

            int buildingId = 1500;
            var sum = data[buildingId].Sum(result => result.Value.Data);
            Console.WriteLine(sum);

            // Aggregate data for all buildings with a given improvement.

            int improvementId = 5010;

            sum = data.Sum(improvements =>
            {
                Result result;
                return improvements.Value.TryGetValue(improvementId, out result) ? result.Data : 0.0;
            });

            Console.WriteLine(sum);
        }

        static void Main()
        {
            new Program().run();
        }
    }
}

为了加速第二次聚合(用于汇总具有给定 ID 的所有改进的数据),我们可以使用第二个字典:

//                      Improvment ID ↓               ↓ Building ID
var byImprovementId = new Dictionary<int, Dictionary<int, Result>>();

您将需要维护一个额外的字典,但这并不太复杂。拥有一些像这样的嵌套字典可能会占用太多内存 - 但值得考虑。

如以下评论中所述,最好为 ID 和字典本身定义类型。把它放在一起给出:

using System;
using System.Linq;
using System.Collections.Generic;

namespace Demo
{
    sealed class Result
    {
        public double Data;
    }

    sealed class BuildingId
    {
        public BuildingId(int id)
        {
            Id = id;
        }

        public readonly int Id;

        public override int GetHashCode()
        {
            return Id.GetHashCode();
        }

        public override bool Equals(object obj)
        {
            var other = obj as BuildingId;

            if (other == null)
                return false;

            return this.Id == other.Id;
        }
    }

    sealed class ImprovementId
    {
        public ImprovementId(int id)
        {
            Id = id;
        }

        public readonly int Id;

        public override int GetHashCode()
        {
            return Id.GetHashCode();
        }

        public override bool Equals(object obj)
        {
            var other = obj as ImprovementId;

            if (other == null)
                return false;

            return this.Id == other.Id;
        }
    }

    sealed class Building
    {
        public BuildingId Id;
        public int Value;
    }

    sealed class Improvement
    {
        public ImprovementId Id;
        public int Value;
    }

    sealed class BuildingResults : Dictionary<BuildingId, Result>{}

    sealed class ImprovementResults: Dictionary<ImprovementId, Result>{}

    sealed class BuildingsById: Dictionary<BuildingId, ImprovementResults>{}

    sealed class ImprovementsById: Dictionary<ImprovementId, BuildingResults>{}

    class Program
    {
        void run()
        {
            var byBuildingId    = CreateTestBuildingsById();            // Create some test data.
            var byImprovementId = CreateImprovementsById(byBuildingId); // Create the alternative lookup dictionaries.

            // Aggregate data for all improvements for building with ID == 1500:

            BuildingId buildingId = new BuildingId(1500);

            var sum = byBuildingId[buildingId].Sum(result => result.Value.Data);
            Console.WriteLine(sum);

            // Aggregate data for all buildings with a given improvement.

            ImprovementId improvementId = new ImprovementId(5010);

            sum = byBuildingId.Sum(improvements =>
            {
                Result result;
                return improvements.Value.TryGetValue(improvementId, out result) ? result.Data : 0.0;
            });

            Console.WriteLine(sum);

            // Aggregate data for all buildings with a given improvement using byImprovementId.
            // This will be much faster than the above Linq.

            sum = byImprovementId[improvementId].Sum(result => result.Value.Data);
            Console.WriteLine(sum);
        }

        static BuildingsById CreateTestBuildingsById()
        {
            var byBuildingId = new BuildingsById();

            for (int buildingKey = 1000; buildingKey < 2000; ++buildingKey)
            {
                var improvements = new ImprovementResults();

                for (int improvementKey = 5000; improvementKey < 5030; ++improvementKey)
                {
                    improvements.Add
                    (
                        new ImprovementId(improvementKey),
                        new Result
                        {
                            Data = buildingKey + improvementKey/1000.0
                        }
                    );
                }

                byBuildingId.Add(new BuildingId(buildingKey), improvements);
            }

            return byBuildingId;
        }

        static ImprovementsById CreateImprovementsById(BuildingsById byBuildingId)
        {
            var byImprovementId = new ImprovementsById();

            foreach (var improvements in byBuildingId)
            {
                foreach (var improvement in improvements.Value)
                {
                    if (!byImprovementId.ContainsKey(improvement.Key))
                        byImprovementId[improvement.Key] = new BuildingResults();

                    byImprovementId[improvement.Key].Add(improvements.Key, improvement.Value);
                }
            }

            return byImprovementId;
        }

        static void Main()
        {
            new Program().run();
        }
    }
}

最后,这是一个修改后的版本,它确定了为特定改进为构建/改进组合的所有实例聚合数据所需的时间,并将元组字典的结果与字典字典进行比较。

我的 RELEASE 构建结果在任何调试器之外运行:

Dictionary of dictionaries took 00:00:00.2967741
Dictionary of tuples took 00:00:07.8164672

使用字典要快得多,但这仅在您打算进行许多此类聚合时才重要。

using System;
using System.Diagnostics;
using System.Linq;
using System.Collections.Generic;

namespace Demo
{
    sealed class Result
    {
        public double Data;
    }

    sealed class BuildingId
    {
        public BuildingId(int id)
        {
            Id = id;
        }

        public readonly int Id;

        public override int GetHashCode()
        {
            return Id.GetHashCode();
        }

        public override bool Equals(object obj)
        {
            var other = obj as BuildingId;

            if (other == null)
                return false;

            return this.Id == other.Id;
        }
    }

    sealed class ImprovementId
    {
        public ImprovementId(int id)
        {
            Id = id;
        }

        public readonly int Id;

        public override int GetHashCode()
        {
            return Id.GetHashCode();
        }

        public override bool Equals(object obj)
        {
            var other = obj as ImprovementId;

            if (other == null)
                return false;

            return this.Id == other.Id;
        }
    }

    sealed class Building
    {
        public BuildingId Id;
        public int Value;
    }

    sealed class Improvement
    {
        public ImprovementId Id;
        public int Value;
    }

    sealed class BuildingResults : Dictionary<BuildingId, Result>{}

    sealed class ImprovementResults: Dictionary<ImprovementId, Result>{}

    sealed class BuildingsById: Dictionary<BuildingId, ImprovementResults>{}

    sealed class ImprovementsById: Dictionary<ImprovementId, BuildingResults>{}

    class Program
    {
        void run()
        {
            var byBuildingId    = CreateTestBuildingsById();            // Create some test data.
            var byImprovementId = CreateImprovementsById(byBuildingId); // Create the alternative lookup dictionaries.
            var testTuples      = CreateTestTuples();

            ImprovementId improvementId = new ImprovementId(5010);

            int count = 10000;

            Stopwatch sw = Stopwatch.StartNew();

            for (int i = 0; i < count; ++i)
                byImprovementId[improvementId].Sum(result => result.Value.Data);

            Console.WriteLine("Dictionary of dictionaries took " + sw.Elapsed);

            sw.Restart();

            for (int i = 0; i < count; ++i)
                testTuples.Where(result => result.Key.Item2.Equals(improvementId)).Sum(item => item.Value.Data);

            Console.WriteLine("Dictionary of tuples took " + sw.Elapsed);
        }

        static Dictionary<Tuple<BuildingId, ImprovementId>, Result> CreateTestTuples()
        {
            var result = new Dictionary<Tuple<BuildingId, ImprovementId>, Result>();

            for (int buildingKey = 1000; buildingKey < 2000; ++buildingKey)
                for (int improvementKey = 5000; improvementKey < 5030; ++improvementKey)
                    result.Add(
                        new Tuple<BuildingId, ImprovementId>(new BuildingId(buildingKey), new ImprovementId(improvementKey)),
                        new Result
                        {
                            Data = buildingKey + improvementKey/1000.0
                        });

            return result;
        }

        static BuildingsById CreateTestBuildingsById()
        {
            var byBuildingId = new BuildingsById();

            for (int buildingKey = 1000; buildingKey < 2000; ++buildingKey)
            {
                var improvements = new ImprovementResults();

                for (int improvementKey = 5000; improvementKey < 5030; ++improvementKey)
                {
                    improvements.Add
                    (
                        new ImprovementId(improvementKey),
                        new Result
                        {
                            Data = buildingKey + improvementKey/1000.0
                        }
                    );
                }

                byBuildingId.Add(new BuildingId(buildingKey), improvements);
            }

            return byBuildingId;
        }

        static ImprovementsById CreateImprovementsById(BuildingsById byBuildingId)
        {
            var byImprovementId = new ImprovementsById();

            foreach (var improvements in byBuildingId)
            {
                foreach (var improvement in improvements.Value)
                {
                    if (!byImprovementId.ContainsKey(improvement.Key))
                        byImprovementId[improvement.Key] = new BuildingResults();

                    byImprovementId[improvement.Key].Add(improvements.Key, improvement.Value);
                }
            }

            return byImprovementId;
        }

        static void Main()
        {
            new Program().run();
        }
    }
}
于 2013-06-03T08:59:41.520 回答
0

感谢您的回答,测试代码非常有用:)

对我来说,解决方案是放弃 LINQ,在繁重的计算之后直接手动执行聚合,因为无论如何我都必须迭代构建和改进的每个组合。

此外,我必须使用对象本身作为键,以便在对象被持久化到实体框架之前执行计算(即它们的 ID 都是 0)。

代码:

public class Building {
    public int ID { get; set; }
    ...
}

public class Improvement {
    public int ID { get; set; }
    ...
}

public class Result {
    public decimal Foo { get; set; }
    public long Bar { get; set; }
    ...

    public void Add(Result result) {
        Foo += result.Foo;
        Bar += result.Bar;
        ...
    }   
}

public class Calculator {
    public Dictionary<Building, Result> ResultsByBuilding;
    public Dictionary<Improvement, Result> ResultsByImprovement;

    public void CalculateAndAggregate(IEnumerable<Building> buildings, IEnumerable<Improvement> improvements) {
        ResultsByBuilding = new Dictionary<Building, Result>();
        ResultsByImprovement = new Dictionary<Improvement, Result>();
        for (building in buildings) {
            for (improvement in improvements) {
                Result result = DoHeavyCalculation(building, improvement);

                if (ResultsByBuilding.ContainsKey(building)) {
                    ResultsByBuilding[building].Add(result);
                } else {
                    ResultsByBuilding[building] = result;
                }

                if (ResultsByImprovement.ContainsKey(improvement)) {
                    ResultsByImprovement[improvement].Add(result);
                } else {
                    ResultsByImprovement[improvement] = result;
                }
            }
        }
    }
}

public static void Main() {
    var calculator = new Calculator();
    IList<Building> buildings = GetBuildingsFromRepository();
    IList<Improvement> improvements = GetImprovementsFromRepository();
    calculator.CalculateAndAggregate(buildings, improvements);
    DoStuffWithResults(calculator);
}

我这样做是因为我确切地知道我想要哪些聚合;如果我需要一种更动态的方法,我可能会选择@MatthewWatson 的字典之类的东西。

于 2013-06-03T23:43:47.773 回答