
We need to find the pairwise intersections of several sorted integer arrays. Here is an example:

Input:
1,3,7,8
2,3,8,10
3,10,11,12,13,14

minSupport = 1

Output:

1 and 2: 3, 8
1 and 3: 3
2 and 3: 3, 10
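
To make the expected result concrete, here is a minimal sketch that reproduces the pairwise output for the sample input with a plain two-pointer merge. The names `PairwiseExample`, `IntersectSorted` and `Report` are hypothetical and are not part of the code below, and it assumes `minSupport` means the minimum number of shared values a pair must have to be reported.

```csharp
using System;
using System.Collections.Generic;

static class PairwiseExample
{
    // Two-pointer intersection of two arrays sorted in ascending order.
    public static List<int> IntersectSorted(int[] a, int[] b)
    {
        var common = new List<int>();
        int i = 0, j = 0;
        while (i < a.Length && j < b.Length)
        {
            if (a[i] < b[j]) i++;
            else if (a[i] > b[j]) j++;
            else { common.Add(a[i]); i++; j++; }
        }
        return common;
    }

    // Builds one line per pair whose intersection has at least minSupport values.
    // Pairs are numbered from 1, as in the example.
    public static List<string> Report(int[][] arrays, int minSupport)
    {
        var lines = new List<string>();
        for (int x = 0; x < arrays.Length; x++)
        {
            for (int y = x + 1; y < arrays.Length; y++)
            {
                var common = IntersectSorted(arrays[x], arrays[y]);
                if (common.Count >= minSupport)
                    lines.Add($"{x + 1} and {y + 1}: {string.Join(", ", common)}");
            }
        }
        return lines;
    }
}
```

Calling `PairwiseExample.Report` on the three sample arrays with `minSupport = 1` yields one line per pair. This is only meant to pin down the expected result; it compares every pair directly and is not the fast approach shown below.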

I wrote an algorithm, and it runs fast.

    var minSupport = 2;
    var elementsCount = 10000;
    var random = new Random(123);

    // Numbers of each array are unique
    var sortedArrays = Enumerable.Range(0,elementsCount)
    .Select(x => Enumerable.Range(0,30).Select(t => random.Next(1000)).Distinct()
    .ToList()).ToList();
    var result = new List<int[]>();
    var resultIntersection = new List<List<int>>();


    foreach (var array in sortedArrays)
    {
        array.Sort();
    }



    var sw = Stopwatch.StartNew();

    //****MAIN PART*****//

    // The maximum value an array can contain is known in advance.
    // Of course, we could use a dictionary if we didn't know maxValue.
    var maxValue = 1000;

    var reverseIndexDict = new List<int>[maxValue];

    for (int i = 0; i < maxValue; i++)
    {
        reverseIndexDict[i] = new List<int>();
    }

    for (int i = 0; i < sortedArrays.Count; i++)
    {
        for (int j = 0; j < sortedArrays[i].Count; j++)
        {
            reverseIndexDict[sortedArrays[i][j]].Add(i);
        }
    }


    var resultMatrix = new List<int>[sortedArrays.Count,sortedArrays.Count];

    for (int i = 0; i < sortedArrays.Count; i++)
    {   
        for (int j = 0; j < sortedArrays[i].Count; j++)
        {
            var sortedArraysij = sortedArrays[i][j];

            for (int k = 0; k < reverseIndexDict[sortedArraysij].Count; k++)
            {
                if(resultMatrix[i,reverseIndexDict[sortedArraysij][k]]==null) resultMatrix[i,reverseIndexDict[sortedArraysij][k]] = new List<int>();

                resultMatrix[i,reverseIndexDict[sortedArraysij][k]].Add(sortedArraysij);    

            }
        }
    }


    //*****************//

    sw.Stop();

    Console.WriteLine(sw.Elapsed);

But when the element count exceeds 10000, my code fails with an OutOfMemoryException. How can I improve my algorithm, or what else can I do to solve this problem?


2 Answers


If you know the maximum integer the arrays can contain, you can do the following:

    var histoMatrix = new int[1000]; // values in the arrays are below 1000 here

    // Count how many arrays contain each value (values are unique within an array).
    for (int i = 0; i < sortedArrays.Count; i++)
    {
        for (int j = 0; j < sortedArrays[i].Count; j++)
        {
            var sortedArraysij = sortedArrays[i][j];

            histoMatrix[sortedArraysij]++;
        }
    }

    var resultMatrix = new List<int>();

    // A value that occurs in every array has a count equal to the number of arrays.
    for (int i = 0; i < 1000; i++)
    {
        if (histoMatrix[i] == sortedArrays.Count)
            resultMatrix.Add(i); // add the value itself, not its count
    }

In this case you don't even need to sort the arrays.
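
As a self-contained illustration of the counting idea, here is a minimal sketch. The class and method names (`CommonToAll`, `Find`) are hypothetical, and it assumes every value lies in `[0, maxValue)` and is unique within its array, as stated in the question. Note that it yields the values common to all arrays, not the per-pair intersections.

```csharp
using System.Collections.Generic;

static class CommonToAll
{
    // Returns the values that occur in every array.
    // Assumes: 0 <= value < maxValue, and no duplicates inside a single array.
    public static List<int> Find(List<List<int>> arrays, int maxValue)
    {
        // counts[v] = number of arrays that contain v
        var counts = new int[maxValue];
        foreach (var array in arrays)
        {
            foreach (var value in array)
            {
                counts[value]++;
            }
        }

        var common = new List<int>();
        for (int value = 0; value < maxValue; value++)
        {
            if (counts[value] == arrays.Count)
                common.Add(value);
        }
        return common;
    }
}
```

Memory use here is proportional to `maxValue` rather than to the square of the number of arrays, which is why it does not hit the same limit as an N×N result matrix.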

Hope this helps.

answered 2012-06-07T16:02:18.093

Use the LINQ Intersect method like this:

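    ...
    // Start from the first list; starting from an empty list would always give an empty result.
    IEnumerable<int> theDistinctListOfInts = theListsOfInts.First();
    foreach (var listOfInts in theListsOfInts.Skip(1))
    {
        theDistinctListOfInts = theDistinctListOfInts.Intersect(listOfInts);
    }
    ...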
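
For reference, a minimal self-contained sketch of the same idea. The wrapper names (`IntersectAll`, `Find`) are hypothetical; only `Enumerable.Intersect`, `Skip` and `ToList` come from LINQ. Like the loop above, it computes the values shared by all lists rather than the intersection of each pair.

```csharp
using System.Collections.Generic;
using System.Linq;

static class IntersectAll
{
    // Intersects all lists by folding Intersect over them, seeded with the first list.
    public static List<int> Find(List<List<int>> lists)
    {
        if (lists.Count == 0)
            return new List<int>();

        IEnumerable<int> common = lists[0];
        foreach (var list in lists.Skip(1))
        {
            // Intersect is deferred; the chain is evaluated once by ToList below.
            common = common.Intersect(list);
        }
        return common.ToList();
    }
}
```

`Intersect` builds a hash set internally, so this does not rely on the inputs being sorted.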
answered 2012-06-07T02:38:52.563