Possible duplicate:
Comparing difference between two lists

I have the following set of arrays

string[] arr1 = { "1155717", "5184305", "2531291", "1676341", "1916805" /* ... */ };
string[] arr2 = { "1155717", "1440230", "2531291", "8178626", "1916805" /* ... */ };
string[] arr3 = { "1155717", "5184305", "4025514", "1676341" /* ... */ };

The arrays hold millions of items, and the items can also contain non-digit characters. I want to create a report like this in CSV

diff.csv

arr1,arr2,arr3
1155717,1155717,1155717
5184305,--N/A--,5184305
--N/A--,1440230,--N/A--
--N/A--,--N/A--,4025514
1676341,--N/A--,1676341
--N/A--,8178626,--N/A--
1916805,1916805,--N/A--

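I don't think looping over each array and comparing item by item is a good approach. Any ideas?
I left out a few things:
1. Order does not matter.
2. The elements within a single list are unique.
3. I plan to skip loops as much as possible and am looking for newer .NET 3.5 / 4.0 features in LINQ/Generics that I can apply here!

To those who are downvoting or voting to close this question, could you please explain why?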

4 Answers

I made a small example with arrays of type int, but the same approach can be applied to strings

        int[] arr1 = { 1155717, 5184305, 2531291, 1676341, 1916805 } ;
        int[] arr2 = { 1155717, 1440230, 2531291, 8178626, 1916805 };
        int[] arr3 = { 1155717, 5184305, 4025514, 1676341 };

        foreach (int i in arr1)
        {
            Console.Write(i + "  ");
            foreach (int b in arr2)
            {
                if (i == b)
                    Console.Write(b + "  ");

            }
            foreach (int c in arr3)
            {
                if (i == c)
                    Console.Write(c + "  ");
            }
            Console.WriteLine();
        }
        Console.ReadLine();

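The only problem is that this uses loops inside a loop: every element of arr1 triggers a full scan of arr2 and arr3, so with large arrays performance will suffer. This is just a simple idea to get you thinking.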
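
One way to avoid the inner loops is to load arr2 and arr3 into a HashSet<int> first, so each membership test becomes a hash lookup instead of a scan. The following is only a minimal sketch under that assumption. The names HashSetCompare and DescribeRow are made up for illustration, and the --N/A-- placeholder is borrowed from the question.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetCompare
{
    // Hypothetical helper: formats one row for value i, using the
    // placeholder when i is absent from a set.
    public static string DescribeRow(int i, HashSet<int> set2, HashSet<int> set3)
    {
        const string missing = "--N/A--";
        string second = set2.Contains(i) ? i.ToString() : missing;
        string third = set3.Contains(i) ? i.ToString() : missing;
        return i + "  " + second + "  " + third;
    }

    public static void Main()
    {
        int[] arr1 = { 1155717, 5184305, 2531291, 1676341, 1916805 };
        int[] arr2 = { 1155717, 1440230, 2531291, 8178626, 1916805 };
        int[] arr3 = { 1155717, 5184305, 4025514, 1676341 };

        // Build the sets once; each Contains call is then a hash lookup.
        var set2 = new HashSet<int>(arr2);
        var set3 = new HashSet<int>(arr3);

        foreach (int i in arr1)
        {
            Console.WriteLine(DescribeRow(i, set2, set3));
        }
    }
}
```

Like the nested-loop version, this only walks arr1, so values that appear only in arr2 or arr3 are not listed.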

answered 2012-12-19T13:42:39.977

You can use this LINQ query and string.Join

string[][] all = new[] { arr1, arr2, arr3 };
int maxLength = all.Max(arr => arr.Length);
string separator = ",";
string defaultValue = "N/A";

var csvFields = all.Select(arr => Enumerable.Range(0, maxLength)
                   .Select(i => arr.Length <= i ? defaultValue : arr[i]));
string csv = string.Join(Environment.NewLine, 
                        csvFields.Select(f => string.Join(separator, f)));
File.WriteAllText(path, csv);  

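Demo

I put all the arrays into a jagged array. Then I use an int range as the starting point (0-4 in the sample, since the largest array has 5 elements). Then I take 5 elements from each array, falling back to the default value "N/A" when the array is too short to have that index.

The last step is to use string.Join to concatenate the parts of each array with the separator (",") and then to join the resulting lines with Environment.NewLine.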
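
To make the padding concrete, here is a small sketch that runs the same query on the sample arrays from the question and returns the text instead of writing a file. The names PaddingDemo and PadToCsv are hypothetical, and the string.Join overload that accepts an IEnumerable<string> is assumed to be available (.NET 4.0 or later).

```csharp
using System;
using System.Linq;

public static class PaddingDemo
{
    // Hypothetical helper: one CSV line per array, padded to the longest length.
    public static string PadToCsv(string[][] all, string separator, string defaultValue)
    {
        int maxLength = all.Max(arr => arr.Length);
        var csvFields = all.Select(arr => Enumerable.Range(0, maxLength)
                           .Select(i => arr.Length <= i ? defaultValue : arr[i]));
        return string.Join(Environment.NewLine,
                           csvFields.Select(f => string.Join(separator, f)));
    }

    public static void Main()
    {
        string[] arr1 = { "1155717", "5184305", "2531291", "1676341", "1916805" };
        string[] arr2 = { "1155717", "1440230", "2531291", "8178626", "1916805" };
        string[] arr3 = { "1155717", "5184305", "4025514", "1676341" };

        // The third line ends with N/A because arr3 has only four elements.
        Console.WriteLine(PadToCsv(new[] { arr1, arr2, arr3 }, ",", "N/A"));
    }
}
```

Note that this lines values up by position, one array per line. It does not match equal values across arrays.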

answered 2012-12-19T13:50:53.837

I would do something like this, in O(4N) or so, but maybe someone knows a faster way.

private void PrintDiff()
{
    // Key: the entry itself. Value: a Model recording which arrays contain it.
    var dictionary = new Dictionary<string, Model>();

    foreach (var entry in Array1)
    {
        dictionary.Add(entry, new Model().Add(entry, "Array1"));
    }
    foreach (var entry in Array2)
    {
        Model model;
        if (dictionary.TryGetValue(entry, out model))
            model.Add(entry, "Array2");   // already seen: record Array2 as well
        else
            dictionary.Add(entry, new Model().Add(entry, "Array2"));
    }
    foreach (var entry in Array3)
    {
        Model model;
        if (dictionary.TryGetValue(entry, out model))
            model.Add(entry, "Array3");   // already seen: record Array3 as well
        else
            dictionary.Add(entry, new Model().Add(entry, "Array3"));
    }

    //now print
    foreach (var pair in dictionary)
    {
        Console.WriteLine(pair.Value.ToString());
    }
}

public class Model
{
    public Model()
    {
        Dictionary = new Dictionary<string, string>();
    }

    // Key: array name. Value: the entry found in that array.
    private Dictionary<string, string> Dictionary
    {
        get;
        set;
    }

    public bool ContainsEntry(string entry)
    {
        return Dictionary.ContainsValue(entry);
    }

    // Returns this so the call can be chained inside dictionary.Add(...).
    public Model Add(string entry, string arrayName)
    {
        Dictionary.Add(arrayName, entry);
        return this;
    }

    public override string ToString()
    {
        return "FORMATTED AS YOU WANT THEM";
    }
}
answered 2012-12-19T14:00:01.817

You can use LINQ's GroupJoin:

string[] arr1 = { "1155717", "5184305", "2531291", "1676341", "1916805" };
string[] arr2 = { "1155717", "1440230", "2531291", "8178626", "1916805" };
string[] arr3 = { "1155717", "5184305", "4025514", "1676341" };

var allPossibleTerms = arr1.Union(arr2).Union(arr3);

var report = allPossibleTerms
    .GroupJoin(arr1, all => all, a1 => a1, (all, a1) => new { Number = all, A1 = a1 })
    .SelectMany(joined => joined.A1.DefaultIfEmpty(), (collection, result) => new { collection.Number, A1 = result})
    .GroupJoin(arr2, joined => joined.Number, a2 => a2, (collection, a2) => new { Number = collection.Number, A1 = collection.A1, A2 = a2 })
    .SelectMany(joined => joined.A2.DefaultIfEmpty(), (collection, result) => new { collection.Number, A1 = collection.A1, A2 = result})
    .GroupJoin(arr3, joined => joined.Number, a3 => a3, (collection, a3) => new { Number = collection.Number, A1 = collection.A1, A2 = collection.A2, A3 = a3 })
    .SelectMany(joined => joined.A3.DefaultIfEmpty(), (collection, result) => new { collection.Number, A1 = collection.A1, A2 = collection.A2, A3 = result});

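Basically, this creates a master list of all the terms and left-joins each array onto it in turn, so a term that is missing from an array comes back as null in that column.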

╔══════════════════════════════════════╗
║ Number   A1       A2       A3        ║
╠══════════════════════════════════════╣
║ 1155717  1155717  1155717  1155717   ║
║ 5184305  5184305  -------  5184305   ║
║ 2531291  2531291  2531291  -------   ║
║ 1676341  1676341  -------  1676341   ║
║ 1916805  1916805  1916805  -------   ║
║ 1440230  -------  1440230  -------   ║
║ 8178626  -------  8178626  -------   ║
║ 4025514  -------  -------  4025514   ║
╚══════════════════════════════════════╝
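
If only the CSV rows are needed, the same master-list idea can be sketched without the chained joins. The version below swaps the GroupJoin chain for one HashSet<string> per array and is only an illustration, not the query above. The names DiffReport and BuildRows are hypothetical, the --N/A-- placeholder comes from the question, and values are assumed not to contain commas.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DiffReport
{
    // Hypothetical helper: one CSV row per distinct value, one column per array.
    public static List<string> BuildRows(string[][] arrays, string missing)
    {
        // One set per array so membership checks do not rescan the array.
        var sets = arrays.Select(a => new HashSet<string>(a)).ToList();

        // Master list: every value that occurs in any array, once.
        var allTerms = arrays.SelectMany(a => a).Distinct();

        return allTerms
            .Select(term => string.Join(",",
                sets.Select(s => s.Contains(term) ? term : missing).ToArray()))
            .ToList();
    }

    public static void Main()
    {
        string[] arr1 = { "1155717", "5184305", "2531291", "1676341", "1916805" };
        string[] arr2 = { "1155717", "1440230", "2531291", "8178626", "1916805" };
        string[] arr3 = { "1155717", "5184305", "4025514", "1676341" };

        Console.WriteLine("arr1,arr2,arr3");
        foreach (string row in BuildRows(new[] { arr1, arr2, arr3 }, "--N/A--"))
        {
            Console.WriteLine(row);
        }
    }
}
```

Each value is looked up once per array, so the work grows with the total number of elements rather than with the product of the array sizes.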
answered 2012-12-19T14:02:13.377