c# - C# Remove duplicates from a list that contains lists

Question

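Suppose we have a list of "A students" and a list of "B students". We then add both lists to a more general list called "students". Then someone decides to complicate our lives by adding a duplicate "A students" list to the general "students" list. What is the most efficient way to remove one of the duplicate "A students" lists? Note that two custom classes are involved.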

The general students list is called lstStudents in the code. This is the list I want to remove any duplicates from.

(I tried to come up with a better example, but this is the best I can do right now.)

I don't have to use LINQ, but it is available. MoreLinq is available as well.

Here are my classes:

public class Student
{
    public Student(string _name, int _age, Exam _lastExam)
    {
        name = _name;
        age = _age;
        lastExam = _lastExam;
    }

    public string name { get; set; }
    public int age { get; set; }
    public Exam lastExam { get; set; }
}

public class Exam
{
    public Exam(int _correct, int _possible)
    {
        correct = _correct;
        possible = _possible;
    }

    public int correct { get; set; }
    public int possible { get; set; }
}

Here is the code that creates the mess:

List<List<Student>> lstStudents = new List<List<Student>>();
List<Student> lstAStudents = new List<Student>();
List<Student> lstDuplicateAStudents = new List<Student>();
List<Student> lstBStudents = new List<Student>();

// Create a list of some A students
lstAStudents.Add(new Student("Alex", 14, new Exam(98,100)));
lstAStudents.Add(new Student("Kim", 13, new Exam(96, 100)));
lstAStudents.Add(new Student("Brian", 14, new Exam(92, 100)));
lstStudents.Add(lstAStudents);

// Create a duplicate list of A students
lstDuplicateAStudents.Add(new Student("Alex", 14, new Exam(98, 100)));
lstDuplicateAStudents.Add(new Student("Kim", 13, new Exam(96, 100)));
lstDuplicateAStudents.Add(new Student("Brian", 14, new Exam(92, 100)));
lstStudents.Add(lstDuplicateAStudents);

// Create a list of some B students
lstBStudents.Add(new Student("John", 13, new Exam(88, 100)));
lstBStudents.Add(new Student("Jenny", 13, new Exam(80, 100)));
lstBStudents.Add(new Student("Jamie", 15, new Exam(81, 100)));
lstStudents.Add(lstBStudents);

score 4 · Accepted Answer

You could keep a set that accumulates the unique lists:

var set = new HashSet<List<Student>>(new CustomComparer());
foreach (List<Student> list in lstStudents)
{
  // Add returns false and leaves the set unchanged
  // when an equal list is already present
  set.Add(list);
}


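public class CustomComparer : IEqualityComparer<List<Student>>
{
   public bool Equals(List<Student> one, List<Student> two)
   {
     if (one.Count != two.Count) return false;

     // simplest possible code to compare two lists
     // warning: runs in O(N*logN) for each compare
     // note: sorts by name only, and SequenceEqual relies on Student
     // overriding Equals (the default is reference equality)
     return one.OrderBy(s => s.name).SequenceEqual(two.OrderBy(s => s.name));
   }

   public int GetHashCode(List<Student> item)
   {
     // XOR is order-independent, which matches the order-insensitive Equals;
     // Student must override GetHashCode consistently with Equals
     int ret = -1;
     foreach (var s in item)
       ret ^= s.GetHashCode();
     return ret;
   }
}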

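The main problem with this solution is the code used to compare two List<> instances. Are lists that contain the same elements in a different order considered equal? If so, we either need to pre-sort each list once up front (to save time during comparison) or sort a copy of each list on every comparison, which costs extra time. So I guess the main question is how big your lists are. Below roughly 1000 students / 100 lists, the performance cost should not be noticeable.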
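The difference between the two notions of equality can be seen in a minimal sketch. It assumes (name, age) value tuples as a stand-in for Student, since tuples already have value equality and a default ordering; the variable names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in data: (name, age) tuples instead of the Student class.
var listA = new List<(string Name, int Age)> { ("Alex", 14), ("Kim", 13), ("Brian", 14) };
var listB = new List<(string Name, int Age)> { ("Kim", 13), ("Brian", 14), ("Alex", 14) };

// Order-sensitive: the same elements in a different order do not compare equal.
bool sameSequence = listA.SequenceEqual(listB);

// Order-insensitive: sort a copy of each list, then compare element by element.
// This is the O(N*logN) per-comparison cost discussed above.
bool sameElements = listA.OrderBy(s => s).SequenceEqual(listB.OrderBy(s => s));

Console.WriteLine(sameSequence);  // False
Console.WriteLine(sameElements);  // True
```

If the lists were sorted once when they were built, the plain SequenceEqual call would be enough for every later comparison.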

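Another problem is the GetHashCode implementation: it is O(N), and we have nowhere to cache the computed value, because List is a framework type. To address this, I would suggest introducing a StudentList class that carries the comparison logic itself (right now we have to specify the comparer externally) and caches its hash code.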
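A rough sketch of what such a StudentList wrapper might look like follows. It is an assumption about the shape of the class, not the answer's own code: it stores plain strings instead of Student to stay short, and it is immutable, because a cached hash code is only safe if the contents cannot change afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var a = new StudentList(new[] { "Alex", "Kim", "Brian" });
var b = new StudentList(new[] { "Brian", "Alex", "Kim" });
var c = new StudentList(new[] { "John", "Jenny", "Jamie" });

// No external comparer: HashSet uses StudentList's own Equals/GetHashCode.
var unique = new HashSet<StudentList> { a, b, c };
Console.WriteLine(unique.Count); // 2

// Hypothetical wrapper class; element type is string for brevity.
public sealed class StudentList : IEquatable<StudentList>
{
    private readonly List<string> _sorted; // sorted once, never mutated afterwards
    private readonly int _hash;            // computed once in the constructor

    public StudentList(IEnumerable<string> names)
    {
        _sorted = names.OrderBy(n => n, StringComparer.Ordinal).ToList();
        var hash = new HashCode();
        foreach (var n in _sorted) hash.Add(n);
        _hash = hash.ToHashCode();
    }

    // Both sides are already sorted, so this is a single O(N) pass.
    public bool Equals(StudentList other) =>
        other is not null && _sorted.SequenceEqual(other._sorted);

    public override bool Equals(object obj) => Equals(obj as StudentList);

    // O(1): returns the cached value.
    public override int GetHashCode() => _hash;
}
```

Pre-sorting in the constructor also removes the per-comparison sort, so this covers both concerns raised above at the price of copying each list once.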

In addition, there is a better implementation of a generic collection equivalence comparer available.
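One common way to compare collections as multisets without sorting is to count occurrences in a dictionary. The sketch below illustrates that general idea only; it is not necessarily the implementation the answer refers to, and the SameElements name is made up for this example. It assumes the element type has value-based Equals and GetHashCode.

```csharp
using System;
using System.Collections.Generic;

// Returns true when both collections contain the same elements with the
// same multiplicities, regardless of order. Expected O(N) with a good hash.
static bool SameElements<T>(IEnumerable<T> first, IEnumerable<T> second) where T : notnull
{
    var counts = new Dictionary<T, int>();
    foreach (var item in first)
        counts[item] = counts.TryGetValue(item, out var n) ? n + 1 : 1;

    foreach (var item in second)
    {
        // Missing key: item is absent from first, or all its copies are used up.
        if (!counts.TryGetValue(item, out var n)) return false;
        if (n == 1) counts.Remove(item); else counts[item] = n - 1;
    }

    // Any leftover entries mean first had elements that second lacks.
    return counts.Count == 0;
}

Console.WriteLine(SameElements(new[] { "Alex", "Kim", "Kim" }, new[] { "Kim", "Alex", "Kim" }));  // True
Console.WriteLine(SameElements(new[] { "Alex", "Kim", "Kim" }, new[] { "Kim", "Alex", "Alex" })); // False
```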

score 1 · Accepted Answer

You can implement IEquatable<T> on both Student and Exam:

public class Student: IEquatable<Student>
{
    ...

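    public bool Equals(Student other)
    {
        // other is null when Equals(object) receives null or a non-Student
        if (other == null) return false;
        return name == other.name && age == other.age 
                    && lastExam.Equals(other.lastExam);
    }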

    public override bool Equals(object obj)
    {
        Student student = obj as Student;
        return Equals(student);
    }

    public override int GetHashCode()
    {
        return name.GetHashCode() ^ 
             age.GetHashCode() ^ lastExam.GetHashCode();
    }
}

For Exam:

public class Exam: IEquatable<Exam>
{
    ...

    public bool Equals(Exam exam)
    {
        // exam is null when Equals(object) receives null or a non-Exam
        if (exam == null) return false;
        return exam.correct == correct && exam.possible == possible;
    }

    public override bool Equals(object obj)
    {
        Exam exam = obj as Exam;
        return Equals(exam);
    }

    public override int GetHashCode()
    {
        return correct.GetHashCode() ^ possible.GetHashCode();
    }
}

Then build a custom IEqualityComparer<T> for List<Student>:

public class StudentListComparer : IEqualityComparer<List<Student>>
{
    public bool Equals(List<Student> x, List<Student> y)
    {
        return x.OrderBy(a => a.name)
                .SequenceEqual(y.OrderBy(b => b.name));
    }

    public int GetHashCode(List<Student> obj)
    {
        return obj.Aggregate(0, (current, t) => current ^ t.GetHashCode());
    }
}

Then you can use Distinct to get the result:

var result = lstStudents.Distinct(new StudentListComparer());
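For a quick end-to-end check, here is a compact sketch of the same approach. As a swapped-in shortcut it uses C# 9 records (with capitalized property names) in place of the hand-written IEquatable<T> classes, because the compiler generates value-based Equals and GetHashCode for records; the comparer logic is unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var lstStudents = new List<List<Student>>
{
    new List<Student> { new Student("Alex", 14, new Exam(98, 100)), new Student("Kim", 13, new Exam(96, 100)) },
    // Same students as the first list, in a different order.
    new List<Student> { new Student("Kim", 13, new Exam(96, 100)), new Student("Alex", 14, new Exam(98, 100)) },
    new List<Student> { new Student("John", 13, new Exam(88, 100)), new Student("Jenny", 13, new Exam(80, 100)) },
};

// Distinct is lazy; ToList() materializes the result.
List<List<Student>> result = lstStudents.Distinct(new StudentListComparer()).ToList();
Console.WriteLine(result.Count); // 2

// Records stand in for the IEquatable<T> classes (an assumption of this sketch).
public record Exam(int Correct, int Possible);
public record Student(string Name, int Age, Exam LastExam);

public class StudentListComparer : IEqualityComparer<List<Student>>
{
    // Sorts copies by name, then compares element by element.
    public bool Equals(List<Student> x, List<Student> y) =>
        x.OrderBy(a => a.Name).SequenceEqual(y.OrderBy(b => b.Name));

    // XOR is order-independent, so reordered lists hash to the same value.
    public int GetHashCode(List<Student> obj) =>
        obj.Aggregate(0, (current, t) => current ^ t.GetHashCode());
}
```

Sorting by name alone can misjudge lists in which several students share a name; ordering by an additional key such as age would make the comparison more robust.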
