java - Find the most common object in a list

Question

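Let's say I have a List of Employee objects. The Employee object has a getDepartment method that returns a Department object. I want to iterate through that list to find the department with the most Employees (i.e. the Department object returned most often from getDepartment). What is the fastest way to do this?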

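import java.util.ArrayList;
import java.util.List;

public class Employee{

   // Registry of every Employee created so far.
   static List<Employee> allEmployees = new ArrayList<Employee>();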

   int id;
   Department department;

   public Employee(int id, Department department){
     this.id = id;
     this.department = department;
     allEmployees.add(this);
   }

   public Department getDepartment(){
     return department;
   }

   public static List<Employee> getAllEmployees(){
      return allEmployees;
   }
}

public class Department{
   int id;
   String name;

   public Department(int id){
     this.id = id;
   }

   public String getName(){
     return name;
   }
}

If two departments have the same number of employees, it doesn't matter which one is returned.

Thanks!

score 3 · Accepted Answer

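Create a map of department id -> count.

That way you get a collection of all the counts, keyed by id. You can also maintain a max item, which is a reference to the map entry with the highest count.

The algorithm would be something like:

1) Initialize a Map and a currentMax
2) Loop through the employees
3) For each employee, get its department id
4) Do something like map.get(currentId)
a) If the current count is null, initialize it
5) Increment the count
6) If the incremented count is > currentMax, update currentMax

This algorithm runs in O(n) time; I don't think you can do better than that. Its space complexity is also O(n) in the worst case, since the number of counts is proportional to the size of the input.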
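The steps above could be sketched roughly as follows. This is only a minimal illustration: the question's Department class exposes no getter for its id, so the sketch assumes the department ids have already been pulled out into a List<Integer>. The class name MostCommonDepartment and the method mostCommon are made up for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MostCommonDepartment {

    // Returns the department id that occurs most often, or null for an empty list.
    // On a tie, the id that reached the top count first is kept.
    static Integer mostCommon(List<Integer> departmentIds) {
        Map<Integer, Integer> counts = new HashMap<>();   // step 1: the map
        Integer currentMaxId = null;                      // step 1: currentMax
        int currentMaxCount = 0;

        for (Integer id : departmentIds) {                // steps 2-3
            // steps 4-5: merge() starts the count at 1 or adds 1 to the existing count
            int count = counts.merge(id, 1, Integer::sum);
            if (count > currentMaxCount) {                // step 6
                currentMaxCount = count;
                currentMaxId = id;
            }
        }
        return currentMaxId;
    }

    public static void main(String[] args) {
        // Hypothetical department ids, one per employee.
        List<Integer> ids = Arrays.asList(10, 20, 10, 30, 20, 10);
        System.out.println(mostCommon(ids)); // prints 10
    }
}
```

Map.merge needs Java 8 or later; on older versions the get / null check / put sequence from step 4 does the same job.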

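If you want, you can create a class that uses composition (i.e. it contains a Map and a List) and manages a pointer to the entry with the highest count. That way this part of your functionality is properly encapsulated. The bigger benefit of this approach is that it lets you keep the counts up to date as items are added to the list (you would proxy the methods that add employees to the list so that they also update the map counters). It might be overkill, though.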
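One possible shape for such a wrapper is sketched below. Everything here is hypothetical: the class name CountingList, the key-extractor function passed to the constructor, and the use of int[] pairs {employeeId, departmentId} as stand-ins for Employee objects are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical wrapper that owns both the item list and the per-key counts.
class CountingList<E, K> {
    private final List<E> items = new ArrayList<>();
    private final Map<K, Integer> counts = new HashMap<>();
    private final Function<E, K> keyOf;   // e.g. employee -> department id
    private K mostCommon;                 // key with the highest count so far, null when empty
    private int highestCount;

    CountingList(Function<E, K> keyOf) {
        this.keyOf = keyOf;
    }

    // Proxies List.add so the counter and the running maximum stay in sync.
    void add(E item) {
        items.add(item);
        K key = keyOf.apply(item);
        int count = counts.merge(key, 1, Integer::sum);
        if (count > highestCount) {
            highestCount = count;
            mostCommon = key;
        }
    }

    K getMostCommon() { return mostCommon; }

    int getHighestCount() { return highestCount; }

    List<E> getItems() { return Collections.unmodifiableList(items); }

    public static void main(String[] args) {
        // Each int[] is a stand-in employee: {employeeId, departmentId}.
        CountingList<int[], Integer> staff = new CountingList<int[], Integer>(e -> e[1]);
        staff.add(new int[] {1, 10});
        staff.add(new int[] {2, 20});
        staff.add(new int[] {3, 20});
        System.out.println(staff.getMostCommon());   // prints 20
        System.out.println(staff.getHighestCount()); // prints 2
    }
}
```

The sketch only supports adding. Supporting removal would mean the running maximum can go stale, so it would need either a rescan of the counts or a sorted structure.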

score 2 · Accepted Answer

Here's a plain Java 8 solution:

Employee.getAllEmployees()
        .stream()
        .collect(Collectors.groupingBy(Employee::getDepartment, Collectors.counting()))
        .entrySet()
        .stream()
        .max(Comparator.comparing(Entry::getValue))
        .ifPresent(System.out::println);

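It makes at most two passes over the list of employees. If you're willing to add a third-party dependency, an equivalent solution using jOOλ is: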

Seq.seq(Employee.getAllEmployees())
   .grouped(Employee::getDepartment, Agg.count())
   .maxBy(Tuple2::v2)
   .ifPresent(System.out::println);

(Disclaimer: I work for the company behind jOOλ)
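For anyone who wants to try the plain Java 8 variant in isolation, here is a self-contained sketch of it. It is not the answer's exact code: it uses Map.Entry.comparingByValue() in place of Comparator.comparing(Entry::getValue), and the class name StreamMostCommon, the method mostCommon, and the department-name strings are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamMostCommon {

    // Returns the most frequent key, or Optional.empty() for an empty list.
    static <T, K> Optional<K> mostCommon(List<T> items, Function<T, K> keyOf) {
        return items.stream()
                .collect(Collectors.groupingBy(keyOf, Collectors.counting())) // key -> count
                .entrySet()
                .stream()
                .max(Map.Entry.comparingByValue())  // entry with the largest count
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // Hypothetical data: each string stands in for one employee's department.
        List<String> departments = Arrays.asList("sales", "it", "sales", "hr", "it", "sales");
        System.out.println(mostCommon(departments, Function.identity())); // prints Optional[sales]
    }
}
```

With the question's classes the call would presumably look like mostCommon(Employee.getAllEmployees(), Employee::getDepartment). Grouping by Department relies on its equals and hashCode, so distinct Department instances with the same id are counted separately unless those methods are overridden.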

score 1 · Accepted Answer

I would use Guava and do something like this:

Multiset<Department> departments = HashMultiset.create();
for (Employee employee : employees) {
  departments.add(employee.getDepartment());
}

Multiset.Entry<Department> max = null;
for (Multiset.Entry<Department> department : departments.entrySet()) {
  if (max == null || department.getCount() > max.getCount()) {
    max = department;
  }
}

You need a correct implementation of equals and hashCode on Department for this to work.
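As a rough sketch of what that could look like, here is a version of the question's Department that treats two departments as equal when their ids match. Using the id as the identity is an assumption; the question does not say what makes two departments the same.

```java
// Sketch of the question's Department with value-based equality on id (an assumption).
class Department {
    private final int id;
    private String name;

    Department(int id) {
        this.id = id;
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Department)) {
            return false;
        }
        return id == ((Department) other).id;
    }

    // Must agree with equals: equal departments produce the same hash.
    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}
```

Without these overrides, hash-based collections fall back to object identity, so two Department objects created with the same id would be counted as different departments.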

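There is also an issue here that mentions the possibility of a future "leaderboard" type of Multiset, which would maintain an order based on the count of each entry it contains.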

score 0 · Accepted Answer

Since you only want to count the employees, making a map is relatively easy.

HashMap<Department, Integer> departmentCounter;

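This maps each department to its number of employees (you increment the count for each employee). Alternatively, you can store the whole Employee in the map by using a list: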

HashMap<Department, List<Employee>> departmentCounter;

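and look at the size of each list.

Then, if you don't know how to use the class, you can look at the HashMap documentation: http://download.oracle.com/javase/1.4.2/docs/api/java/util/HashMap.html

Hint: you will need to use HashMap.keySet() to see which departments have been entered.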
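A small sketch of the list-per-department variant, including the keySet() walk from the hint, might look like this. To keep it self-contained it uses plain strings as stand-ins for Employee and Department; the class name DepartmentGrouping and both method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DepartmentGrouping {

    // Groups employee names by department name; each row is {employee, department}.
    static Map<String, List<String>> group(String[][] rows) {
        Map<String, List<String>> byDepartment = new HashMap<>();
        for (String[] row : rows) {
            // Create the list on first sight of a department, then append the employee.
            byDepartment.computeIfAbsent(row[1], k -> new ArrayList<>()).add(row[0]);
        }
        return byDepartment;
    }

    // Walks keySet() and returns the department with the longest list, or null if empty.
    static String largest(Map<String, List<String>> byDepartment) {
        String best = null;
        for (String department : byDepartment.keySet()) {
            if (best == null
                    || byDepartment.get(department).size() > byDepartment.get(best).size()) {
                best = department;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        String[][] rows = {{"ann", "sales"}, {"bob", "it"}, {"cid", "sales"}};
        System.out.println(largest(group(rows))); // prints sales
    }
}
```

Keeping the full lists costs more memory than plain counters, so it is mainly worth it when the employees themselves are needed afterwards.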

score 0 · Accepted Answer

I would do it like this, modulo == null and isEmpty checks:

public static <C> Multimap<Integer, C> getFrequencyMultimap(final Collection<C> collection,
    final Ordering<Integer> ordering) {
    @SuppressWarnings("unchecked")
    Multimap<Integer, C> result = TreeMultimap.create(ordering, (Comparator<C>) Ordering.natural());
    for (C element : collection) {
        result.put(Collections.frequency(collection, element), element);
    }
    return result;
}

public static <C> Collection<C> getMostFrequentElements(final Collection<C> collection) {
    Ordering<Integer> reverseIntegerOrdering = Ordering.natural().reverse();
    Multimap<Integer, C> frequencyMap = getFrequencyMultimap(collection, reverseIntegerOrdering);
    return frequencyMap.get(Iterables.getFirst(frequencyMap.keySet(), null));
}

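There is also CollectionUtils.getCardinalityMap() from Apache Commons Collections, which does the job of the first method, but this approach is more flexible.

Keep in mind that class C should be well implemented, i.e. it should have equals() and hashCode() and implement Comparable.

This is how you can use it: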

Collection<Dummy> result = LambdaUtils.getMostFrequentElements(list);

As a bonus, you can also get the least frequent elements with a similar method: just pass Ordering.natural() to the first method and don't reverse it.
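For comparison, the same idea can be sketched without Guava. This is a different technique from the code above, not a port of it: it counts each element once with a HashMap instead of calling Collections.frequency per element and sorting in a TreeMultimap, so C only needs equals() and hashCode(). The names FrequencyUtils, countOccurrences and extremes are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FrequencyUtils {

    // Counts each element in a single pass: element -> number of occurrences.
    static <C> Map<C, Integer> countOccurrences(Collection<C> collection) {
        Map<C, Integer> counts = new HashMap<>();
        for (C element : collection) {
            counts.merge(element, 1, Integer::sum);
        }
        return counts;
    }

    // Returns every element tied for the highest count (mostFrequent = true)
    // or the lowest count (mostFrequent = false); empty input gives an empty list.
    static <C> List<C> extremes(Collection<C> collection, boolean mostFrequent) {
        Map<C, Integer> counts = countOccurrences(collection);
        List<C> result = new ArrayList<>();
        if (counts.isEmpty()) {
            return result;
        }
        int target = mostFrequent
                ? Collections.max(counts.values())
                : Collections.min(counts.values());
        for (Map.Entry<C, Integer> entry : counts.entrySet()) {
            if (entry.getValue() == target) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; the order of tied results is not guaranteed.
        List<String> letters = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println(extremes(letters, true));   // "a" and "b", in some order
        System.out.println(extremes(letters, false));  // prints [c]
    }
}
```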
