java - Java: Detect duplicates in ArrayList?

Question

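How could I go about detecting (returning true/false) whether an ArrayList contains more than one of the same element in Java?

Many thanks, Terry

Edit: Forgot to mention that I am not looking to compare "Blocks" with each other, but their integer values. Each "Block" has an int, and this is what makes them different. I find the int of a particular Block by calling a method named "getNum" (e.g. table1[0][2].getNum();).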

score 214 · Accepted Answer

Simplest: dump the whole collection into a Set (using the Set(Collection) constructor or Set.addAll), then check whether the Set has the same size as the ArrayList.

List<Integer> list = ...;
Set<Integer> set = new HashSet<Integer>(list);

if(set.size() < list.size()){
    /* There are duplicates */
}

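Update: If I'm understanding your question correctly, you have a 2D array of Block, as in

Block table[][];

and you want to detect whether any row of it has duplicates?

In that case, assuming that Block implements "equals" and "hashCode" correctly, I could do the following: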

for (Block[] row : table) {
   Set<Block> set = new HashSet<Block>();
   for (Block cell : row) {
      set.add(cell);
   }
   if (set.size() < 6) { //has duplicate
   }
}

I'm not 100% sure of the syntax for that, so it might be safer to write it as

for (int i = 0; i < 6; i++) {
   Set<Block> set = new HashSet<Block>();
   for (int j = 0; j < 6; j++)
    set.add(table[i][j]);
 ...

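Set.add returns a boolean false if the item being added is already in the set, so if all you want to know is whether there are any duplicates, you could even short-circuit and bail out on the first add that returns false.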
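The short-circuit idea can be combined with the question's requirement to compare the int values rather than the Block objects themselves. The following is only a sketch: the Block class here is a hypothetical stand-in (only getNum() comes from the question), and the names RowDuplicates and anyRowHasDuplicate are made up for illustration. Because it stores the Integer values, it does not rely on Block overriding equals or hashCode.

```java
import java.util.HashSet;
import java.util.Set;

public class RowDuplicates {
    // Hypothetical stand-in for the asker's Block class; only getNum() is taken from the question.
    public static class Block {
        private final int num;
        public Block(int num) { this.num = num; }
        public int getNum() { return num; }
    }

    // Returns true as soon as any single row contains two blocks with the same number.
    public static boolean anyRowHasDuplicate(Block[][] table) {
        for (Block[] row : table) {
            Set<Integer> seen = new HashSet<>();
            for (Block cell : row) {
                // Set#add returns false when the value is already in the set
                if (!seen.add(cell.getNum())) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Block[][] table = {
            { new Block(1), new Block(2), new Block(3) },
            { new Block(4), new Block(4), new Block(5) }
        };
        System.out.println(anyRowHasDuplicate(table)); // true (second row repeats 4)
    }
}
```

Using row.length implicitly through the for-each loop also avoids hard-coding the row size.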

score 64 · Accepted Answer

Improved code, using the return value of Set#add instead of comparing the sizes of the list and the set.

public static <T> boolean hasDuplicate(Iterable<T> all) {
    Set<T> set = new HashSet<T>();
    // Set#add returns false if the set does not change, which
    // indicates that a duplicate element has been added.
    for (T each: all) if (!set.add(each)) return true;
    return false;
}

score 15 · Accepted Answer

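If you want to avoid having duplicates at all, then you should cut out the intermediate step of detecting duplicates and use a Set in the first place.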
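As a rough illustration of that suggestion, the sketch below collects values directly into a Set so that duplicates never get stored. The class and method names (UniqueNumbers, collectUnique) are invented for this example, and LinkedHashSet is chosen here only because it keeps insertion order; a plain HashSet would reject duplicates just the same.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class UniqueNumbers {
    // Collects the inputs into a Set; repeated values are silently dropped.
    public static Set<Integer> collectUnique(int[] inputs) {
        Set<Integer> numbers = new LinkedHashSet<>();
        for (int n : inputs) {
            // add returns false (and changes nothing) if n is already present
            numbers.add(n);
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(collectUnique(new int[] {3, 1, 3, 2, 1})); // [3, 1, 2]
    }
}
```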

Answered 2009-02-18T21:30:14.260

score 15 · Accepted Answer

With Java 8+ you can use the Stream API:

boolean areAllDistinct(List<Block> blocksList) {
    return blocksList.stream().map(Block::getNum).distinct().count() == blocksList.size();
}

score 13 · Accepted Answer

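Improved code that returns the duplicate elements

Can find duplicates in a Collection
Returns the list of duplicates
The unique elements can be obtained from the Set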

public static <T> List<T> getDuplicate(Collection<T> list) {

    final List<T> duplicatedObjects = new ArrayList<T>();
    // Anonymous HashSet subclass that records every element added a second time
    Set<T> set = new HashSet<T>() {
        @Override
        public boolean add(T e) {
            if (contains(e)) {
                duplicatedObjects.add(e);
            }
            return super.add(e);
        }
    };
    for (T t : list) {
        set.add(t);
    }
    return duplicatedObjects;
}


public static <T> boolean hasDuplicate(Collection<T> list) {
    if (getDuplicate(list).isEmpty())
        return false;
    return true;
}

score 9 · Accepted Answer

I needed to do a similar operation for a Stream, but couldn't find a good example. Here's what I came up with.

public static <T> boolean areUnique(final Stream<T> stream) {
    final Set<T> seen = new HashSet<>();
    return stream.allMatch(seen::add);
}

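This has the advantage of short-circuiting when a duplicate is found early, rather than having to process the whole stream, and it isn't much more complicated than putting everything in a Set and checking the size. So this case would be roughly: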

List<T> list = ...
boolean allDistinct = areUnique(list.stream());

score 8 · Accepted Answer

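If your elements are somehow Comparable (whether the order has any real meaning is irrelevant; it just needs to be consistent with your definition of equality), the fastest duplicate removal solution is going to sort the list (O(n log(n))) and then do a single pass looking for repeated elements, that is, equal elements that follow each other (this is O(n)).

The overall complexity is going to be O(n log(n)), which is roughly the same as what you would get with a Set (n times log(n)), but with a much smaller constant. This is because the constant in sort/dedup comes from the cost of comparing elements, whereas the cost with the Set most likely comes from a hash computation, plus one (possibly several) hash comparisons. That is if you are using a hash-based Set implementation, because a tree-based one is going to give you O(n log²(n)), which is even worse.

As I understand it, however, you do not need to remove duplicates, but merely to test for their existence. So you should hand-code a merge sort or heap sort on your array that simply exits returning true (i.e. "there is a dup") if your comparator returns 0, and otherwise completes the sort, then traverse the sorted array testing for repeats. In a merge or heap sort, indeed, by the time the sort is completed you will have compared every duplicate pair, unless both elements were already in their final positions (which is unlikely). Thus a tweaked sort algorithm should yield a huge performance improvement (I would have to prove that, but I guess the tweaked algorithm should be in O(log(n)) on uniformly random data).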

score 2 · Accepted Answer

If you want the set of duplicate values:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FindDuplicateInArrayList {

    public static void main(String[] args) {

        Set<String> uniqueSet = new HashSet<String>();
        List<String> dupesList = new ArrayList<String>();
        for (String a : args) {
            if (uniqueSet.contains(a))
                dupesList.add(a);
            else
                uniqueSet.add(a);
        }
        System.out.println(uniqueSet.size() + " distinct words: " + uniqueSet);
        System.out.println(dupesList.size() + " dupesList words: " + dupesList);
    }
}

You should probably also think about trimming the values or lower-casing them, depending on your case.
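The normalization mentioned above might look like the following sketch. The names NormalizedDuplicates and hasDuplicateIgnoringCaseAndSpace are invented for this example, and treating values as equal after trim() plus lower-casing is an assumption about what "the same" should mean; adjust the key to fit the actual data.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class NormalizedDuplicates {
    // Treats two values as duplicates if they match after trimming and lower-casing.
    public static boolean hasDuplicateIgnoringCaseAndSpace(List<String> values) {
        Set<String> seen = new HashSet<>();
        for (String value : values) {
            // Locale.ROOT keeps the lower-casing independent of the default locale
            String key = value.trim().toLowerCase(Locale.ROOT);
            if (!seen.add(key)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Apple", " apple ", "Pear");
        System.out.println(hasDuplicateIgnoringCaseAndSpace(words)); // true
    }
}
```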

score 1 · Accepted Answer

Simply: 1) make sure all items are comparable 2) sort the array 3) iterate over the array and find duplicates
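The three steps above can be sketched as follows. This is a minimal illustration rather than a tuned implementation: the class name SortedDuplicateCheck is invented, the list is copied first so the caller's order is left untouched, and null elements are assumed not to occur (they would cause a NullPointerException during sorting).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedDuplicateCheck {
    // Step 1: the bound on T guarantees that all items are comparable.
    public static <T extends Comparable<? super T>> boolean hasDuplicate(List<T> items) {
        // Step 2: sort a copy so that equal elements end up next to each other.
        List<T> sorted = new ArrayList<>(items);
        Collections.sort(sorted);
        // Step 3: one pass comparing each element with its predecessor.
        for (int i = 1; i < sorted.size(); i++) {
            if (sorted.get(i - 1).compareTo(sorted.get(i)) == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDuplicate(Arrays.asList(3, 1, 2, 3))); // true
        System.out.println(hasDuplicate(Arrays.asList(3, 1, 2)));    // false
    }
}
```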

score 1 · Accepted Answer

To find the duplicates in a list, use the following code. It will give you a set containing the duplicates.

public Set<?> findDuplicatesInList(List<?> beanList) {
    System.out.println("findDuplicatesInList::" + beanList);
    Set<Object> duplicateRowSet = new LinkedHashSet<Object>();
    for (int i = 0; i < beanList.size(); i++) {
        Object superString = beanList.get(i);
        System.out.println("findDuplicatesInList::superString::" + superString);
        for (int j = 0; j < beanList.size(); j++) {
            if (i != j) {
                Object subString = beanList.get(j);
                System.out.println("findDuplicatesInList::subString::" + subString);
                if (superString.equals(subString)) {
                    duplicateRowSet.add(beanList.get(j));
                }
            }
        }
    }
    System.out.println("findDuplicatesInList::duplicationSet::" + duplicateRowSet);
    return duplicateRowSet;
}

score 1 · Accepted Answer

The best way to handle this issue is to use a HashSet:

ArrayList<String> listGroupCode = new ArrayList<>();
listGroupCode.add("A");
listGroupCode.add("A");
listGroupCode.add("B");
listGroupCode.add("C");
HashSet<String> set = new HashSet<>(listGroupCode);
ArrayList<String> result = new ArrayList<>(set);

Just print the result ArrayList and you will see the values without duplicates :)

score 1 · Accepted Answer

This answer is written in Kotlin, but it can easily be translated to Java.

If the size of your ArrayList stays within a fixed, small range, then this is a good solution.

var duplicateDetected = false
    if(arrList.size > 1){
        for(i in 0 until arrList.size){
            for(j in 0 until arrList.size){
                if(i != j && arrList.get(i) == arrList.get(j)){
                    duplicateDetected = true
                }
            }
        }
    }

score 1 · Accepted Answer

private boolean isDuplicate() {
    for (int i = 0; i < arrayList.size(); i++) {
        for (int j = i + 1; j < arrayList.size(); j++) {
            if (arrayList.get(i).getName().trim().equalsIgnoreCase(arrayList.get(j).getName().trim())) {
                return true;
            }
        }
    }

    return false;
}

score 0 · Accepted Answer

    // Assumes l is a List<String>. The last `processed` entries are already deduplicated.
    int processed = 0;
    while (processed < l.size()) {
        String tempVal = l.get(0); //take the first unprocessed object out of list
        while (l.contains(tempVal)) {
            l.remove(tempVal); //remove all matching entries
        }
        l.add(tempVal); //at last add one entry
        processed++;
    }

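Note: this will have a major performance hit, as items are removed from the start of the list. To address this, we have two options: 1) iterate in reverse order and remove elements, or 2) use a LinkedList instead of an ArrayList. The example above exists because of the biased interview question that asks you to remove duplicates from a List without using any other collection. In the real world, if I had to achieve this, I would simply put the elements from the List into a Set!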

score 0 · Accepted Answer

/**
     * Method to detect presence of duplicates in a generic list. 
     * Depends on the equals method of the concrete type. make sure to override it as required.
     */
    public static <T> boolean hasDuplicates(List<T> list){
        int count = list.size();
        T t1,t2;

        for(int i=0;i<count;i++){
            t1 = list.get(i);
            for(int j=i+1;j<count;j++){
                t2 = list.get(j);
                if(t2.equals(t1)){
                    return true;
                }
            }
        }
        return false;
    }

Example of a concrete class that overrides equals():

public class Reminder{
    private long id;
    private int hour;
    private int minute;

    public Reminder(long id, int hour, int minute){
        this.id = id;
        this.hour = hour;
        this.minute = minute;
    }

    @Override
    public boolean equals(Object other){
        if(other == null) return false;
        if(this.getClass() != other.getClass()) return false;
        Reminder otherReminder = (Reminder) other;
        if(this.hour != otherReminder.hour) return false;
        if(this.minute != otherReminder.minute) return false;

        return true;
    }
}
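The nested-loop check above only calls equals, so overriding equals alone is enough for it. If such a class were used with the HashSet-based approaches from the other answers, hashCode would also have to be overridden consistently with equals. The sketch below shows one way this might look; the wrapper class ReminderSetDemo is invented for this example, and building hashCode from hour and minute (not id) is chosen to mirror the equals method above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class ReminderSetDemo {
    public static class Reminder {
        private final long id;
        private final int hour;
        private final int minute;

        public Reminder(long id, int hour, int minute) {
            this.id = id;
            this.hour = hour;
            this.minute = minute;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (other == null || getClass() != other.getClass()) return false;
            Reminder that = (Reminder) other;
            return hour == that.hour && minute == that.minute;
        }

        // Uses the same fields as equals, so equal reminders land in the same hash bucket.
        @Override
        public int hashCode() {
            return Objects.hash(hour, minute);
        }
    }

    // Set-based duplicate check; relies on both equals and hashCode.
    public static boolean hasDuplicates(Iterable<Reminder> reminders) {
        Set<Reminder> seen = new HashSet<>();
        for (Reminder r : reminders) {
            if (!seen.add(r)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Different ids, same time of day: treated as duplicates
        System.out.println(hasDuplicates(Arrays.asList(
                new Reminder(1L, 8, 30), new Reminder(2L, 8, 30)))); // true
    }
}
```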

score 0 · Accepted Answer

    ArrayList<String> withDuplicates = new ArrayList<>();
    withDuplicates.add("1");
    withDuplicates.add("2");
    withDuplicates.add("1");
    withDuplicates.add("3");

    Set<String> seen = new HashSet<>();
    ArrayList<String> withoutDuplicates = new ArrayList<>();
    ArrayList<String> duplicates = new ArrayList<>();

    for (String word : withDuplicates) {
        // Set#add returns false when the word has already been seen
        if (seen.add(word)) {
            withoutDuplicates.add(word);
        } else {
            duplicates.add(word);
        }
    }
    System.out.println(duplicates);        // [1]
    System.out.println(withoutDuplicates); // [1, 2, 3]
