java - Removing duplicates from an array

Question

I am trying to remove duplicates from an array, but it is not working.

Am I missing something?

Code:

class RemoveStringDuplicates {

    public static char[] removeDups(char[] str) {
        boolean bin_hash[] = new boolean[256];
        int ip_ind = 0, res_ind = 0;
        char temp;

        while (ip_ind < str.length) {
            temp = str[ip_ind];
            if (bin_hash[temp] == false) {
                bin_hash[temp] = true;
                str[res_ind] = str[ip_ind];
                res_ind++;
            }
            ip_ind++;
        }

        return str;
    }

    public static void main(String[] args) {
        char str[] = "test string".toCharArray();
        System.out.println(removeDups(str));
    }
}

Output:

 tes ringing //ing should not have been repeated!

score 2 · Accepted Answer

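You should write the characters into a new array rather than back into the same one. After the duplicates are removed, the trailing elements of the original array are not cleared, so they still get printed.

If you use a new array, the trailing elements will be null characters ('\0') instead.

So, just create a new array: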

char[] unique = new char[str.length];

Then change the assignment:

str[res_ind] = str[ip_ind];

to:

unique[res_ind] = str[ip_ind];
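One thing to keep in mind with the new-array fix is that the returned array still has the original length, so it ends in '\0' padding. A minimal sketch of one way to trim it, assuming java.util.Arrays.copyOf is acceptable here. The class name DedupTrimmed is only an illustrative placeholder, and like the question's code it assumes every character is below 256:

```java
import java.util.Arrays;

class DedupTrimmed {

    // Same idea as the question's removeDups, but writes into a separate
    // array and returns only the part that was actually filled.
    public static char[] removeDups(char[] str) {
        boolean[] seen = new boolean[256];   // assumption: all chars are < 256, as in the question
        char[] unique = new char[str.length];
        int resInd = 0;

        for (char c : str) {
            if (!seen[c]) {
                seen[c] = true;
                unique[resInd++] = c;
            }
        }
        // Drop the unused '\0' slots at the end.
        return Arrays.copyOf(unique, resInd);
    }

    public static void main(String[] args) {
        System.out.println(removeDups("test string".toCharArray())); // prints "tes ring"
    }
}
```

The result then has exactly as many elements as there are distinct characters, which matters if the array is compared or measured rather than just printed.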

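Also, you could consider using an ArrayList instead of an array. That way you would not have to maintain a boolean array with one entry per possible character, which is overkill: it takes up extra space you do not need. With an ArrayList, you can use the contains method to check whether a character has already been added.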
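A rough sketch of what the ArrayList variant might look like. The class name DedupWithList is made up for illustration, and the result is returned as a List<Character> rather than a char[] to keep the example short:

```java
import java.util.ArrayList;
import java.util.List;

class DedupWithList {

    public static List<Character> removeDups(char[] str) {
        List<Character> unique = new ArrayList<>();
        for (char c : str) {
            // contains() scans the list linearly, so the whole loop is O(n^2);
            // fine for short inputs, slower than the boolean table for long ones.
            if (!unique.contains(c)) {
                unique.add(c);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        System.out.println(removeDups("test string".toCharArray()));
        // prints [t, e, s,  , r, i, n, g]
    }
}
```

This trades the fixed 256-entry table for a list that grows only as needed, at the cost of a slower membership check.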

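Well, you can also avoid doing all of this bookkeeping by hand by using a Set, which removes duplicates automatically. Most implementations do not maintain insertion order, though. For that, you can use LinkedHashSet.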

score 1 · Accepted Answer

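The specific problem has already been solved, but if you are not restricted to writing your own method and can use the Java libraries, I would suggest something like this: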

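import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class RemoveDuplicates {

// Note: primitives must be wrapped (char -> Character) to be used with generics.
// Java does not support generic array creation, so this returns a List instead.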

public static <T> List<T> removeDuplicatesFromArray(T[] array) {
    Set<T> set = new LinkedHashSet<>(Arrays.asList(array));
    return new ArrayList<>(set);
}

public static void main(String[] args) {
    String s = "Helloo I am a string with duplicates";
    Character[] c = new Character[s.length()];

    for (int i = 0; i < s.length(); i++) {
        c[i] = s.charAt(i);
    }

    List<Character> noDuplicates = removeDuplicatesFromArray(c);
    Character[] noDuplicatesArray = new Character[noDuplicates.size()];
    noDuplicates.toArray(noDuplicatesArray);

    System.out.println("List:");
    System.out.println(noDuplicates);
    System.out.println("\nArray:");
    System.out.println(Arrays.toString(noDuplicatesArray));
}
}

Output:

List:
[H, e, l, o,  , I, a, m, s, t, r, i, n, g, w, h, d, u, p, c]

Array:
[H, e, l, o,  , I, a, m, s, t, r, i, n, g, w, h, d, u, p, c]

LinkedHashSet preserves insertion order, which can be especially important for something like an array of characters.

score 0 · Accepted Answer

Try this:

public static char[] removeDups(char[] str) {
        boolean bin_hash[] = new boolean[256];
        int ip_ind = 0, res_ind = 0;
        char temp;
        char a[] = new char[str.length];

        while (ip_ind < str.length) {
            temp = str[ip_ind];
            if (bin_hash[temp] == false) {
                bin_hash[temp] = true;
                a[res_ind] = str[ip_ind];
                res_ind++;
            }
            ip_ind++;
        }

        return a;
    }

You are basically updating the str variable inside the loop: you modify it and then keep looping over the modified array.

score 0 · Accepted Answer

I believe the problem is caused by iterating over str while you are modifying it (via the line str[res_ind] = str[ip_ind]). If you copy the result into another array, it works:

class RemoveStringDuplicates {

    public static char[] removeDups(char[] str) {
        char result[] = new char[str.length];
        boolean bin_hash[] = new boolean[256];
        int ip_ind = 0, res_ind = 0;
        char temp;

        while (ip_ind < str.length) {
            temp = str[ip_ind];
            if (bin_hash[temp] == false) {
                bin_hash[temp] = true;
                result[res_ind] = str[ip_ind];
                res_ind++;
            }
            ip_ind++;
        }

        return result;
    }

    public static void main(String[] args) {
        char str[] = "test string".toCharArray();
        System.out.println(removeDups(str));
    }
}

score 0 · Accepted Answer

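All the other answers seem to be correct. The "ing" you see at the end of the result is actually the untouched characters that were already in the array.

As an alternative solution (if you want to save memory), you can loop over the last part of the array and clear the characters at the end, since you already know they are duplicates.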

// Originally posted as C# (str.Length); in Java the array length field is str.length
for (int delChars = res_ind; delChars < str.length; delChars++)
{
    str[delChars] = '\0';
}
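Overwriting the tail with '\0' still leaves those (invisible) characters in the printed output. Another in-place option, sketched here under the assumption that it is fine to return the number of unique characters to the caller, is to print only that prefix. The names DedupInPlace and compact are hypothetical:

```java
class DedupInPlace {

    // Compacts the unique characters to the front of str (modifying it)
    // and returns how many of them there are.
    public static int compact(char[] str) {
        boolean[] seen = new boolean[256];   // assumption: all chars are < 256, as in the question
        int resInd = 0;
        for (int i = 0; i < str.length; i++) {
            char c = str[i];
            if (!seen[c]) {
                seen[c] = true;
                str[resInd++] = c;
            }
        }
        return resInd;
    }

    public static void main(String[] args) {
        char[] str = "test string".toCharArray();
        int len = compact(str);
        // Only the first len characters are meaningful; the rest is leftover input.
        System.out.println(new String(str, 0, len)); // prints "tes ring"
    }
}
```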

score 0 · Accepted Answer

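Your code is a complete misuse of the Java language. The data structure classes in the standard library are the whole point of using Java. Use them.

The proper way to write code that does what you want is this: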

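import java.util.HashSet;

class RemoveStringDuplicates {

    public static String removeDups(CharSequence str) {

        StringBuilder b = new StringBuilder(str);
        HashSet<Character> s = new HashSet<Character>();

        for (int idx = 0; idx < b.length(); idx++) {
            char ch = b.charAt(idx);
            if (s.contains(ch))
                // Already seen: delete it and step back so the character
                // that shifts into this position is checked as well.
                b.deleteCharAt(idx--);
            else
                s.add(ch);
        }

        return b.toString();
    }

    public static void main(String[] args) {
        // Sample input taken from the question.
        String str = "test string";
        System.out.println(removeDups(str));
    }
}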

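There may well be better ways to do this, too. Don't shy away from Java's data structures.

If you are writing code that is so performance-sensitive that you have to use primitive code like that in the question, then you should be using a different language, such as C.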
