java - Searching text for matching words without using a HashSet

Question

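I'm trying to write a program that reads a text file, counts the total number of words, and determines which words in the text are repeated and how many times each one occurs (for simplicity, the text file contains no punctuation). I have the following code for finding the repeated words and the number of times they occur.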

ArrayList<String> words = new ArrayList<String>();
String myString;
String[] line;
// Read words from file, populate the list (br is assumed to be a BufferedReader on the text file)
while ((myString = br.readLine()) != null) {
    line = myString.split(" ");
    for (String word : line) {
        words.add(word.toLowerCase());  // Ignore case
    }
}

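The part above reads the text file and adds each word to an ArrayList named words. The part below uses a HashSet to determine which words in the ArrayList occur more than once. It then prints each of those words, followed by a count of how many times it occurs.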

// Count the occurrences of each word
Set<String> unique = new HashSet<String>(words);
for (String key : unique) {
    if (Collections.frequency(words, key) > 1) {
        System.out.println(key + ": " + Collections.frequency(words, key));
    }
}

Is there a way to do this without using a HashSet? For example, by using two arrays and comparing them against each other? I tried the following:

        numWords = words.size();
        // Copy the words into the arrays; without this every element is null
        // and the equals() call below throws a NullPointerException
        String[] wordArray = words.toArray(new String[numWords]);
        String[] newWordArray = wordArray.clone();
        String compWord;

        // Find duplicates: compare every word against every other word
        for (int i = 0; i < newWordArray.length; i++) {
            for (int j = 0; j < newWordArray.length; j++) {
                if (i != j && newWordArray[i].equals(newWordArray[j])) {
                    compWord = newWordArray[i];
                    System.out.println(compWord);
                }
            }
        }

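This only prints the words that occur more than once, in the order they were read from the file. So it does detect the repeated words. However, is there a way to get these words in the form "[WORD : timesRepeated]"?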
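One possible way to get that output without a HashSet is sketched below. Note that it is not the two-array comparison from the attempt above: it sorts a copy of the words so that equal words sit next to each other, then counts the length of each run. The class name DuplicateWordCounter, the method name countDuplicates, and the sample words are made up for illustration and are not from the original program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the names here are illustrative only.
class DuplicateWordCounter {

    // Returns one "[word : count]" entry for each word that occurs more than once.
    static List<String> countDuplicates(List<String> words) {
        // Work on a sorted copy so equal words end up next to each other.
        String[] sorted = words.toArray(new String[0]);
        Arrays.sort(sorted);

        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < sorted.length) {
            // Advance j to the end of the run of words equal to sorted[i].
            int j = i + 1;
            while (j < sorted.length && sorted[j].equals(sorted[i])) {
                j++;
            }
            int count = j - i;
            if (count > 1) {
                result.add("[" + sorted[i] + " : " + count + "]");
            }
            i = j; // Continue with the next distinct word.
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input standing in for the lower-cased words read from the file.
        List<String> words = Arrays.asList("the", "cat", "the", "dog", "cat", "the");
        for (String entry : countDuplicates(words)) {
            System.out.println(entry);
        }
    }
}
```

With the sample words this prints [cat : 2] and then [the : 3]. Because of the sort, the results come out in alphabetical order rather than in the order the words appear in the file, and each repeated word is reported exactly once instead of once per matching pair.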
