java - LinkedHashSet does not remove duplicate sentences from an ArrayList

Question

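I am building an Android/Java program that reads a text file and stores each sentence of the file in an ArrayList. It then checks the occurrences of each word in the sentences and prints every sentence that contains a repeated word.

This is the code I use to print the final result: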

    protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.text4);
    text = (TextView)findViewById(R.id.info2);
    BufferedReader reader = null;

    try {
        reader = new BufferedReader(
                new InputStreamReader(getAssets().open("input3.txt")));

        String line;

        List<String> sentences = new ArrayList<String>();

        for ( String line2; (line2 = reader.readLine()) != null;) {

            for (String sentence : line2.split("(?<=[.?!\t])")) {
                sentence = sentence.trim();
                if (! sentence.isEmpty()) {
                    sentences.add(sentence);
                }                   
            }  

            String[] keys = line2.split(" ");
            String[] uniqueKeys;

            int count = 0;
            uniqueKeys = getUniqueKeys(keys);

            for(String key: uniqueKeys)
            {
                if(null == key)
                {
                    break;
                }           
                for(String s : keys)
                {
                    if(key.equals(s))
                    {
                        count++;
                    }               
                }

                if(key.equals("a") || key.equals("the")|| key.equals("is")|| key.equals("of")|| key.equals("and")|| key.equals("The") || key.equals("some") || key.equals("on") || key.equals("during") || key.equals("to") || key.equals("since") || key.equals("in") || key.equals("by") || key.equals("for") || key.equals("were") ||key.equals("--") || key.equals("in") || key.equals("as") || key.equals("that") || key.equals("may") || key.equals("can") || key.equals("without") || key.equals("You")){
                    count = 0;
                }

                if(count >1 ){

                    MyKey = key;


                    Pattern word = Pattern.compile("\\b"+key+"\\b", Pattern.CASE_INSENSITIVE);

                    //sentences is the ArrayList of sentences in this program
                    LinkedHashSet<String> lhs = new LinkedHashSet<String>();
                    for (String sentence : sentences) {
                        //checks the occurrence of the keyword within each sentence
                        if (word.matcher(sentence).find()) {
                            lhs.add(sentence);
                        }
                    }
                    for (String sentence2 : lhs) {
                        text.append(sentence2);
                    }
                }
                count=0;
            }   


        }


    } catch (IOException e) {
         Toast.makeText(getApplicationContext(),"Error reading file!",Toast.LENGTH_LONG).show();
         e.printStackTrace();
    }finally {
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                //log the exception
            }            

        }

    }
    }

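My program first reads a text file and stores each sentence from it in an ArrayList called "sentences".
It then reads every word in the text file and stores each word that occurs more than once in an array list called "key".
Then it checks whether each sentence contains "key", and if it does, adds that sentence to a LinkedHashSet called "lhs".
Finally, it should display all the sentences in the LinkedHashSet on the output screen.

In this case, the values of my "key" are "rate", "states" and "government".

However, my text file contains this sentence: "13 states reported unemployment rates higher than the current national unemployment rate."

As you can see, it contains "states" and "rate", which are two of my keywords.

When I run the program, this particular sentence appears twice on the output screen, because the program searches for each "key" separately and so treats the two matches as two different sentences.

That is why I used a LinkedHashSet to prevent this, but the sentence is still displayed twice on the output screen.

How should I fix this?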

score 0 · Accepted Answer

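You are creating a new LinkedHashSet instance every time a keyword is processed, so each set only ever holds the matches for a single keyword and cannot remove a sentence that was already printed for an earlier keyword.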
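A minimal, self-contained sketch of that effect, assuming the same loop structure as in the question. The class and method names (`PerKeywordSetDemo`, `collectWithFreshSetPerKeyword`) are hypothetical, a plain list stands in for the `TextView`, and the sample sentence is an English rendering of the one in the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.regex.Pattern;

public class PerKeywordSetDemo {

    // Mirrors the question's structure: a fresh, empty set is built for each keyword.
    static List<String> collectWithFreshSetPerKeyword(List<String> sentences, List<String> keywords) {
        List<String> output = new ArrayList<>(); // stands in for text.append(...)
        for (String key : keywords) {
            Pattern word = Pattern.compile("\\b" + Pattern.quote(key) + "\\b", Pattern.CASE_INSENSITIVE);
            LinkedHashSet<String> lhs = new LinkedHashSet<>(); // knows nothing about earlier keywords
            for (String sentence : sentences) {
                if (word.matcher(sentence).find()) {
                    lhs.add(sentence);
                }
            }
            output.addAll(lhs); // the same sentence can be appended once per keyword
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "13 states reported unemployment rates higher than the current national unemployment rate.");
        List<String> keywords = Arrays.asList("rate", "states");
        // Prints 2: the single sentence is emitted once for "rate" and once for "states".
        System.out.println(collectWithFreshSetPerKeyword(sentences, keywords).size());
    }
}
```

The set does its job within one keyword, but it is thrown away before the next keyword is checked, which is why the duplicate survives.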

Try this:

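    // Create the set once, before the loops over lines and keywords,
    // so that every keyword adds to the same set.
    LinkedHashSet<String> lhs = new LinkedHashSet<String>();

    // Inside the keyword loop, where `word` is the compiled pattern;
    // sentences is the ArrayList of sentences in this program.
    for (String sentence : sentences) {
        // checks the occurrence of the keyword within each sentence
        if (word.matcher(sentence).find()) {
            lhs.add(sentence);
        }
    }

    // After all keywords have been processed, display the final result once
    StringBuilder output = new StringBuilder();
    for (String sentence2 : lhs) {
        output.append(sentence2);
    }
    text.setText(output);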
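For reference, here is a runnable sketch of the shared-set approach outside Android. It is only an illustration under stated assumptions: `SharedSetDemo` and `collectWithSharedSet` are hypothetical names, the keywords are passed in as a ready-made list instead of being computed from the file, and the sample sentences are made up apart from the one taken from the question.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

public class SharedSetDemo {

    // One set is shared by all keywords, so a sentence that matches
    // several keywords is stored only once, in first-seen order.
    static Set<String> collectWithSharedSet(List<String> sentences, List<String> keywords) {
        Set<String> lhs = new LinkedHashSet<>();
        for (String key : keywords) {
            // Pattern.quote guards against keywords containing regex metacharacters.
            Pattern word = Pattern.compile("\\b" + Pattern.quote(key) + "\\b", Pattern.CASE_INSENSITIVE);
            for (String sentence : sentences) {
                if (word.matcher(sentence).find()) {
                    lhs.add(sentence); // add() is a no-op if the sentence is already present
                }
            }
        }
        return lhs;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "13 states reported unemployment rates higher than the current national unemployment rate.",
                "The government published new figures.",
                "Nothing relevant here.");
        List<String> keywords = Arrays.asList("rate", "states", "government");
        // Output happens once, after every keyword has been checked.
        for (String sentence : collectWithSharedSet(sentences, keywords)) {
            System.out.println(sentence);
        }
    }
}
```

In the Android code this corresponds to declaring `lhs` before the `for` loop that reads lines and moving the `text.append(...)` loop to after that loop has finished.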
