c++ - Counting how many times certain words occur in a text file in C++

Question

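I am trying to write a program that works with two different text files. One of them contains the actual text I want to analyze, and the other contains a list of words. The program should check whenever a word from the list appears in the text and count the occurrences. Here is the (non-working) code I have so far: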

#include <cstdlib>   // exit, EXIT_FAILURE, system
#include <iostream>
#include <string>
#include <fstream>

using namespace std;

int main () {

    string word1;
    string word2;
    int listHits = 0;

    ifstream data1 ("text.txt");
    if ( ! data1 ) {
        cout << "could not open file: " << "text.txt" << endl;
        exit ( EXIT_FAILURE );
    }

    ifstream data2 ("list.txt");
    if ( ! data2 ) {
        cout << "could not open file: " << "list.txt" << endl;
        exit ( EXIT_FAILURE );
    }

    while ( data1 >> word1 ) {
        while ( data2 >> word2 ) {
            if ( word1 == word2 ) {
                listHits++;
            }
        }
    }

    cout << "Your text had " << listHits << " words from the list " << endl;

    system("pause");

    return 0;
}

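If text.txt contains

Here is some text. It will be loaded into the program.

and list.txt contains

will

the expected result is 3. However, no matter what the text files contain, the program always gives me the answer 0. I have checked that the program actually manages to read the files by counting how many times it goes through the loops, and that part works.

Thanks in advance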

score 1 · Accepted Answer

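Your program goes through the "target word list" file (i.e. data2) only once. File streams are "one-way": once you have read one to the end, you need to rewind it, or it stays at the end. The inner loop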

while ( data2 >> word2 )
    ...

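will run only the first time, i.e. for the first word of data1. For the second and all subsequent words, data2 is already at the end of the file, so the code never even enters the loop.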

You should read the target words into memory and use that list in the inner loop. Better yet, put the words in a set<string> and use that set for counting.

score 1 · Accepted Answer

It seems to me that you are only ever comparing the first word of the first file against the whole second file. You do this:

    while ( data1 >> word1 ) {
        while ( data2 >> word2 ) { // <---- after this ends the first time, it will never enter again
            if ( word1 == word2 ) {
                listHits++;
            }
        }
    }

You need to "reset" data2 after the second loop finishes so that it starts reading from the beginning of the file again:

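    while ( data1 >> word1 ) {
        while ( data2 >> word2 ) {
            if ( word1 == word2 ) {
                listHits++;
            }
        }
        data2.clear();                // reset eofbit/failbit set by the failed read
        data2.seekg (0, data2.beg);   // go back to the beginning of the file
    }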
