I am trying to read a file and then print a list to another file that shows, one per line, each word and its number of occurrences.

I have it mostly working, but it prints the numbers 1, 2, 3, 4 and 5 to the output file, and those numbers are not in the file being read.

The struct:

struct entry {
string word;
string word_uppercase;
int number_occurences;

};
//for array
entry myEntryArray[MAX_WORDS];
int addNewEntryHere=0; //next empty slot

My main calls extractTokensFromLine, which reads a line and puts its words into the array:

void extractTokensFromLine(std::string &myString) {
    const char CHAR_TO_SEARCH_FOR = ' ';
    stringstream ss(myString);
    string tempToken;
    //Extracts characters from is and stores them into str until the delimitation character delim is found
    while (getline(ss, tempToken, CHAR_TO_SEARCH_FOR)) {
        processTokenArray(tempToken);
    }   
}   
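
One thing worth checking in this tokenizer: std::getline with ' ' as the delimiter splits on that single character only. Two adjacent spaces produce an empty token, and tabs (or a trailing '\r' from Windows line endings) stay glued to the neighbouring word. The sketch below contrasts that rule with operator>>, which skips any run of whitespace. The helper names splitOnSpace and splitOnWhitespace are hypothetical and are not part of the code in the question.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: same splitting rule as extractTokensFromLine,
// i.e. std::getline with ' ' as the only delimiter.
std::vector<std::string> splitOnSpace(const std::string& line) {
    std::vector<std::string> tokens;
    std::stringstream ss(line);
    std::string token;
    while (std::getline(ss, token, ' ')) {
        tokens.push_back(token);  // empty when two spaces are adjacent
    }
    return tokens;
}

// Hypothetical alternative: operator>> treats any run of whitespace
// (spaces, tabs, '\r', '\n') as a separator and never yields empty tokens.
std::vector<std::string> splitOnWhitespace(const std::string& line) {
    std::vector<std::string> tokens;
    std::istringstream ss(line);
    std::string token;
    while (ss >> token) {
        tokens.push_back(token);
    }
    return tokens;
}
```

For the input "a  b\tc" (two spaces, then a tab), splitOnSpace returns "a", "" and "b\tc", while splitOnWhitespace returns "a", "b" and "c". This does not by itself explain where the stray numbers come from, but it does explain why the writer needs its empty-word check.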

It goes through each line word by word and puts each word into the array:

 void processTokenArray(string &token) {
    //temp uppercase for compare
    string strUpper = token;
    toUpper(strUpper);
    //see if its already there
    for (int i = 0; i < addNewEntryHere; ++i) {
        if (strUpper == myEntryArray[i].word_uppercase) {
            //yep increment count
            myEntryArray[i].number_occurences++;
            return;
        }
    }
    //nope add it
    myEntryArray[addNewEntryHere].word = token;
    myEntryArray[addNewEntryHere].number_occurences = 1;
    myEntryArray[addNewEntryHere].word_uppercase = strUpper;

    //where next place to add is
    addNewEntryHere++;
}
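
For comparison, here is a minimal self-contained sketch of the same case-insensitive counting step. It assumes a std::vector in place of the fixed-size array, so there is no MAX_WORDS limit to overrun, and it supplies its own uppercase helper because toUpper is not shown in the question. Entry, toUpperCopy and addToken are hypothetical names.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's `entry` struct.
struct Entry {
    std::string word;            // spelling as first seen
    std::string word_uppercase;  // key used for comparison
    int count;
};

// Returns an uppercase copy. The cast through unsigned char avoids
// undefined behaviour when a char holds a negative value.
std::string toUpperCopy(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::toupper(c));
    });
    return s;
}

// Increments the count of an existing entry, or appends a new one.
// Empty tokens are ignored here instead of being filtered at write time.
void addToken(std::vector<Entry>& entries, const std::string& token) {
    if (token.empty()) {
        return;
    }
    const std::string key = toUpperCopy(token);
    for (Entry& e : entries) {
        if (e.word_uppercase == key) {
            ++e.count;
            return;
        }
    }
    entries.push_back(Entry{token, key, 1});
}
```

Feeding "If", "if", "" and "butter" through addToken leaves two entries: "If" with a count of 2 and "butter" with a count of 1.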

It then writes the array to a file (each word and its number of occurrences):

bool writeArraytoFile(const std::string &OUTPUTFILENAME) {
    fstream outfile;
    if (!openFile(outfile,OUTPUTFILENAME,ios_base::out))
        return false;
    int var;
    for (var = 0; var < addNewEntryHere; ++var) {
        string word = myEntryArray[var].word;
        if(word != " " && word != "")
            outfile<<myEntryArray[var].word << " " <<IntToString(myEntryArray[var].number_occurences)<<std::endl;
    }
    closeFile(outfile);
    return true;
}

The file being read is TestData.txt:

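I think I should like a bit of butter
If its not
too much trouble, some toast as well. While you are in the kitchen, a brace of expressos for me and my man here.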

My output file (sorted using the following method):

void sortVector(sortOrder so = NUMBER_OCCURRENCES) {
    bool shouldSwap = false;
    for (int var = 0; var < addNewEntryHereV; ++var) {
        for (int var1 = var + 1; var1 < addNewEntryHereV; ++var1) {
            switch (so) {
                case ASCENDING:
                    shouldSwap = !compareAscendingV(myEntryVector, var, var1);
                    break;
                //TODO handle the following cases appropriately
                case DESCENDING:
                    shouldSwap = !compareDescendingV(myEntryVector, var, var1);
                    break;
                case NUMBER_OCCURRENCES:
                    shouldSwap = !sortbyOccurrenceV(myEntryVector, var, var1);
                    break;
                default:
                    break;
            }
            if (shouldSwap) {
                std::string tmp = myEntryVector._V.at(var);
                myEntryVector._V.at(var) = myEntryVector._V.at(var1);
                myEntryVector._V.at(var1) = tmp;
            }
        }
    }
}
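
The hand-rolled swap loop above can also be expressed with the standard library. This is a sketch only, assuming the entries live in a std::vector of a simple struct. WordCount and sortByCountDescending are hypothetical names and do not correspond to myEntryVector or the compare helpers, whose definitions are not shown.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical element type: one word and how often it occurred.
struct WordCount {
    std::string word;
    int count;
};

// Sorts by occurrence count, highest first. std::stable_sort keeps words
// with equal counts in their original relative order.
void sortByCountDescending(std::vector<WordCount>& entries) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](const WordCount& a, const WordCount& b) {
                         return a.count > b.count;
                     });
}
```

With the entries think 1, I 2, a 2, bit 1, the result is I 2, a 2, think 1, bit 1. Swapping the comparator for one that compares `word` gives the ascending and descending alphabetical orders.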

Actual output:

3
4 2
of 2
a 2
I 2
here. 1
man 1
my 1
me 1
for 1
expressos 1
brace 1
kitchen 1
1 合
1
are 1
you 1
While 1
well. 1
as 1
toast 1
some 1
trouble 1
much 1
too 1
5 1
3 1
2 1
1 1
not 1
its 1
If 1
butter 1
bit 1
like 1
should 1
think 1

Any advice would be greatly appreciated, thanks.

1 Answer

Your spec isn't clear to me in a few places, so I have guessed. This should be very close to what you are trying to do.

gcc 4.7.3: g++ -Wall -Wextra -std=c++0x word-freq.cpp

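#include <algorithm>
#include <cctype>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <utility>

typedef std::map<std::string, int> histogram_t;

// Returns a lowercase copy of s.
std::string to_lower(const std::string& s) {
  std::string r(s);
  // Go through unsigned char: passing a negative char to tolower is undefined.
  std::transform(std::begin(r), std::end(r), std::begin(r),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return r; }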

histogram_t word_freq(std::istream& is) {
  histogram_t m;
  std::string s;
  while (is >> s) { ++m[to_lower(s)]; }
  return m; }

void outAscWord(std::ostream& os, const histogram_t& m) {
  for (const auto& e : m) {
    os << e.first << " " << e.second << "\n"; } }

void outDescWord(std::ostream& os, const histogram_t& m) {
  for (auto i = m.crbegin(); i != m.crend(); ++i) {
    os << i->first << " " << i->second << "\n"; } }

template <class A, class B>
std::pair<B, A> flip_pair(const std::pair<A, B>& p) {
  return std::pair<B, A>(p.second, p.first); }

template <class A, class B>
std::multimap<B, A> flip_map(const std::map<A, B>& m) {
  std::multimap<B, A> r;
  std::transform(m.begin(), m.end(), std::inserter(r, r.begin()), flip_pair<A,B>);
  return r; }

void outAscCount(std::ostream& os, const histogram_t& m) {
  auto mm = flip_map(m);
  for (const auto& e : mm) {
    os << e.first << " " << e.second << "\n"; } }

int main() {
  // Can pass fstreams instead of iostreams if desired.
  auto m = word_freq(std::cin);
  outAscWord(std::cout, m);
  outDescWord(std::cout, m);
  outAscCount(std::cout, m);
}
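
As the comment in main notes, the counting function takes any std::istream, so a file or an in-memory string works as well as std::cin. The sketch below shows that idea on its own. countWords is a condensed copy of word_freq so the block compiles by itself, while countWordsInText and countWordsInFile are hypothetical wrappers added for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <fstream>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Condensed version of word_freq: counts lowercase words from any stream.
std::map<std::string, int> countWords(std::istream& is) {
    std::map<std::string, int> counts;
    std::string word;
    while (is >> word) {
        std::transform(word.begin(), word.end(), word.begin(),
                       [](unsigned char c) {
                           return static_cast<char>(std::tolower(c));
                       });
        ++counts[word];
    }
    return counts;
}

// Hypothetical wrapper: count words held in a string.
std::map<std::string, int> countWordsInText(const std::string& text) {
    std::istringstream in(text);
    return countWords(in);
}

// Hypothetical wrapper: count words in a file. Returns an empty map
// if the file cannot be opened.
std::map<std::string, int> countWordsInFile(const std::string& path) {
    std::ifstream in(path.c_str());
    if (!in) {
        return std::map<std::string, int>();
    }
    return countWords(in);
}
```

Calling countWordsInText("If its not if") yields three keys: "if" with 2, "its" with 1 and "not" with 1. Passing "TestData.txt" to countWordsInFile would do the same for the file from the question.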
answered 2013-09-29T03:59:59.347