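Following up on my previous question, I have a mapping of words to their counts stored in a map<string, int>. I want to invert it so that all words with the same count are grouped together. My solution is to use a vector<vector<string> >. The index into the outer vector is the count, and the inner vector is the collection of words with that count.

After reading the answers to the previous question, here is what I have been trying to get working: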

  vector<vector<string> > sorted_words;
    for (map<string, int>::const_iterator it = counters.begin();
       it != counters.end(); ++it) {
    cout << "word:" << it->first
         << " count:" << it-> second
         << " vector size: " << sorted_words.size()
         << endl;

    if (sorted_words.size() - 1 > it->second && 
        sorted_words[ it->second ].size() > 0) {
      cout << "Adding " << it->first << endl;
      sorted_words[ it->second ].push_back(it->first);
    } else {
      cout << "Creating " << it->first << endl;
      vector<string> v;
      v.push_back(it->first);
      sorted_words.resize( it->second + 1 );
      sorted_words[it->second] = v;
    }
  }

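This segfaults on the first pass through the loop, at the if statement.

What I'm trying to do is check whether the outer vector is large enough for my current value to be in bounds and, if so, whether I have already created the nested vector. (I need to do this because the map can return its entries in any order. For example, the first element could be <"foo", 3>.)

If I'm going about this in a fundamentally un-C++ way, feel free to point that out.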

4 Answers

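Quick shot in the dark: sorted_words.size() is of some unsigned type (namely size_t), so sorted_words.size() - 1 is unsigned even when it ought to be -1 (as it is initially, when the vector is empty). It wraps around to a huge value instead, so you always pass the first condition, evaluate the second half of the if condition on an element that does not exist, and crash.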
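
For what it's worth, here is a minimal sketch of one way to sidestep the wrap-around, assuming the same counters map and outer vector as in the question. It compares the count against size() directly instead of subtracting from it, and only grows the outer vector when the count is out of range. The function name group_by_count is made up for illustration and is not from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not from the question): groups words by count,
// using the count as the index into the outer vector.
std::vector<std::vector<std::string>>
group_by_count(const std::map<std::string, int>& counters) {
    std::vector<std::vector<std::string>> sorted_words;
    for (const auto& entry : counters) {
        assert(entry.second >= 0);  // a negative count cannot be an index
        const std::size_t count = static_cast<std::size_t>(entry.second);
        // No subtraction on the unsigned size, so nothing can wrap around.
        if (count >= sorted_words.size())
            sorted_words.resize(count + 1);  // existing buckets are preserved
        sorted_words[count].push_back(entry.first);
    }
    return sorted_words;
}
```

Since resize keeps the elements that are already there and fills the new slots with empty vectors, there is no need for a separate "create" branch.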

answered 2013-04-28T04:44:20.307

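For space, you would probably do better with a std::map<int, std::vector<string>>. The following fairly simple code (which could be improved by lowercasing all words and stripping punctuation) demonstrates: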

#include <algorithm>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <vector>
using namespace std;

int main(int argc, char *argv[])
{
    // expects the name of the input file as the first argument
    if (argc < 2)
        return EXIT_FAILURE;

    // map of strings to counts.
    std::map<string, int> strs;
    ifstream inf(argv[1]);
    string str;
    while (inf >> str)
        ++strs[str];

    // map of counts to strings, smallest to largest.
    std::map<int, std::vector<string>> vals;
    for (const auto& it : strs)
        vals[ it.second ].push_back(it.first);

    // report counts for each
    for (const auto& it : vals)
    {
        cout << "Count: " << it.first << ": ";
        std::copy(it.second.begin(), it.second.end(),
                  ostream_iterator<string>(cout, " "));
        cout << endl;
    }
}

Sample input

I chose a soliloquy from W. Shakespeare's As You Like It, which has some interesting properties, as you will see in a moment:

All the world's a stage,
And all the men and women merely players:
They have their exits and their entrances;
And one man in his time plays many parts,
His acts being seven ages. At first, the infant,
Mewling and puking in the nurse's arms.
And then the whining school-boy, with his satchel
And shining morning face, creeping like snail
Unwillingly to school. And then the lover,
Sighing like furnace, with a woeful ballad
Made to his mistress' eyebrow. Then a soldier,
Full of strange oaths and bearded like the pard,
Jealous in honour, sudden and quick in quarrel,
Seeking the bubble reputation
Even in the cannon's mouth. And then the justice,
In fair round belly with good capon lined,
With eyes severe and beard of formal cut,
Full of wise saws and modern instances;
And so he plays his part. The sixth age shifts
Into the lean and slipper'd pantaloon,
With spectacles on nose and pouch on side,
His youthful hose, well saved, a world too wide
For his shrunk shank; and his big manly voice,
Turning again toward childish treble, pipes
And whistles in his sound. Last scene of all,
That ends this strange eventful history,
Is second childishness and mere oblivion,
Sans teeth, sans eyes, sans taste, sans everything.

Sample output

 Count: 1: All At Even For In Into Is Jealous Last Made Mewling Sans Seeking Sighing That The Then They Turning Unwillingly acts again age ages. all all, arms. ballad beard bearded being belly big bubble cannon's capon childish childishness creeping cut, ends entrances; eventful everything. exits eyebrow. eyes eyes, face, fair first, formal furnace, good have he history, honour, hose, infant, instances; justice, lean lined, lover, man manly many men mere merely mistress' modern morning mouth. nose nurse's oaths oblivion, one pantaloon, pard, part. parts, pipes players: pouch puking quarrel, quick reputation round satchel saved, saws scene school-boy, school. second seven severe shank; shifts shining shrunk side, sixth slipper'd snail so soldier, sound. spectacles stage, sudden taste, teeth, this time too toward treble, voice, well whining whistles wide wise woeful women world world's youthful 
 Count: 2: Full His With on plays strange their to 
 Count: 3: like sans then with 
 Count: 4: a of 
 Count: 6: in 
 Count: 7: his 
 Count: 8: And 
 Count: 11: and the 

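What's interesting is how many unique word strings there are in the soliloquy. Almost as if he planned it that way. The numbers differ noticeably, however, once you account for capitalization and strip the punctuation. Thankfully, that is also simple to do and only changes the first while loop: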

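// this loop relies on <algorithm>, <cctype> and <iterator>
while (inf >> str)
{
    string alpha;
    // lowercase in place, then keep only the alphabetic characters
    for_each(str.begin(), str.end(),
            [](char& c){c=tolower(static_cast<unsigned char>(c));});
    copy_if(str.begin(), str.end(), back_inserter(alpha),
            [](const char& c){return isalpha(static_cast<unsigned char>(c));});
    if (!alpha.empty()) // skip tokens that contained no letters at all
        ++strs[alpha];
}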

This gives us the following results:

 Count: 1: acts again age ages arms at ballad beard bearded being belly big bubble cannons capon childish childishness creeping cut ends entrances even eventful everything exits eyebrow face fair first for formal furnace good have he history honour hose infant instances into is jealous justice last lean lined lover made man manly many men mere merely mewling mistress modern morning mouth nose nurses oaths oblivion one pantaloon pard part parts pipes players pouch puking quarrel quick reputation round satchel saved saws scene school schoolboy second seeking seven severe shank shifts shining shrunk side sighing sixth slipperd snail so soldier sound spectacles stage sudden taste teeth that they this time too toward treble turning unwillingly voice well whining whistles wide wise woeful women world worlds youthful 
 Count: 2: eyes full on plays strange their to 
 Count: 3: all like 
 Count: 4: a of sans then 
 Count: 5: with 
 Count: 7: in 
 Count: 9: his 
 Count: 12: the 
 Count: 19: and 

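Still, pretty impressive, Billy.

Because the first map is sorted, you get the added bonus of each resulting word list coming out in alphabetical order. Score one for bonus features.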
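
To make the ordering point concrete, here is a small sketch, assuming the same string-to-count map as above. Iterating a std::map visits keys in ascending order, so every bucket is filled in that order. The helper name buckets_are_sorted is invented for this illustration.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: builds the inverted map and reports whether every
// bucket ended up in ascending std::string order.
bool buckets_are_sorted(const std::map<std::string, int>& strs) {
    std::map<int, std::vector<std::string>> vals;
    for (const auto& kv : strs)  // visits the words in ascending key order
        vals[kv.second].push_back(kv.first);

    for (const auto& kv : vals)
        if (!std::is_sorted(kv.second.begin(), kv.second.end()))
            return false;
    return true;
}
```

Note that the order is the plain std::string comparison, so capitalized words sort ahead of lowercase ones, which is what the first sample output shows.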

answered 2013-04-28T04:49:38.657

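If you want to invert a map<K, V>, use a map<V, vector<K>>. If you don't care about the actual counts at that point, you can efficiently build a vector<vector<K>> by moving out of the intermediate map. For example: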

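#include <map>
#include <string>
#include <utility>
#include <vector>
using namespace std;

vector<vector<string>> invert(const map<string, int>& input) {

  // group the words by count; the map keeps the counts in ascending order
  map<int, vector<string>> inverse;
  for (const auto& pair : input)
    inverse[pair.second].push_back(pair.first);

  // move each word list out of the intermediate map instead of copying it
  vector<vector<string>> result;
  result.reserve(inverse.size());
  for (auto& pair : inverse)
    result.push_back(std::move(pair.second));

  return result;

}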
answered 2013-04-28T04:54:10.767

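Instead of using a vector<vector<string>> or an inverted map like map<int, vector>, you can make use of the map you already have (map<string, int>) and create a vector<vector<int>> that collects the indices of all words with the same count. This will save you a lot of space. Also, whenever you modify the original map<string, int>, all you need to do is update the indices in the vector<vector<int>>, which will be faster than the other two approaches.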
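
A std::map has no positional indices of its own, so the sketch below has to assume one: a word's index is its position in the map's sorted iteration order, recorded in a side vector of iterators. The CountIndex struct and build_index function are hypothetical names for this illustration only, and this is just one possible reading of the idea.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Assumption: "index" means position in the map's iteration order.
struct CountIndex {
    // index -> entry in the original map (no strings are copied)
    std::vector<std::map<std::string, int>::const_iterator> words;
    // count -> indices of all words that have that count
    std::vector<std::vector<int>> by_count;
};

CountIndex build_index(const std::map<std::string, int>& counters) {
    CountIndex idx;
    idx.words.reserve(counters.size());
    for (auto it = counters.begin(); it != counters.end(); ++it) {
        if (it->second < 0)
            continue;  // a negative count cannot be used as a bucket index
        const std::size_t count = static_cast<std::size_t>(it->second);
        if (count >= idx.by_count.size())
            idx.by_count.resize(count + 1);
        idx.by_count[count].push_back(static_cast<int>(idx.words.size()));
        idx.words.push_back(it);
    }
    return idx;
}
```

The stored iterators stay valid only while the original map is alive and the entries they point to are not erased, so the index has to be rebuilt or patched after such changes.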

answered 2013-04-28T05:02:43.960