2

我有一张这样的地图:

std::map<unsigned,(string,timestamp)> themap.

但是我需要通过只保留地图中最高的 1000 个时间戳来管理地图的大小。我想知道处理这个问题的最有效方法是什么?

我应该以某种方式将上面的地图复制成一个

std::map<timestamp,(string,unsigned)> - erase elements not in top 1000, then massage this map back into original?

还是其他方式?

这是我的代码。

/* map of callid (a number) to a carid (a hex string) and a timestamp (just using unsigned here)
   The map will grow through time and potentially grow to some massive amount which would use up all 
   computer memory.  So max sixe of map can be kept to 1000 (safe to do so).  Want to remove items 
   based on timestamp
*/
#include <vector>
#include <string>
#include <iostream>
#include <algorithm>
#include <map>


typedef unsigned callid;
typedef unsigned timestamp;
typedef std::string carid;
typedef std::pair<carid,timestamp> caridtime;
typedef std::map<callid,caridtime> callid2caridmap;

int main() {

   //map of callid -> (carid,timestamp)
   callid2caridmap cmap;

   //test data below
   const std::string startstring("4559023584c8");
   std::vector<carid> caridvec;
   caridvec.reserve(1000);
   for(int i = 1; i < 2001; ++i) {
      char buff[20] = {0};
      sprintf(buff, "%04u", i);
      std::string s(startstring);
      s += buff;
      caridvec.push_back(s);
   }
   //generate some made up callids
   std::vector<callid> tsvec;
   for(int i = 9999; i < 12000; ++i) {
      tsvec.push_back(i);
   }

   //populate map
   for(unsigned i = 0; i < 2000; ++i)
      cmap[tsvec[i]] = std::make_pair(caridvec[i], i+1);

   //expiry handling
   static const int MAXNUMBER = 1000;

   // what I want to do is retain top 1000 with highest timestamps and remove all other entries.  
   // But of course map is ordered by the key
   // what is best approach.  std::transform??
   // just iterate each one.  But then I don't know what my criteria for erasing is until I have
   // found the largest 1000 items
   // std::for_each(cmap.begin(), cmap.end(), cleaner);

   //nth_element seems appropriate.  Do I reverse the map and have key as timestamp, use nth_element 
   //to work out what part to erase, then un-reverse the map as before with 1000 elements
   //std::nth_element(coll.begin(), coll.begin()+MAXNUMBER, coll.end()); 
   //erase from coll.begin()+MAXNUMBER to coll.end()

   return 0;
}

更新:

这是我正在使用的解决方案。

// as map is populated also fill queue with timestamp
std::deque<timestamp> tsq;
for(unsigned i = 0; i < 2000; ++i) {
   cmap[tsvec[i]] = std::make_pair(caridvec[i], i+1);
   tsq.push_back(tsvec[i]);
}
std::cout << "initial cmap size = " << cmap.size() << std::endl;


// expire old entries
static const int MAXNUMBER = 1000;
while(tsq.size() > MAXNUMBER) {
   callid2caridmap::iterator it = cmap.find(tsq.front());
   if(it != cmap.end())
      cmap.erase(it);

   tsq.pop_front();
}

std::cout << "cmap size now = " << cmap.size() << std::endl;

但仍然对任何可能的替代方案感兴趣。

4

1 回答 1

1

为地图中的对象创建一个最大堆时间戳 -> 迭代器。堆将是 <= 1000 个项目。检查当您在堆中插入时,您有 < 1000 个项目或时间戳是否 < 堆的最大值,并在从堆中弹出项目时执行相应的工作,如果所有这些都有意义的话

于 2013-10-17T12:16:40.993 回答