I am trying to figure out the best data structure to use for this problem. I am implementing a key-value store whose keys are strings. Values get added frequently and will generally only be looked up once or twice. Initially I used an std::map, but I found the performance to be suboptimal, as the overhead of adding keys and rebalancing the red-black tree overshadowed the decrease in search time. Currently I am using a modified singly linked list. It uses a struct that contains a C string (const char *), the length in bytes, and the value stored. When I want to find a value using a key, I iterate through the list and compare the sizes of the keys; if they match, I use memcmp to check whether the strings are identical. If they are identical, I return the value. I was able to achieve about 10x greater performance using this method over the std::map. I need to make it about 2x more efficient, however. Can anyone recommend a better type of data structure for this problem?
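For reference, the list described above might look roughly like the sketch below. The names `Node`, `push_front`, and `find` are made up for illustration, and the value type is assumed to be `int`; the real code may differ.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical node layout: key pointer, key length in bytes, stored value.
struct Node {
    const char* key;
    std::size_t len;
    int value;   // value type assumed to be int for illustration
    Node* next;
};

// Prepend in O(1); the caller owns both the key storage and the node.
inline Node* push_front(Node* head, Node* node) {
    node->next = head;
    return node;
}

// Linear scan: compare lengths first, then memcmp only on a length match.
inline const int* find(const Node* head, const char* key, std::size_t len) {
    for (const Node* n = head; n != nullptr; n = n->next) {
        if (n->len == len && std::memcmp(n->key, key, len) == 0) {
            return &n->value;
        }
    }
    return nullptr;  // key not present
}
```

Insertion is constant time, while each lookup walks the list, so the cost of a lookup grows linearly with the number of stored keys.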
4 Answers
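// Assuming that the data is stored in std::string somewhere else
struct custom_compare {
    // Compare lengths first; compare contents only when the lengths are equal,
    // so that the ordering is a valid strict weak ordering for std::map
    bool operator()( const std::string* lhs, const std::string* rhs ) const {
        if ( lhs->size() != rhs->size() ) return lhs->size() < rhs->size();
        return lhs->compare( *rhs ) < 0;
    }
};
std::map< std::string*, data, custom_compare > mymap;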
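By storing pointers instead of the actual strings, this eliminates the copying. The custom comparator is basically as fast as the one you implemented in your list, and the tree keeps its contents balanced, allowing O(log n) lookups. Depending on the size of the set, this will be an improvement over linear search if there are many elements, whereas if the set is small, linear search will be better.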
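A minimal sketch of how such a pointer-keyed map could be used, under a few assumptions: the value type `data` is taken to be `int` (the answer does not define it), and `ptr_compare`, `ptr_map`, and `lookup` are illustrative names. The strings must outlive the map, since only their addresses are stored.

```cpp
#include <map>
#include <string>

using data = int;  // assumption: the value type is not given in the answer

// Strict weak ordering on the pointed-to strings: shorter first, then by content.
struct ptr_compare {
    bool operator()(const std::string* lhs, const std::string* rhs) const {
        if (lhs->size() != rhs->size()) return lhs->size() < rhs->size();
        return lhs->compare(*rhs) < 0;
    }
};

using ptr_map = std::map<const std::string*, data, ptr_compare>;

// Look up by content: the probe string only needs to live for the call.
inline const data* lookup(const ptr_map& m, const std::string& key) {
    ptr_map::const_iterator it = m.find(&key);
    return it == m.end() ? nullptr : &it->second;
}
```

Because the comparator looks at the pointed-to strings rather than the addresses, a lookup can use any string with the same content as the stored key, not just the original object.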
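If the criterion is actually based on the content of the strings (the letters, rather than the size), then you are approaching the definition of a trie. If you can find a library that already implements one, or you are willing to spend the time to write it yourself, a trie may be one of the fastest containers for this kind of lookup, because it changes the "size" variable from the number of elements to the length of the string.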
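To make the trie idea concrete, here is a minimal sketch rather than a tuned implementation. The class name `Trie`, the `int` value type, and the use of `std::map` for child links are all assumptions chosen for brevity; a production trie would likely use a more compact child representation.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical minimal trie mapping string keys to int values.
class Trie {
    struct Node {
        std::map<char, std::unique_ptr<Node>> children;
        bool has_value = false;
        int value = 0;
    };
    Node root_;

public:
    // Walk or create one node per character, then store the value at the end.
    void insert(const std::string& key, int value) {
        Node* n = &root_;
        for (char c : key) {
            std::unique_ptr<Node>& child = n->children[c];
            if (!child) child.reset(new Node());
            n = child.get();
        }
        n->has_value = true;
        n->value = value;
    }

    // Follow one edge per character; returns nullptr if the key is absent.
    const int* find(const std::string& key) const {
        const Node* n = &root_;
        for (char c : key) {
            auto it = n->children.find(c);
            if (it == n->children.end()) return nullptr;
            n = it->second.get();
        }
        return n->has_value ? &n->value : nullptr;
    }
};
```

Both operations take a number of steps proportional to the key length, regardless of how many keys are stored. Whether that beats the list or the map in practice depends on key lengths and allocation cost, so it is worth measuring.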