4

为什么 C++ 集合实现为二叉树而不是哈希集,与二叉树提供的 O(log n) 相比,它可以提供 O(1) 的平均案例复杂度?

4

3 回答 3

17

因为 C++ 集合由T的比较运算符排序,这使得以可预测的方式迭代成员成为可能。如果你知道你对集合所做的只是插入、测试成员资格和/或删除元素,那么std::unordered_set自 C++11 以来就存在实现哈希集的 .

于 2013-05-14T04:31:44.690 回答
9

根据 John Nagle 的说法,从 2006 年发布到comp.lang.c++.moderated

实际原因是编写规范哈希表部分的人没有及时完成。就这样。

标准化过程就是这样。

于 2013-05-14T06:32:24.263 回答
1

There are two factors here:

  • both binary tree and hash based associative sets are useful
  • the original STL only implemented the former due to a lack of time to work through the "politics" of Standarisation; SGI, GNU, MS, boost and others have provided non-Standard hash-based versions for many years, and C++11 introduces unordered_set

These points are discussed below.

BOTH ARE USEFUL

There are lots of pros/cons, and both are available (as of C++11) to let the programmer choose. Hash-based sets were not available because there simply wasn't agreement on the implementation in time for them to be included in earlier standards.

  • set iteration is in sort order, whereas a hashed container unsorted_set is effectively randomised-order traversal
    • set supports lower_bound, upper_bound, which aren't practical to provide on an unsorted_set
  • set uses a comparison function (by default the contained type's operator<, whereas unordered_set requires a hashing function (defaults are provided for some types but they can be pretty mediocre depending on your actual keys, and quality hashing can be time consuming) and a key equality function
  • either may be faster for smaller values of N - not everyone's dealing with billions of elements so it's sensible to offer choice even from a performance perspective
  • there may be significant differences in memory usage for small objects, though I'm not sure what overall guidelines would be true across implementations so can just suggest measuring in your program if you care
  • as elements are added, existing iterators into a std::unordered_set may be invalidated (see 23.2.5p14 for when exactly), whereas iterators into a std::map are never invalidated by inserts (pointers and references to unordered_set elements remain valid though).

Why earlier C++ Standards didn't have a hash based set

From an interview with Stepanov:

Question: I find two hash table implementations in the D.Musser site, and they were both working and quite smart - much smarter than hash tables commonly found in class libraries. Why were hash tables not included into STL?

Answer: Politics. They have to be in. Our new implementation of STL does contain them. In general, we need to develop a mechanism for adding things to STL. After all, STL is an extensible framework, it begs to be extended. There are many data structures that are missing, for example, singly linked lists, matrices, and graphs. SGI is willing to lead the way in extending STL.

(full interview at http://www.stlport.org/resources/StepanovUSA.html)

于 2013-05-14T04:40:32.417 回答