java - Hashmap 以及它是如何在幕后工作的

Question

快速提问以确保我很好地理解了 Java 中的 HashMap 是如何工作的。

这是一个代码示例：

//String key = new String("key");
//String val = new String("value");
String key = "key";
String val = "value";

HashMap map = new HashMap();
map.put(key, val);

System.out.println("hashmap object created. Its key hashcode = "+key.hashCode());
// the hashcode is 106079
System.out.println("hashmap object value for key = "+map.get(key));

// Let's store using a key with same hashcode
Integer intkey = new Integer(106079);
val = "value2";
map.put(intkey, val);
System.out.println("hashmap object created. Its intkey hashcode = "+intkey.hashCode());
// this returns 106079 once again. So both key and intkey have the same hashcode

// Let's get the values
System.out.println("hashmap object value for intkey = "+map.get(intkey));
System.out.println("hashmap object value for key = "+map.get(key));

返回值与预期一致。我在幕后读到，HashMap 的工作原理如下：

获取键/值。
从密钥生成哈希码。
将键和值对象存储在存储桶中（在我的情况下存储桶编号为 106079）

第二个也是一样。

将其存储在同一个存储桶中，但由于此时这是一个 LinkedList，我想将其存储在“下一个可用分配”中。

为拿到它，为实现它：

拿起钥匙，获取哈希码，
看水桶，
然后查看 LinkedList 中第一个元素的键，
然后检查键是否传递和元素中的键匹配，如果不匹配，则继续下一个，依此类推，直到可以检索到值。

我是否正确理解了这个概念？

非常感谢！

编辑：

非常感谢，所以完成： - Java HashMap 如何在内部存储条目 - Java HashMap 如何处理具有相同哈希码的不同对象？

和：

不应该有那么多“桶”
存储桶与条目不同。存储桶是共享同一存储桶#的所有条目的逻辑集合（在 Java 案例中是键的 hashCode() 的函数）。如其他答案所述，桶溢出可以通过多种方式实现，不一定在列表中。
还有其他现有的冲突解决方案：http ://en.wikipedia.org/wiki/Hash_table#Collision_resolution
它使用 Object 的 equals 方法来比较和检索同一存储桶中的对象。

score 3 · Accepted Answer

另请注意，HashMap 可以通过多种方式实现哈希码冲突解决，而不仅仅是使用您提到的链表

Java's HashMap implementation does not only use LinkedList strategy to handle key-values with same key.hashCode() values.

Also, you may want to read this article

score 1 · Accepted Answer

是的，你的理解是正确的。请注意，单个存储桶被分配了许多哈希码：在一个新HashMap的存储桶中，总共有 16 个存储桶，每个存储桶总共分配了 2 ³² /16 = 2 ²⁸个哈希码。

score 0 · Accepted Answer

您的理解是可以的，但要考虑到有几种实现方式。HashMap 用于存储值的实际哈希码可能不是 106079。这是一个实现（java-6-openjdk）：

public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    addEntry(hash, key, value, i);
    return null;
}

请注意该hash方法，该方法包括以下内容：

/**
 * Applies a supplemental hash function to a given hashCode, which
 * defends against poor quality hash functions.  This is critical
 * because HashMap uses power-of-two length hash tables, that
 * otherwise encounter collisions for hashCodes that do not differ
 * in lower bits. Note: Null keys always map to hash 0, thus index 0.
 */
static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

所以在这个 JVM 的例子中，它没有使用 106079 作为散列，因为 HashMap 重新创建了一个散列来“强化”它。

我希望这会有所帮助

java - Hashmap 以及它是如何在幕后工作的

3 回答 3

Related

Reference