9

我正在浏览Cormen 等人的算法介绍视频,它讨论了几个散列函数。我想知道Java默认使用什么散列函数?对于用作键的不同类型的对象,散列函数实际上是否不同?Collections 框架中是否有一个 API 可以让我们编写自己的哈希算法?

4

5 回答 5

11

Each object in java has a public int hashCode() method that returns a hash. Each object is free to implement it in its own way by overriding that method. If the method is not overriden, the default Object#hashCode method is used.

You can have look at the source code of various objects to see how it is implemented in the JDK. This is String's hashCode for example (line 1494).

Some collections can add an additional layer of hashing on top of the objects' hashCode methods. For example, HashMap does that to improve performance when an object's hashCode is not well distributed.

于 2012-08-16T16:28:57.333 回答
2

You can always override it in any of your classes... Like

 @override
 public int hashCode()
 { 
 //new implementation 
 }

http://mindprod.com/jgloss/hashcode.html

The default hashCode() method uses the 32-bit internal JVM (Java Virtual Machine) address of the Object as its hashCode.

However, if the Object is moved in memory during garbage collection, the hashCode stays constant. This default hashCode is not very useful, since to look up an Object in a HashMap, you need the exact same key Object by which the key/value pair was originally filed.

Normally, when you go to look up, you don’t have the original key Object itself, just some data for a key. So, unless your key is a String, nearly always you will need to implement a hashCode and equals method on your key class.

Object.hashCode() is a native method.

于 2012-08-16T16:30:40.353 回答
1

It depends on the kind of object that you use. For any object that you implement in your own classes, you can always override the default hashCode() method.

Note, you should always obey the contract between hashCode() and equals() as mentioned in the hashCode() javadoc:

If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

For more information read this entry.

于 2012-08-16T16:28:35.330 回答
0

Every type in Java has hashCode() method defined, as it's in the Object. hashCode() returns a int. And in HashMap implementation, it hashes the result again and take only the lower bits to make it in the range of 0 to size-1. Note in Sun JDK, size is always 2x, x being some integer.

Java library is open source and you probably have a copy on your dev machine.

In Sun JDK 6, the second hash I mentioned above is

/**
 * Applies a supplemental hash function to a given hashCode, which
 * defends against poor quality hash functions.  This is critical
 * because HashMap uses power-of-two length hash tables, that
 * otherwise encounter collisions for hashCodes that do not differ
 * in lower bits. Note: Null keys always map to hash 0, thus index 0.
 */
static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

You can find the first hash by looking at the hashCode() function on the class you are interested in.

于 2012-08-16T16:30:50.453 回答
0

Java 中的所有类都继承自java.lang.Object,并且通过这样做,它们继承了hashCode()返回int. 默认方法返回一些(或多或少)由 VM 创建的唯一值(将其视为对象的内存地址,即使这并不完全正确)。当您实现自己的类时,您可以覆盖该方法来做任何您想做的事情。但是,您应该注意您的hashCodeequals方法是一致的,并且您应该知道哈希码通常不是唯一的,因此无论您做什么,都希望不同对象的哈希码之间发生冲突。

Collections 框架通常使用该hashhCode()方法对 Hashtables 等进行哈希处理。可以想象,其他库中的其他数据结构使用显式哈希函数,但在 Collections 框架中不会发生这种情况。

于 2012-08-16T16:33:10.547 回答