
I have been experimenting with the following implementation of equals & hashCode:

@Override
public boolean equals(Object obj) {
    return obj != null && hashCode() == obj.hashCode();
}

@Override
public int hashCode() {
    return new HashCodeBuilder().append(myField1).append(myField2).toHashCode();
}

Basically, I am expecting equals to return true for any two objects that have the same hashCode, which comes down to the values of the fields I use to generate the hash code.

I am aware that this also returns true for different classes that happen to have equal values in these fields.

Question: What are the pitfalls of such an implementation?


3 Answers

8

Hash collisions. Instances with different field values may have matching hashcodes, and therefore compare equal. I'm not sure why this would be useful.
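To make the collision concrete, here is a minimal sketch of a hypothetical class following the question's pattern (using `java.util.Objects.hash` in place of Commons Lang's `HashCodeBuilder`, so it runs with no extra dependency). The strings "Aa" and "BB" are a well-known `String.hashCode` collision, so two objects with clearly different data compare equal:

```java
import java.util.Objects;

class Point {
    private final String name;

    Point(String name) { this.name = name; }

    @Override
    public int hashCode() {
        // Same idea as the question's HashCodeBuilder-based hash
        return Objects.hash(name);
    }

    @Override
    public boolean equals(Object obj) {
        // The questionable implementation: equality delegated to hashCode
        return obj != null && hashCode() == obj.hashCode();
    }

    public static void main(String[] args) {
        Point p1 = new Point("Aa");
        Point p2 = new Point("BB");
        // "Aa" and "BB" both hash to 2112, so the combined hashes collide
        System.out.println(p1.hashCode() == p2.hashCode()); // true
        System.out.println(p1.equals(p2)); // true, although the fields differ
    }
}
```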

Answered 2013-06-11T13:27:39.613
1

As Oli said, you ensure that two objects with the same data match, but non-matching objects that happen to share a hashCode will also match. Remember that the hash code is used to bucket elements in hash tables to make lookups fast, not to compare objects. To implement equals correctly, you should compare the object's significant fields directly, for example:

@Override
public boolean equals(Object obj) {
    if (obj instanceof THISOBJECT) {
        THISOBJECT other = (THISOBJECT) obj;
        return getID().equals(other.getID());
    }
    return false;
}
Answered 2013-06-11T14:53:59.903
-1

If testing two objects for equality would be expensive, and the hash codes of the objects are already known, it may be helpful to compare the hash codes as a first step toward testing equality. If the hash codes are not equal, there's no need to look any further. If they are equal, then examine things in more detail. Suppose, for example, that one had many 100,000-character strings which happened to differ only in the last ten characters (but there was no reason to expect that to be the case). Even with a 1% false-match rate among hash codes, checking hash codes before comparing the string contents in detail could offer a nearly 100-fold speedup versus repeatedly examining the first 99,990 characters of every string.
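The pattern described above can be sketched as follows. The class name and fields here are illustrative, not from the question; the idea is simply to cache the hash once and use it as a cheap rejection test before the expensive full comparison:

```java
class BigText {
    private final String data;
    private final int hash; // computed once, up front

    BigText(String data) {
        this.data = data;
        this.hash = data.hashCode();
    }

    @Override
    public int hashCode() { return hash; }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof BigText)) return false;
        BigText other = (BigText) obj;
        // Cheap pre-check: different hashes guarantee inequality,
        // so most non-equal pairs are rejected without touching the data
        if (hash != other.hash) return false;
        // Hashes match: fall back to the full (expensive) comparison
        return data.equals(other.data);
    }
}
```

Note that unlike the question's version, a hash match here is only a hint: the full comparison still decides equality, so false hash matches cost time but never produce a wrong answer.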

The goal of a hash code is generally not to be unique, but rather to keep the cost of comparisons triggered by false hash matches in the same ballpark as the cost of computing the hash codes themselves. If a given hash algorithm generates so many false matches that the time spent processing them dominates the time spent computing hash codes, then spending more time computing better hash codes may be worthwhile. Conversely, if the hash algorithm is so effective that computing hash codes dominates the time spent on false matches, it may be better to use a faster hashing algorithm even if that increases the number of false matches.

Answered 2013-06-11T18:46:08.137