c# - How to identify a bad implementation of GetHashCode?

Question

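I have an implementation of GetHashCode which I believe to be fairly robust. Honestly, though, I dredged it up from the depths of the internet, and while I understand what is written, I don't feel qualified to call it a "good" or a "bad" implementation of GetHashCode.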

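I've done a lot of reading on StackOverflow about GetHashCode. I think the thread "Is there a sample why Equals/GetHashCode should be overwritten in NHibernate?" is probably the best source of information, but it still left me wondering.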

Consider the following entity and its given implementation of Equals and GetHashCode:

public class Playlist : IAbstractDomainEntity
{
    public Guid Id { get; set; }
    public string Title { get; set; }
    public Stream Stream { get; set; }
    //  Use interfaces so NHibernate can inject with its own collection implementation.
    public IList<PlaylistItem> Items { get; set; }
    public PlaylistItem FirstItem { get; set; }
    public Playlist NextPlaylist { get; set; }
    public Playlist PreviousPlaylist { get; set; }

    private int? _oldHashCode;
    public override int GetHashCode()
    {
        // Once we have a hash code we'll never change it
        if (_oldHashCode.HasValue)
            return _oldHashCode.Value;

        bool thisIsTransient = Equals(Id, Guid.Empty);

        // When this instance is transient, we use the base GetHashCode()
        // and remember it, so an instance can NEVER change its hash code.
        if (thisIsTransient)
        {
            _oldHashCode = base.GetHashCode();
            return _oldHashCode.Value;
        }
        return Id.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        Playlist other = obj as Playlist;
        if (other == null)
            return false;

        // handle the case of comparing two NEW objects
        bool otherIsTransient = Equals(other.Id, Guid.Empty);
        bool thisIsTransient = Equals(Id, Guid.Empty);
        if (otherIsTransient && thisIsTransient)
            return ReferenceEquals(other, this);

        return other.Id.Equals(Id);
    }
}

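The amount of safety checking in this implementation seems over the top. It inspires confidence (assuming that the person who wrote it understood more corner cases than I do), but it also makes me wonder why I see so many simplistic implementations.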

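See all the different implementations in the thread "Why is it important to override GetHashCode when Equals method is overridden?". Below is a simple, yet highly rated, implementation: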

  public override int GetHashCode()
  {
        return string.Format("{0}_{1}_{2}", prop1, prop2, prop3).GetHashCode();
  }

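Would this implementation be better or worse than the one I provided? Why?

Are both equally valid? Is there a standard "guideline" that should be followed when implementing GetHashCode? Are there any obvious flaws in the implementation above? How can one create test cases to validate an implementation of GetHashCode?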

score 2 · Accepted Answer

GetHashCode should match the concept of "equal" for your classes/environment (in addition to being constant while the object is in a container, and being fast).

In normal cases "equal" means comparing all fields of the corresponding objects (value-type comparison). In this case a simple implementation that somehow merges the hash codes of all fields will suffice.
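As a minimal sketch of such a "merge the field hash codes" implementation: the type `Track` and its members are hypothetical names chosen for illustration, and the 17/31 multipliers are just one common convention, not a requirement.

```csharp
using System;

// Hypothetical value-style type; Track, Artist, Title and Year are illustrative names.
public sealed class Track : IEquatable<Track>
{
    public string Artist { get; }
    public string Title { get; }
    public int Year { get; }

    public Track(string artist, string title, int year)
    {
        Artist = artist;
        Title = title;
        Year = year;
    }

    // "Equal" here means: every field matches.
    public bool Equals(Track other)
    {
        if (ReferenceEquals(other, null)) return false;
        return Artist == other.Artist && Title == other.Title && Year == other.Year;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Track);
    }

    // Merge the hash codes of exactly the fields that Equals compares.
    public override int GetHashCode()
    {
        unchecked // arithmetic overflow simply wraps around
        {
            int hash = 17;
            hash = hash * 31 + (Artist == null ? 0 : Artist.GetHashCode());
            hash = hash * 31 + (Title == null ? 0 : Title.GetHashCode());
            hash = hash * 31 + Year.GetHashCode();
            return hash;
        }
    }
}
```

Compared with the `string.Format("{0}_{1}_{2}", ...)` approach from the question, this avoids allocating a string on every call. The string version also maps the values ("a_b", "c", "d") and ("a", "b_c", "d") to the same text and therefore the same hash code, which is legal but a needless collision.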

My understanding is that in NHibernate's case "equal" is significantly trickier, and as a result you see a complicated implementation. I believe it is mainly due to the fact that some object properties may not yet be available; in such a case comparing the "identity" of the objects is enough.
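One way to test whether a hash code matches the notion of "equal" is to check the basic contract directly: whenever two objects are equal, their hash codes must be equal too. The sketch below applies that check to a reduced copy of the question's pattern. `Entity`, `HashContractCheck` and the scenario are hypothetical names and assumptions made for illustration, not part of NHibernate.

```csharp
using System;

// Reduced copy of the question's pattern; Entity is a hypothetical stand-in for Playlist.
public class Entity
{
    public Guid Id { get; set; }
    private int? _oldHashCode;

    public override int GetHashCode()
    {
        if (_oldHashCode.HasValue) return _oldHashCode.Value;
        if (Id == Guid.Empty)
        {
            // Transient: remember the identity-based hash code.
            _oldHashCode = base.GetHashCode();
            return _oldHashCode.Value;
        }
        return Id.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        Entity other = obj as Entity;
        if (other == null) return false;
        if (other.Id == Guid.Empty && Id == Guid.Empty) return ReferenceEquals(other, this);
        return other.Id.Equals(Id);
    }
}

public static class HashContractCheck
{
    // The rule every GetHashCode must satisfy: equal objects report equal hash codes.
    public static bool Holds(object a, object b)
    {
        return !a.Equals(b) || a.GetHashCode() == b.GetHashCode();
    }

    // Assumed scenario: one instance is hashed while transient and saved afterwards,
    // while another instance is loaded with the same Id.
    public static bool SavedAfterHashingScenario()
    {
        Guid id = Guid.NewGuid();
        var saved = new Entity();
        saved.GetHashCode();                  // hashed while transient, e.g. by a HashSet
        saved.Id = id;                        // the Id is assigned on save
        var loaded = new Entity { Id = id };  // same row, loaded separately
        return Holds(saved, loaded);
    }
}
```

Under these assumptions `SavedAfterHashingScenario()` would be expected to return false unless the two hash codes happen to coincide: the instances compare equal by Id, but one still reports its cached identity-based hash code. Whether that matters depends on whether such pairs can ever meet in the same hash-based collection.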

score 1 · Accepted Answer

It's unfortunate that while there are two different questions that an equality-test method could meaningfully ask of any pair of object references X and Y, there's only one Equals method and one GetHashCode method.

1. Assuming X and Y are of the same type(*), will all members of X always behave the same as the corresponding members of Y? Two references to different arrays would be reported as unequal under this definition even if they contain matching elements, since even if their elements are the same at one moment in time, that may not always be true.
2. Assuming X and Y are of the same type(*), would simultaneously replacing all references to object X with references to object Y, and vice versa, affect any members of either other than an identity-based GetHashCode function? References to two distinct arrays whose elements match would be reported as equal under this definition.

(*) In general, objects of different types should report unequal. There might be some cases where one could argue that objects of different private classes which inherit from the same public class should be considered equal, if all of the code which has access to the private classes only stores references of the matching public type, but that would be at most a pretty narrow exception.

Some situations require asking the first question, and some require asking the second; the default Object implementations of Equals and GetHashCode answer the first, while the default ValueType implementations answer the second. Unfortunately, the choice of which comparison is appropriate for a given reference depends on how the reference is used, rather than on the type of the referred-to instance. If two objects hold references to collections which they will neither mutate nor expose to code that might do so, for the purpose of encapsulating the contents thereof, equality of the objects holding those references should depend upon the contents of the collections, rather than their identity.
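The difference between the two questions can be seen with the default implementations. In this small sketch, `PointClass`, `PointStruct` and `EqualityKinds` are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Linq;

public class PointClass { public int X; public int Y; }    // inherits Object.Equals
public struct PointStruct { public int X; public int Y; }  // inherits ValueType.Equals

public static class EqualityKinds
{
    // First question (identity): two distinct class instances with matching fields.
    public static bool ClassInstancesEqual()
    {
        var a = new PointClass { X = 1, Y = 2 };
        var b = new PointClass { X = 1, Y = 2 };
        return a.Equals(b); // false: different objects
    }

    // Second question (contents): two struct values with matching fields.
    public static bool StructValuesEqual()
    {
        var a = new PointStruct { X = 1, Y = 2 };
        var b = new PointStruct { X = 1, Y = 2 };
        return a.Equals(b); // true: compared field by field
    }

    // Arrays are reference types, so Equals asks the first question...
    public static bool ArraysEqualByIdentity()
    {
        int[] a = { 1, 2, 3 };
        int[] b = { 1, 2, 3 };
        return a.Equals(b); // false
    }

    // ...and the second question has to be asked explicitly.
    public static bool ArraysEqualByContent()
    {
        int[] a = { 1, 2, 3 };
        int[] b = { 1, 2, 3 };
        return a.SequenceEqual(b); // true
    }
}
```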

It looks as though the code sometimes uses instances of type Playlist in ways where the first question is more appropriate, and sometimes in ways where the second would be more appropriate. While that may be workable, I think it might be better to have a common data-holder object which can be wrapped, if necessary, by an object whose equality-check method is suitable for one use or the other (e.g. have a PlaylistData object which can be wrapped by either a MutablePlaylist or an ImmutablePlaylist). The wrapper classes could have InvalidateAndMakeImmutable or InvalidateAndMakeMutable methods which would invalidate the wrapper and return a new wrapper around the object (using wrappers would ensure that the system knows whether a given Playlist reference could be exposed to code that might mutate it).
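A rough sketch of what such wrappers might look like follows. Only the class and method names come from the description above; the fields, the string items and the single conversion direction shown are simplifying assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Shared data holder; it is never handed out directly (assumed shape).
public sealed class PlaylistData
{
    public string Title = "";
    public List<string> Items = new List<string>();
}

// Mutable wrapper: Equals/GetHashCode are not overridden, so identity is compared.
public sealed class MutablePlaylist
{
    private PlaylistData _data;

    public MutablePlaylist(string title)
    {
        _data = new PlaylistData { Title = title ?? "" };
    }

    private PlaylistData Data
    {
        get
        {
            if (_data == null) throw new InvalidOperationException("This wrapper was invalidated.");
            return _data;
        }
    }

    public void Add(string item) { Data.Items.Add(item); }

    // Gives up access to the data and hands it to a content-comparing wrapper.
    public ImmutablePlaylist InvalidateAndMakeImmutable()
    {
        PlaylistData data = Data;
        _data = null;
        return new ImmutablePlaylist(data);
    }
}

// Immutable wrapper: nothing can change the data any more, so contents are compared.
public sealed class ImmutablePlaylist
{
    private readonly PlaylistData _data;

    internal ImmutablePlaylist(PlaylistData data) { _data = data; }

    public string Title { get { return _data.Title; } }
    public int Count { get { return _data.Items.Count; } }

    public override bool Equals(object obj)
    {
        var other = obj as ImmutablePlaylist;
        return other != null
            && _data.Title == other._data.Title
            && _data.Items.SequenceEqual(other._data.Items);
    }

    public override int GetHashCode()
    {
        unchecked // arithmetic overflow simply wraps around
        {
            int hash = 17;
            hash = hash * 31 + _data.Title.GetHashCode();
            foreach (string item in _data.Items)
                hash = hash * 31 + (item == null ? 0 : item.GetHashCode());
            return hash;
        }
    }
}
```

Because the mutable wrapper drops its reference when it is converted, the contents behind an ImmutablePlaylist cannot change afterwards, which is what makes a content-based hash code safe to use as a dictionary key. An InvalidateAndMakeMutable method in the other direction is omitted here.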
