NHibernate Pitfalls: Sets and Hash Codes

This is part of a series of posts about NHibernate Pitfalls. See the entire collection here.

This is not really an NHibernate-specific problem, but, since a lot of people are having trouble, I decided to mention it.

Sets are probably the most used of all collection types in NHibernate. Before .NET 4, the BCL had no set interface (HashSet<T> arrived in .NET 3.5, but ISet<T> only in .NET 4), so NHibernate historically used Iesi Collections. The set specification says that it doesn’t allow repeated entries, but there are many ways to implement this, and one of the most used ones is Iesi.Collections.Generic.HashedSet<T>, which is also used internally (and automatically) by NHibernate in its NHibernate.Collection.Generic.PersistentGenericSet<T>. This class uses the hash code of each item to decide where to store it in an internal structure, and it assumes that the hash code will never change while the item is in the set, which is also consistent with Microsoft’s guidelines. You can see a good description on Eric Lippert’s blog: Guidelines and rules for GetHashCode.
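To make the role of the hash code more concrete, here is a toy bucket-based set. It is only a sketch of the general idea and not the actual implementation of HashedSet<T> or of any BCL class; the ToySet<T> name, the bucket count and the BucketFor helper are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Toy set for illustration only: NOT how HashedSet<T> is really implemented.
// Null items are not handled.
class ToySet<T>
{
    private readonly List<T>[] buckets = new List<T>[16];

    // The bucket is chosen purely from the item's current hash code.
    private List<T> BucketFor(T item)
    {
        int index = (item.GetHashCode() & 0x7FFFFFFF) % this.buckets.Length;
        return this.buckets[index] ?? (this.buckets[index] = new List<T>());
    }

    public bool Add(T item)
    {
        List<T> bucket = this.BucketFor(item);

        if (bucket.Contains(item))
        {
            return false; // already present, sets have no duplicates
        }

        bucket.Add(item);
        return true;
    }

    // Only the bucket that matches the current hash code is searched.
    public bool Contains(T item)
    {
        return this.BucketFor(item).Contains(item);
    }
}

static class ToySetDemo
{
    static void Main()
    {
        ToySet<string> set = new ToySet<string>();
        Console.WriteLine(set.Add("a"));      // True
        Console.WriteLine(set.Add("a"));      // False
        Console.WriteLine(set.Contains("a")); // True
    }
}
```

Since Contains only searches the bucket selected by the current hash code, an item whose hash code changed after Add may sit in a bucket that is never looked at again.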

So, what is the problem? If changes to your entity cause its hash code to change, the set implementation will look for the object in the wrong place and won’t find it, so it won’t be possible to remove the object from the collection.

Consider this code that illustrates the problem (just for demonstration purposes, of course):

    class Test
    {
        public Int32 Id
        {
            get;
            set;
        }

        private Int32 hashCode = 1;

        // Deliberately broken: the sign flips on every call,
        // so two consecutive calls never return the same value.
        public override Int32 GetHashCode()
        {
            this.hashCode *= -1;

            return (this.hashCode);
        }
    }

And some typical usage:

    // Last() is a LINQ extension method, so this needs "using System.Linq;"
    Iesi.Collections.Generic.ISet<Test> col = new Iesi.Collections.Generic.HashedSet<Test>();
    col.Add(new Test() { Id = 1 });
    col.Add(new Test() { Id = 2 });

    Test t = col.Last();
    Boolean contains = col.Contains(t);
    Boolean removed = col.Remove(t);

As you can see for yourself, the Contains and Remove methods may return false, in which case it is impossible to remove the item, even though it is there.
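A more realistic variation is a hash code that depends on a property that changes after the entity is already in the set. The sketch below uses the BCL’s System.Collections.Generic.HashSet<T> so that it runs without extra references; the Customer class and the Demo helper are hypothetical names, used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity whose hash code depends on mutable state (Name).
class Customer
{
    public String Name { get; set; }

    public override Boolean Equals(Object obj)
    {
        Customer other = obj as Customer;
        return (other != null) && (other.Name == this.Name);
    }

    public override Int32 GetHashCode()
    {
        return (this.Name ?? String.Empty).GetHashCode();
    }
}

static class Demo
{
    // Returns whether the set still finds the entity after its state changes.
    public static Boolean FoundAfterRename()
    {
        HashSet<Customer> set = new HashSet<Customer>();
        Customer customer = new Customer { Name = "Alice" };
        set.Add(customer);             // stored under the hash of "Alice"

        customer.Name = "Bob";         // the hash code changes while in the set

        return set.Contains(customer); // looked up under the hash of "Bob"
    }

    static void Main()
    {
        Console.WriteLine(Demo.FoundAfterRename());
    }
}
```

Unless the two names happen to produce the same hash code, which is very unlikely, this should print False: the entity is still in the set, but the set can no longer find it.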

There are two ways to work around this:

  • Make sure that your GetHashCode method always returns the same hash, regardless of the entity’s internal state;
  • Make modifications to your entity before adding it to the set, and never again, so that the state-dependent hash code remains the same.

One example of a GetHashCode implementation that never changes could be:

    public Int32 Id { get; set; }
    public String Name { get; set; }

    private Int32 hashCode = 0;

    public override Int32 GetHashCode()
    {
        // Once calculated, always return the cached value
        if (this.hashCode != 0)
        {
            return (this.hashCode);
        }

        unchecked
        {
            // Combine Id and Name; 397 is just a prime multiplier
            Int32 result = this.Id.GetHashCode();
            result = (result * 397) ^ (this.Name ?? String.Empty).GetHashCode();
            this.hashCode = result;
        }

        return (this.hashCode);
    }

As a side note, NHibernate 4.0 will be out in a few months. It will target .NET 4 and so will make use of the set types that now exist inside the BCL, namely System.Collections.Generic.ISet<T> and System.Collections.Generic.HashSet<T>, and Iesi Collections 4 has been refactored to use them. The problem, however, remains exactly the same.

                             
