Overriding .Equals() and .GetHashCode()
Ok, I have been reading some information regarding these 2 methods, and I would like someone to set me straight.
I realize there are a couple of concepts that seem key to understanding this. If your class does not override the Equals method, you are doing an identity check (do the 2 references point to the same object on the heap?). It is possible to override the Equals method so that your objects are tested for equivalence in addition to identity.
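To make the identity versus equivalence distinction concrete, here is a minimal sketch. PlainPoint and ValuePoint are hypothetical classes invented for this illustration, and the field-by-field comparison is just one reasonable way to define equivalence.

```csharp
using System;

// Hypothetical class with no Equals override: Equals is an identity check.
class PlainPoint
{
    public int X;
    public int Y;
}

// Hypothetical class that overrides Equals to compare field values.
class ValuePoint
{
    public int X;
    public int Y;

    public override bool Equals(object obj)
    {
        ValuePoint other = obj as ValuePoint;
        if (object.ReferenceEquals(other, null)) return false;
        return X == other.X && Y == other.Y;
    }

    // Overridden together with Equals so that equal objects hash alike.
    public override int GetHashCode()
    {
        unchecked { return (X * 397) ^ Y; }
    }
}

class Program
{
    static void Main()
    {
        PlainPoint a = new PlainPoint(); a.X = 1; a.Y = 2;
        PlainPoint b = new PlainPoint(); b.X = 1; b.Y = 2;
        Console.WriteLine(a.Equals(b)); // False: two different instances
        Console.WriteLine(a.Equals(a)); // True: same instance

        ValuePoint c = new ValuePoint(); c.X = 1; c.Y = 2;
        ValuePoint d = new ValuePoint(); d.X = 1; d.Y = 2;
        Console.WriteLine(c.Equals(d));                  // True: same field values
        Console.WriteLine(object.ReferenceEquals(c, d)); // False: still two instances
    }
}
```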
When I override Equals, I get a compiler warning stating that I should be overriding GetHashCode as well. In the documentation for GetHashCode there are a couple of properties that a hash function should have.
If two objects of the same type represent the same value, the hash function must return the same constant value for either object.
For the best performance, a hash function must generate a random distribution for all input.
The hash function must return exactly the same value regardless of any changes that are made to the object.
I understand the first property: you can use GetHashCode in your overridden Equals method to make it faster (you are short-circuiting the logic before going into your deep field-by-field comparison). I understand the second property because it reduces the possibility of collisions between hashes. I am unclear regarding the 3rd property, but I am assuming that it has to do with Hashtables and how they organize entries based on key hashes. If an object returned a different hash code after being added to a specific bucket, the Hashtable could lose it, leading to inefficiency.
However, it seems to me that property 1 and property 3 could collide with each other. Let's assume the following scenario... Class1 has 2 properties called x and y.
Class1 o1 = new Class1();
o1.x = 100;
o1.y = 200;
Class1 o2 = new Class1();
o2.x = 200;
o2.y = 100;
Assuming we have overridden Equals and GetHashCode, these 2 objects should return different hash codes, based on the rules we defined above. What happens when I set o2.x = 100 and o2.y = 200 and call GetHashCode again? Assuming we are using the immutable hash code rule, both of these objects should still return different hash codes, even though they are now equivalent. This seems to go against the first property. This is where my disconnect seems to be.
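The tension between property 1 and property 3 can be seen in a small sketch. It assumes a hypothetical Class1 whose hash code is computed from its mutable fields (the multiplier 31 is an arbitrary choice), so property 1 holds but property 3 does not. The last line shows the kind of trouble property 3 seems meant to prevent: a key that is changed after being added to a Hashtable.

```csharp
using System;
using System.Collections;

// Hypothetical Class1: the hash code follows the current field values.
class Class1
{
    public int x;
    public int y;

    public override bool Equals(object obj)
    {
        Class1 other = obj as Class1;
        return !object.ReferenceEquals(other, null) && x == other.x && y == other.y;
    }

    public override int GetHashCode()
    {
        unchecked { return (x * 31) + y; }
    }
}

class Program
{
    static void Main()
    {
        Class1 o1 = new Class1(); o1.x = 100; o1.y = 200;
        Class1 o2 = new Class1(); o2.x = 200; o2.y = 100;
        Console.WriteLine(o1.GetHashCode() == o2.GetHashCode()); // False: 3300 vs 6300

        Hashtable table = new Hashtable();
        table.Add(o2, "stored under o2");

        // Change the key after it has been placed in the table.
        o2.x = 100;
        o2.y = 200;

        Console.WriteLine(o1.Equals(o2));                        // True
        Console.WriteLine(o1.GetHashCode() == o2.GetHashCode()); // True: property 1 holds
        // The entry was filed under the old hash code, so this lookup
        // is expected to miss even though the key object is in the table.
        Console.WriteLine(table.ContainsKey(o2));
    }
}
```

One way to avoid the conflict altogether is to base the hash code only on fields that never change after construction, so that both properties hold at once.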
I understand overriding Equals could be useful for some of the stuff we are doing at work, but I am unsure how to proceed with the implementation of the GetHashCode method. Below I have included some code that I have been toying around with, to demonstrate my findings. Thanks to Peter Drayton for his Effective Types presentation (pdf) that opened my eyes to this.
The questions I currently am still asking and investigating are...
When should I worry about overriding Equals and GetHashCode?
Should I be doing this on most of my objects?
What is a good GetHashCode algorithm?
(This question is of particular interest to me. I decompiled System.Uri, as an example, and noticed that there is a private readonly static byte[] field called CaseInsensitiveHashCodeTable. The internal static method, CalculateCaseInsensitiveHashCode, seems to be doing some "interesting" stuff to calculate the hash code. I don't completely understand what is going on in there, so maybe it is just a lack of knowledge keeping me from understanding this whole thing.)
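As one possible starting point for the algorithm question, a commonly seen pattern combines the hash codes of the fields with a multiply-and-add inside an unchecked block. This is only a sketch and not what System.Uri does. The immutable version of Class1, the name field, the seed 17 and the multiplier 31 are all assumptions chosen for illustration.

```csharp
using System;

// Hypothetical immutable variant of Class1: the fields are readonly,
// so the hash code cannot change after construction.
class Class1
{
    private readonly int x;
    private readonly int y;
    private readonly string name;

    public Class1(int x, int y, string name)
    {
        this.x = x;
        this.y = y;
        this.name = name;
    }

    public override bool Equals(object obj)
    {
        Class1 other = obj as Class1;
        return !object.ReferenceEquals(other, null)
            && x == other.x
            && y == other.y
            && name == other.name;
    }

    public override int GetHashCode()
    {
        // unchecked lets the arithmetic wrap around instead of throwing on overflow.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + x;
            hash = hash * 31 + y;
            hash = hash * 31 + (name == null ? 0 : name.GetHashCode());
            return hash;
        }
    }
}

class Program
{
    static void Main()
    {
        Class1 a = new Class1(100, 200, "a");
        Class1 b = new Class1(100, 200, "a");
        Class1 c = new Class1(200, 100, "a");

        Console.WriteLine(a.Equals(b));                        // True
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True: equal objects, equal hashes
        Console.WriteLine(a.GetHashCode() == c.GetHashCode()); // False: field order affects the result
    }
}
```

Because each step multiplies the running value before adding the next field, swapping x and y gives a different result, which a plain x ^ y would not.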
My Findings