Updating dictionary keys and GetHashCode

I just came across an interesting problem while looking over a coworker's code. He was calling external code that returned a populated Dictionary. The dictionary's key is a custom class with simple int properties that implements IEquatable<T>, kind of like a composite key, but only part of each key was populated. He needed all components of the key populated, so he looped through the dictionary's items and set the missing properties on the existing key objects in place.

Later, he tried to find a value in the dictionary by calling Contains/ContainsKey, and it never found it. The keys appeared to match in the debugger; every property of the key instance in the dictionary was equal to the corresponding property of the key he was searching for. He called GetHashCode on both in the debugger, and the hash codes matched too. When he replaced the Contains call with a call to Any using the == operator, it found the key. But when he then tried to get the value using the indexer, it couldn't find it. Why could the == operator (and the Equals method) find the key, but the Contains method and the indexer could not?
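Here's a minimal sketch that reproduces those symptoms. The CompositeKey class, its PartA/PartB properties, and the hash formula are my own stand-ins, not the real code, and I use Equals inside Any rather than an overloaded == operator to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical composite key; names are illustrative, not from the original code.
public sealed class CompositeKey : IEquatable<CompositeKey>
{
    public int PartA { get; set; }
    public int PartB { get; set; }

    public bool Equals(CompositeKey other) =>
        other != null && PartA == other.PartA && PartB == other.PartB;

    public override bool Equals(object obj) => Equals(obj as CompositeKey);

    // Derived from mutable properties, so the result changes when they change.
    public override int GetHashCode() => unchecked((PartA * 397) ^ PartB);
}

public static class StaleKeyDemo
{
    // Stands in for the external code plus the loop that filled in the keys.
    public static Dictionary<CompositeKey, string> BuildAndMutate()
    {
        var map = new Dictionary<CompositeKey, string>
        {
            { new CompositeKey { PartA = 1 }, "value" }
        };
        foreach (var key in map.Keys)
        {
            key.PartB = 2; // mutates the key in place; the dictionary is not told
        }
        return map;
    }

    public static void Main()
    {
        var map = BuildAndMutate();
        var probe = new CompositeKey { PartA = 1, PartB = 2 };

        Console.WriteLine(map.Keys.First().GetHashCode() == probe.GetHashCode()); // True
        Console.WriteLine(map.Keys.Any(k => k.Equals(probe)));                     // True
        Console.WriteLine(map.ContainsKey(probe));                                 // False
        Console.WriteLine(map.TryGetValue(probe, out _));                          // False
    }
}
```

The first two checks only look at the key objects as they are now, which is why they pass. The last two go through the dictionary's own lookup, and the indexer would throw KeyNotFoundException for the same reason.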

As we talked through it and looked at how the Dictionary methods are implemented in Reference Source, I realized the problem. When the external code initially populated the Dictionary, the Dictionary got the hash code for each key, stored it alongside the entry, and used it to place the entry in the correct bucket. But when he changed a component of a key, the Dictionary had no way to know that the hash code (and thus the bucket the entry belongs in) had changed as well. Calling GetHashCode on the updated key object returns a matching value, but the hash code the Dictionary stored when the key was added is out of date and doesn't match the key being sought. The == operator works because it always looks at the current values of the key's properties, while Contains and the indexer calculate the hash code for the parameter key only and compare it against the stored one; they never recalculate it for the key already in the dictionary. In this case, it's easier & cleaner to just create a new dictionary with the fully populated keys, so I suggested he change that, and then everything worked as expected.
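A rough sketch of that fix, assuming a made-up CompositeKey class with PartA/PartB properties and a hypothetical WithPartB helper (none of these names come from the real code). Instead of touching the keys already in the dictionary, it builds new keys and adds them to a new dictionary, so every hash code is calculated from the final property values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical composite key; names are illustrative, not from the original code.
public sealed class CompositeKey : IEquatable<CompositeKey>
{
    public int PartA { get; set; }
    public int PartB { get; set; }

    public bool Equals(CompositeKey other) =>
        other != null && PartA == other.PartA && PartB == other.PartB;

    public override bool Equals(object obj) => Equals(obj as CompositeKey);

    public override int GetHashCode() => unchecked((PartA * 397) ^ PartB);
}

public static class RebuildDemo
{
    // Copies every entry into a new dictionary under a fully populated key.
    // ToDictionary adds each pair one by one, hashing the new key as it goes.
    public static Dictionary<CompositeKey, string> WithPartB(
        Dictionary<CompositeKey, string> source, int partB) =>
        source.ToDictionary(
            pair => new CompositeKey { PartA = pair.Key.PartA, PartB = partB },
            pair => pair.Value);

    public static void Main()
    {
        // Stands in for the dictionary returned by the external code.
        var partial = new Dictionary<CompositeKey, string>
        {
            { new CompositeKey { PartA = 1 }, "value" }
        };

        var complete = WithPartB(partial, 2);
        var probe = new CompositeKey { PartA = 1, PartB = 2 };

        Console.WriteLine(complete.ContainsKey(probe)); // True
        Console.WriteLine(complete[probe]);             // value
    }
}
```

The cost is a second dictionary and a second set of key objects, which is usually a fair trade for lookups that behave the way you expect.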

That's the approach I tend to use anyway unless there are specific resource, performance, or functional reasons to do otherwise, but it's interesting & fun to come across problems like this occasionally that force you to really dig into .NET code you take for granted & see how it works under the covers. Kudos to Microsoft's .NET team for their direction over the last few years towards open source, and for making tools like Reference Source available for quick lookups rather than digging into ildasm or third-party tools. Were this project on .NET Core, we could look at the code on GitHub directly.

tl;dr When you have a dictionary with composite keys, don't update those keys after adding them to the dictionary or you're asking for trouble.
