A little help with GetHashCode

Implementing the GetHashCode override has proven to be a confusing issue as there are a lot of intricacies to following the rules. There have been plenty of blog posts and other discussions regarding when and how to follow the rules; I don't intend to cover that ground again. Instead, please take a look at some of these posts and discussions:

However, I thought I'd share something that I've found the following useful when implementing GetHashCode for my different types. The following static class encapsulates the algorithm recommended by Jon Skeet in this StackOverflow answer (though it can be found in other places too – it is also the algorithm used by the C# compiler when implementing GetHashCode for anonymous types, for example). This algorithm uses two prime numbers; one for the initial hash and the other as a multiplier when combining a hash code. These prime numbers are used to reduce hash code collisions, something that a basic XOR hash does not do.

My HashCode class extension methods use generics to avoid boxing for value types and also support combining the hash codes of collection elements, if this is needed.

//*****************************************************************************
//
//  Copyright © 2012 Jeff Yates
//
//*****************************************************************************
using System;
using System.Collections.Generic;

namespace SomewhatAbstract
{
    /// <summary>
    /// Provides extensions for producing hash codes to support overrides
    /// of <see cref="System.Object.GetHashCode"/>.
    /// </summary>
    public static class HashCode
    {
        /// <summary>
        /// Provides the initial value for a hash code.
        /// </summary>
        public static readonly int InitialHash = 17;
        private const int Multiplier = 37;

        /// <summary>
        /// Combines the hash code for the given value to an existing hash code
        /// and returns the hash code.
        /// </summary>
        /// <param name="code">The previous hash value.</param>
        /// <param name="value">The value to hash.</param>
        /// <returns>The new hash code.</returns>
        private static int Hash<T>(int code, T value)
        {
            int hash = 0;
            if (value != null)
            {
                hash = value.GetHashCode();
            }

            unchecked
            {
                code = (code * HashCode.Multiplier) + hash;
            }
            return code;
        }

        /// <summary>
        /// Combines the hash codes for the elements in the given sequence with an
        /// existing hash code and returns the hash code.
        /// </summary>
        /// <param name="code">The code.</param>
        /// <param name="sequence">The sequence.</param>
        /// <returns>The new hash code.</returns>
        public static int HashWithContentsOf<T>(this int code, IEnumerable<T> sequence)
        {
            foreach (T element in sequence)
            {
                code = code.HashWith(element);
            }
            return code;
        }

        /// <summary>
        /// Combines the given value's hash code with an existing hash code value.
        /// </summary>
        /// <typeparam name="T">The type of the value being hashed.</typeparam>
        /// <param name="code">The previous hash code.</param>
        /// <param name="value">The value to hash.</param>
        /// <returns>The new hash code.</returns>
        public static int HashWith<T>(this int code, T value)
        {
            return Hash(code, value);
        }
    }
}

Here is an example of how you might use these extension methods to implement an override of GetHashCode. It doesn't necessarily save any typing over just hand-implementing the algorithm, but I find it helps avoid inconsistencies and forgetfulness while giving a little clarity to the code.

public struct MyCustomType
{
    private readonly ReadOnlyCollection myListOfThings;
    private readonly bool aFlagOfSomething;
    private readonly object aThing;

    public override int GetHashCode()
    {
        return HashCode.InitialHash
            .HashWith(this.aFlagOfSomething)
            .HashWith(this.aThing)
            .HashWithContentsOf(this.myListOfThings);
    }
}

Let me know if you find this useful. Perhaps you have a different approach when implementing GetHashCode. If you do, I'd love to hear about it.

4 thoughts on “A little help with GetHashCode”

  1. Nice idea. The only thing I wasn't sure about was having an extension method on int, but thinking about that gave me an idea for an alternative approach. I dashed this off pretty rapidly and haven't tested it, so think of it as more of a sketch.

    [csharp]
    public sealed struct HashCode
    {
    private const int InitialHash = 17;
    private const int Multiplier = 37;

    private int hashValue;
    private HashCode(int value)
    {
    this.hashValue = value;
    }

    public static HashCode Of(T thing)
    {
    return new HashCode(InitialHash).And(thing);
    }

    public HashCode And(T thing)
    {
    int thingHash = (thing != null) ? thing.GetHashCode() : 0;
    return new HashCode(unchecked((this.hashValue * Multiplier) + thingHash));
    }

    public static implicit operator int(HashCode hashCode)
    {
    return hashCode.hashValue;
    }
    }

    public class Example
    {
    public int MyInt { get; set; }
    public string MyString { get; set; }

    public override int GetHashCode()
    {
    return HashCode.Of(MyInt).And(MyString);
    }
    }
    [/csharp]

    1. I like that (though you are missing the this keyword off T thing in the And method declaration.

      After reading your comment and tweet this morning, I thought of a succinct approach using LINQ.

      [csharp]
      public int GetHashCode(params object[] args)
      {
      return args.Aggregate(17, (total, next) => (next == null ? 0 : next.GetHashCode() * 37) + total);
      }
      [/csharp]

      1. Actually the And() method is just a normal instance method, so no this-specifier needed.
        I do enjoy LINQ stuff. The downside here is you lose the type information, so it would be hard to extend to cover IEnumerable values.
        Perhaps your next challenge is to dynamically create the hash function based on reflecting over properties 😉

        1. Of course, you are correct. I wasn't paying attention.

          With the LINQ solution, you definitely lose the ability to include the elements of a sequence, but I imagine there's a way of handling that – perhaps by coupling the LINQ and original approaches. I doubt a solution involving reflection would be the best choice performance-wise, but I bet something involving expression trees and meta-coding would be quite interesting.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.