Paulo Morgado

.NET Development & Architecture


LINQ: Enhancing Distinct With The PredicateEqualityComparer


Today I was writing a LINQ query and needed to select distinct values based on a custom comparison criterion.

Fortunately, LINQ’s Distinct method accepts an equality comparer. Unfortunately, that sometimes means having to write a custom equality comparer.

Because I was going to need more than one equality comparer for this set of tools I was building, I decided to build a generic equality comparer that would just take a custom predicate. Something like this:

public class PredicateEqualityComparer<T> : EqualityComparer<T>
{
    private readonly Func<T, T, bool> predicate;

    public PredicateEqualityComparer(Func<T, T, bool> predicate)
        : base()
    {
        if (predicate == null)
        {
            throw new ArgumentNullException("predicate");
        }

        this.predicate = predicate;
    }

    public override bool Equals(T x, T y)
    {
        // Two nulls are equal; a null never equals a non-null value.
        if (x == null || y == null)
        {
            return x == null && y == null;
        }

        return this.predicate(x, y);
    }

    public override int GetHashCode(T obj)
    {
        // Always return the same value to force the call to IEqualityComparer<T>.Equals.
        // The cost is that every element is compared against every distinct element found so far.
        return 0;
    }
}
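
The constant hash code is what makes the predicate count. As a minimal sketch of why, the hypothetical SwitchableHashComparer<T> below (not part of the original code) can either return a constant hash code or delegate to the element's own GetHashCode. It assumes that Distinct only calls Equals for elements whose hash codes match, which is how current implementations behave.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant for illustration only: same idea as PredicateEqualityComparer<T>,
// but the hash code strategy is switchable so the two behaviours can be compared.
public class SwitchableHashComparer<T> : EqualityComparer<T>
{
    private readonly Func<T, T, bool> predicate;
    private readonly bool constantHash;

    public SwitchableHashComparer(Func<T, T, bool> predicate, bool constantHash)
    {
        this.predicate = predicate;
        this.constantHash = constantHash;
    }

    public override bool Equals(T x, T y)
    {
        if (x == null || y == null)
        {
            return x == null && y == null;
        }

        return this.predicate(x, y);
    }

    public override int GetHashCode(T obj)
    {
        // Constant hash: every element is a candidate match, so Equals always decides.
        // Element hash: elements with different hash codes are never passed to Equals.
        if (this.constantHash || obj == null)
        {
            return 0;
        }

        return obj.GetHashCode();
    }
}

public static class HashCodeDemo
{
    // Applies Distinct to 1, 3, 5 where "equal" means "same parity".
    public static List<int> DistinctByParity(bool constantHash)
    {
        var comparer = new SwitchableHashComparer<int>((x, y) => x % 2 == y % 2, constantHash);
        return new[] { 1, 3, 5 }.Distinct(comparer).ToList();
    }
}
```

With constantHash set to true, the three odd numbers collapse into a single element. With it set to false, each int hashes to its own value, the predicate is never consulted, and all three elements come back.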

Now I can write code like this:

.Distinct(new PredicateEqualityComparer<Item>((x, y) => x.Field == y.Field))

But I felt that I’d lost all the conciseness and expressiveness of LINQ, and this approach doesn’t support anonymous types, because the comparer’s type argument has to be named explicitly. So I came up with another Distinct extension method:

public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, Func<TSource, TSource, bool> predicate)
{
    return source.Distinct(new PredicateEqualityComparer<TSource>(predicate));
}

And the query is now written like this:

.Distinct((x, y) => x.Field == y.Field)

Looks a lot better, doesn’t it? And it works with anonymous types.
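
To make the anonymous type case concrete, here is a self-contained sketch that puts the pieces together. The comparer is condensed from the one above; the class names PredicateEnumerableExtensions and AnonymousTypeDemo and the sample data are assumptions made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same comparer as above, condensed.
public class PredicateEqualityComparer<T> : EqualityComparer<T>
{
    private readonly Func<T, T, bool> predicate;

    public PredicateEqualityComparer(Func<T, T, bool> predicate)
    {
        this.predicate = predicate;
    }

    public override bool Equals(T x, T y)
    {
        if (x == null || y == null)
        {
            return x == null && y == null;
        }

        return this.predicate(x, y);
    }

    // Constant hash code so that Equals decides every comparison.
    public override int GetHashCode(T obj)
    {
        return 0;
    }
}

// Extension methods must be declared in a non-generic static class; this name is hypothetical.
public static class PredicateEnumerableExtensions
{
    public static IEnumerable<TSource> Distinct<TSource>(
        this IEnumerable<TSource> source, Func<TSource, TSource, bool> predicate)
    {
        // Resolves to Enumerable.Distinct(source, comparer), not to this overload.
        return source.Distinct(new PredicateEqualityComparer<TSource>(predicate));
    }
}

public static class AnonymousTypeDemo
{
    // Builds anonymous-type elements, removes duplicates by Field, and returns the surviving names.
    public static List<string> NamesOfDistinctFields()
    {
        var items = new[]
        {
            new { Field = 1, Name = "a" },
            new { Field = 2, Name = "b" },
            new { Field = 1, Name = "c" },
        };

        // TSource is inferred, so the anonymous type never has to be named.
        return items
            .Distinct((x, y) => x.Field == y.Field)
            .Select(x => x.Name)
            .ToList();
    }
}
```

Because the compiler infers TSource from the sequence, the lambda can compare anonymous-type elements directly. In this sample the element named "c" is dropped, since its Field matches that of the earlier element named "a".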

Update: I had accidentally published the wrong version of the IEqualityComparer<T>.Equals method.

Comments

Raghuraman said:

Excellent !!! That is very nice !!! Thanks for sharing !!!

# April 7, 2010 11:19 PM


Dave Transom said:

Nice work Paulo, I was using something similar but making sure I use the hashcode of the object as well.

public static IEnumerable<T> Distinct<T>(this IEnumerable<T> source, Func<T, object> keySelector)
{
   return source.Distinct(new KeyedEqualityComparer<T>(keySelector));
}

Then you can do:

.Distinct(item => item.Field)

instead of:

.Distinct((x, y) => x.Field == y.Field)

Here's the class:

public class KeyedEqualityComparer<T> : EqualityComparer<T>
{
   Func<T, object> _keySelector;
   public KeyedEqualityComparer(Func<T, object> keySelector)
      : base()
   {
      _keySelector = keySelector;
   }

   public override bool Equals(T x, T y)
   {
      if( null == x || null == y )
         return null == x && null == y;

      return object.Equals(
         _keySelector(x),
         _keySelector(y)
      );
   }

   public override int GetHashCode(T obj)
   {
      if(null == obj)
         return 0;

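      // A null key hashes to 0 instead of throwing a NullReferenceException.
      object key = _keySelector(obj);
      return null == key ? 0 : key.GetHashCode();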
   }
}

One further enhancement would be for a string keySelector, with a StringComparison overload.

# April 8, 2010 7:04 AM

Paulo Morgado said:

Dave,

You spoiled today's post. :)

I called it SelectorEqualityComparer<TSource, TKey>.

I find these equality comparers very useful even outside of LINQ.

# April 8, 2010 9:52 AM

Dan Jensen said:

Paulo,

I've tried to implement this, but it is failing a unit test because the code is always using GetHashCode() instead of the overridden Equals() method. So it's never calling the predicate. Did you encounter anything like this? I can see how Dave's solution will work, but what if I want to compare more than one property? The predicate solution is needed in that scenario.

Thanks,

Dan

# April 8, 2010 10:13 AM

Robert said:

Is this just so that you don't have to include your own predicate with a .Where() filter?

# April 8, 2010 10:53 AM

Paulo Morgado said:

Dan,

I had a hard time understanding your comment until I saw I had published the wrong version of the code. Thanks!

In fact, it's just as you describe: Distinct compares hash codes first and only calls Equals when they match. The workaround is to always return the same hash code.

# April 8, 2010 5:47 PM

Paulo Morgado said:

Robert,

Where might return more than one item that matches the predicate.

Distinct will only return the first that matches the predicate. All subsequent matches are ignored.

# April 8, 2010 5:54 PM

Dave Transom said:

@Paulo: Aww, sorry for spoiling the next post

@Dan: the key selector option still works for multiple properties, you can return any new key or object to base the 'distinct' call on. e.g.

.Distinct(item => string.Concat(item.Field1, "|", item.Field2))

# April 8, 2010 7:26 PM

Paulo Morgado said:

Using the predicate greatly improves the readability, conciseness and expressiveness of the queries, but it can be even better. Most of the time, we don’t want to provide a comparison method but just to extract the comparison key for the elements.

So, I developed a SelectorEqualityComparer that takes a method that extracts the key value for each element.

weblogs.asp.net/.../linq-enhancing-distinct-with-the-selectorequalitycomparer.aspx

# April 8, 2010 9:43 PM