LINQ: Enhancing Distinct With The PredicateEqualityComparer

LINQ With C# (Portuguese)

Today I was writing a LINQ query and needed to select distinct values based on a comparison criterion.

Fortunately, LINQ’s Distinct method accepts an equality comparer. Unfortunately, that sometimes means having to write a custom equality comparer.

Because I was going to need more than one equality comparer for this set of tools I was building, I decided to build a generic equality comparer that would just take a custom predicate. Something like this:

public class PredicateEqualityComparer<T> : EqualityComparer<T>
{
    private Func<T, T, bool> predicate;

    public PredicateEqualityComparer(Func<T, T, bool> predicate)
        : base()
    {
        this.predicate = predicate;
    }

    public override bool Equals(T x, T y)
    {
        if (x != null)
        {
            return ((y != null) && this.predicate(x, y));
        }

        if (y != null)
        {
            return false;
        }

        return true;
    }

    public override int GetHashCode(T obj)
    {
        // Always return the same value to force the call to IEqualityComparer<T>.Equals
        return 0;
    }
}

Now I can write code like this:

.Distinct(new PredicateEqualityComparer<Item>((x, y) => x.Field == y.Field))
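For a fuller picture, here is a minimal self-contained sketch of that call. The `Item` class, its sample data and the `DistinctByPredicateDemo` wrapper are hypothetical, made up for illustration; the comparer is a condensed copy of the one above. One thing to keep in mind: because `GetHashCode` always returns 0, Distinct cannot separate elements by hash and has to run the predicate against the elements it has already kept, so this is probably best suited to small sequences.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Condensed copy of the comparer from the post.
public class PredicateEqualityComparer<T> : EqualityComparer<T>
{
    private readonly Func<T, T, bool> predicate;

    public PredicateEqualityComparer(Func<T, T, bool> predicate)
    {
        this.predicate = predicate;
    }

    public override bool Equals(T x, T y)
    {
        // Two nulls are equal; a null never equals a non-null value.
        if (x == null || y == null)
        {
            return x == null && y == null;
        }

        return this.predicate(x, y);
    }

    // Constant hash code: every element collides, so Equals decides.
    public override int GetHashCode(T obj)
    {
        return 0;
    }
}

// Hypothetical element type, for illustration only.
public class Item
{
    public int Id { get; set; }
    public string Field { get; set; }
}

public static class DistinctByPredicateDemo
{
    public static List<Item> Run()
    {
        var items = new List<Item>
        {
            new Item { Id = 1, Field = "a" },
            new Item { Id = 2, Field = "b" },
            new Item { Id = 3, Field = "a" }, // same Field as Id 1
        };

        // Only one item per distinct Field value survives.
        return items
            .Distinct(new PredicateEqualityComparer<Item>((x, y) => x.Field == y.Field))
            .ToList();
    }
}
```

Calling `DistinctByPredicateDemo.Run()` yields two items, one for each distinct `Field` value.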

But I felt that I’d lost the conciseness and expressiveness of LINQ, and this form doesn’t support anonymous types, because the element type has to be written out as the type argument. So I came up with another Distinct extension method:

public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, Func<TSource, TSource, bool> predicate)
{
    return source.Distinct(new PredicateEqualityComparer<TSource>(predicate));
}

And the query is now written like this:

.Distinct((x, y) => x.Field == y.Field)

Looks a lot better, doesn’t it? And it works with anonymous types.

Update: I had accidentally published the wrong version of the IEqualityComparer<T>.Equals method.
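Here is a small sketch of the anonymous-type case, under a few assumptions: the class names `PredicateDistinctExtensions` and `AnonymousTypeDemo` and the sample data are invented for the example, and the comparer is again a condensed copy. The point is that `TSource` is inferred from the sequence, so the anonymous type never has to be named.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Condensed copy of the comparer from the post.
public class PredicateEqualityComparer<T> : EqualityComparer<T>
{
    private readonly Func<T, T, bool> predicate;

    public PredicateEqualityComparer(Func<T, T, bool> predicate)
    {
        this.predicate = predicate;
    }

    public override bool Equals(T x, T y)
    {
        if (x == null || y == null)
        {
            return x == null && y == null;
        }

        return this.predicate(x, y);
    }

    public override int GetHashCode(T obj)
    {
        return 0;
    }
}

// Hypothetical host class for the extension method from the post.
public static class PredicateDistinctExtensions
{
    public static IEnumerable<TSource> Distinct<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, TSource, bool> predicate)
    {
        // Binds to Enumerable.Distinct(source, comparer), not to this method.
        return source.Distinct(new PredicateEqualityComparer<TSource>(predicate));
    }
}

public static class AnonymousTypeDemo
{
    public static List<string> Run()
    {
        // The element type is anonymous, so it cannot be written
        // as a type argument; the compiler infers it instead.
        var people = new[]
        {
            new { Name = "Ana", City = "Lisboa" },
            new { Name = "Rui", City = "Porto" },
            new { Name = "Eva", City = "Lisboa" },
        };

        return people
            .Distinct((x, y) => x.City == y.City)
            .Select(p => p.City)
            .OrderBy(c => c)
            .ToList();
    }
}
```

`AnonymousTypeDemo.Run()` returns the two cities, "Lisboa" and "Porto", each once.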

9 Comments

  • Excellent !!! That is very nice !!! Thanks for sharing !!!

  • Nice work Paulo, I was using something similar but making sure I use the hashcode of the object as well.

    public static IEnumerable<T> Distinct<T>(this IEnumerable<T> source, Func<T, object> keySelector)
    {
        return source.Distinct(new KeyedEqualityComparer<T>(keySelector));
    }

    Then you can do:

    .Distinct(item => item.Field)

    instead of:

    .Distinct((x, y) => x.Field == y.Field)

    Here's the class:

    public class KeyedEqualityComparer<T> : EqualityComparer<T>
    {
        Func<T, object> _keySelector;

        public KeyedEqualityComparer(Func<T, object> keySelector)
            : base()
        {
            _keySelector = keySelector;
        }

        public override bool Equals(T x, T y)
        {
            if (null == x || null == y)
                return null == x && null == y;

            return object.Equals(
                _keySelector(x),
                _keySelector(y));
        }

        public override int GetHashCode(T obj)
        {
            if (null == obj)
                return 0;

            return _keySelector(obj).GetHashCode();
        }
    }

    One further enhancement would be for a string keySelector, with a StringComparison overload.

  • Dave,

    You spoiled today's post. :)

    I called it SelectorEqualityComparer.

    I find these equality comparers very useful even outside of LINQ.

  • Paulo,
    I've tried to implement this, but it is failing a unit test because the code is always using GetHashCode() instead of the overridden Equals() method. So it's never calling the predicate. Did you encounter anything like this? I can see how Dave's solution will work, but what if I want to compare more than one property? The predicate solution is needed in that scenario.
    Thanks,
    Dan

  • Is this just so that you don't have to include your own predicate with a .Where() filter?

  • Dan,

    I had a hard time understanding your comment until I saw I had published the wrong version of the code. Thanks!

    In fact, it's like you mention. The workaround is to always return the same hash code.

  • Robert,

    Where might return more than one item that matches the predicate.

    Distinct will only return the first that matches the predicate. All subsequent matches are ignored.

  • @Paulo: Aww, sorry for spoiling the next post

    @Dan: the key selector option still works for multiple properties, you can return any new key or object to base the 'distinct' call on. e.g.

    .Distinct(item => string.Concat(item.Field1, "|", item.Field2))

  • Using the predicate greatly improves the readability, conciseness and expressiveness of the queries, but it can be even better. Most of the time, we don’t want to provide a comparison method but just to extract the comparison key for the elements.
    So, I developed a SelectorEqualityComparer that takes a method that extracts the key value for each element.
    weblogs.asp.net/.../linq-enhancing-distinct-with-the-selectorequalitycomparer.aspx
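
On the multiple-property question raised in the comments, one option worth considering is an anonymous type as the key instead of a concatenated string. Anonymous types compare and hash by their property values, so two different field pairs cannot collapse into the same key the way they can once a separator is involved. The sketch below assumes a key-selector comparer along the lines of the KeyedEqualityComparer from the comments; the `Row` class, the `CompositeKeyDemo` wrapper and the sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Key-selector comparer modeled on the KeyedEqualityComparer in the comments.
public class KeyedEqualityComparer<T> : EqualityComparer<T>
{
    private readonly Func<T, object> keySelector;

    public KeyedEqualityComparer(Func<T, object> keySelector)
    {
        this.keySelector = keySelector;
    }

    public override bool Equals(T x, T y)
    {
        if (x == null || y == null)
        {
            return x == null && y == null;
        }

        return object.Equals(this.keySelector(x), this.keySelector(y));
    }

    public override int GetHashCode(T obj)
    {
        if (obj == null)
        {
            return 0;
        }

        // Guard against a null key as well as a null element.
        object key = this.keySelector(obj);
        return key == null ? 0 : key.GetHashCode();
    }
}

// Hypothetical element type, for illustration only.
public class Row
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
}

public static class CompositeKeyDemo
{
    private static Row[] Rows()
    {
        return new[]
        {
            new Row { Field1 = "a|", Field2 = "b" },
            new Row { Field1 = "a", Field2 = "|b" },  // concatenates to the same "a||b"
            new Row { Field1 = "a|", Field2 = "b" },  // real duplicate of the first row
        };
    }

    // Anonymous-type key: compared property by property.
    public static int CountWithAnonymousKey()
    {
        return Rows()
            .Distinct(new KeyedEqualityComparer<Row>(r => new { r.Field1, r.Field2 }))
            .Count();
    }

    // Concatenated string key: the first two rows share one key.
    public static int CountWithConcatenatedKey()
    {
        return Rows()
            .Distinct(new KeyedEqualityComparer<Row>(r => string.Concat(r.Field1, "|", r.Field2)))
            .Count();
    }
}
```

With this sample data, the anonymous-type key keeps two rows, while the concatenated key keeps only one because the first two rows both produce "a||b".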
