
A Generic IEqualityComparer for Linq Distinct()

15 Jul 2010, CPOL, 3 min read
An implementation of IEqualityComparer that can be used to compare any class by one of its properties

Introduction

I love Linq and I find myself using it more and more, but I am always mildly annoyed every time I (re)discover that I can't do a Distinct filter on a property of the class in my collection. For example, suppose I have a list of Contact objects and I want to extract from it a distinct list of Contacts based on their email address. The parameter-less Distinct() method compares Contact objects using the default equality comparer, and there is no quick way to specify that I want to compare them by email address instead. This article describes a generic implementation of IEqualityComparer that Distinct() can use to compare objects of any class based on one property of that class.

Background

This article assumes that you have a general understanding of LINQ extensions for .NET collections. Also, bear in mind here that this article is discussing Linq operating on in-memory objects, not Linq to SQL or Linq to Entities or anything else like that.

The Problem

First, let's look at our sample Contact class:

C#
public class Contact
{
    public string Name {get; set;}
    public string EmailAddress { get; set; }
}

Nothing fancy there, just a class with some basic properties. The problem we want to solve is this: given a list of Contact objects where some contacts share the same email address, we want to get a list of contacts with distinct email addresses by doing something like this:

C#
IEnumerable<Contact> collection = //retrieve list of Contacts here
IEnumerable<Contact> distinctEmails = collection.Distinct();

But if we do this, Distinct will compare Contact objects using the default equality comparer, which compares them by reference because Contact does not override Equals. In this case, Distinct will return all of the Contacts in our original collection (assuming they are all unique instances).
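To make the problem concrete, here is a minimal sketch. The sample names and addresses, and the DefaultDistinctDemo class, are made up for illustration. Two separate Contact instances share an email address, yet the parameter-less Distinct() keeps both of them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Contact
{
    public string Name { get; set; }
    public string EmailAddress { get; set; }
}

public static class DefaultDistinctDemo
{
    //hypothetical sample data: two separate instances share one email address
    public static List<Contact> BuildSample()
    {
        return new List<Contact>
        {
            new Contact { Name = "Ann", EmailAddress = "ann@example.com" },
            new Contact { Name = "Ann B.", EmailAddress = "ann@example.com" },
            new Contact { Name = "Bob", EmailAddress = "bob@example.com" }
        };
    }

    public static void Main()
    {
        List<Contact> collection = BuildSample();

        //Contact does not override Equals, so the default comparer
        //treats every instance as different from every other instance
        int distinctCount = collection.Distinct().Count();

        Console.WriteLine(distinctCount); //3: nothing was filtered out
    }
}
```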

Solution 1: Override Default Equality Comparer

One way to get Linq to operate on the EmailAddress property would be to override the Equals and GetHashCode methods of the Contact class so that they use the EmailAddress property. The parameter-less Distinct() method would then use your overrides. Besides the fact that this approach has subtle complications that make it tricky, you might not always want to compare Contact objects based on EmailAddress. You might also sometimes compare them based on Name. So overriding Equals may not be the best solution.
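For reference, a rough sketch of what Solution 1 could look like is shown here. This is one possible way to write the overrides, not code from the article's download, and the OverrideDemo class and sample data are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Contact
{
    public string Name { get; set; }
    public string EmailAddress { get; set; }

    //two contacts are considered equal when their email addresses are equal
    public override bool Equals(object obj)
    {
        Contact other = obj as Contact;
        if (other == null) return false;
        return string.Equals(EmailAddress, other.EmailAddress);
    }

    //must agree with Equals: equal contacts have to produce equal hash codes
    public override int GetHashCode()
    {
        return EmailAddress == null ? 0 : EmailAddress.GetHashCode();
    }
}

public static class OverrideDemo
{
    //hypothetical sample data with one duplicated email address
    public static int CountDistinct()
    {
        var collection = new List<Contact>
        {
            new Contact { Name = "Ann", EmailAddress = "ann@example.com" },
            new Contact { Name = "Ann B.", EmailAddress = "ann@example.com" },
            new Contact { Name = "Bob", EmailAddress = "bob@example.com" }
        };

        //the parameter-less Distinct() now picks up the overrides
        return collection.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct()); //2
    }
}
```

One of the complications with this approach is that EmailAddress is mutable. If it changes while a Contact is stored in a HashSet or used as a Dictionary key, the hash code changes too and the object may no longer be found.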

Solution 2: Implement IEqualityComparer<Contact>

The Distinct() method also has an overload which allows you to specify an IEqualityComparer implementation. So, another solution is to write a class that implements IEqualityComparer<Contact> and performs the comparison based on the EmailAddress property.

To do this, we have to create our comparer class:

C#
class ContactEmailComparer : IEqualityComparer<Contact>
{
    #region IEqualityComparer<Contact> Members

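    public bool Equals(Contact x, Contact y)
    {
        //the same instance (or two nulls) is trivially equal
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;

        //string.Equals(a, b) is null-safe, unlike x.EmailAddress.Equals(...)
        return string.Equals(x.EmailAddress, y.EmailAddress);
    }

    public int GetHashCode(Contact obj)
    {
        //a null contact or a null email address hashes to 0 instead of throwing
        if (obj == null || obj.EmailAddress == null) return 0;
        return obj.EmailAddress.GetHashCode();
    }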

    #endregion
}

To use it, we pass an instance of the comparer to Distinct():

C#
IEqualityComparer<Contact> customComparer = new ContactEmailComparer();
IEnumerable<Contact> distinctEmails = collection.Distinct(customComparer); 

This will cause the Distinct() method to compare our objects based on our custom Equals implementation, which uses the EmailAddress property of the Contact.

A Generic Solution

The implementation of the ContactEmailComparer is pretty trivial, but it does seem like a lot of work just to get a distinct list of email addresses.

A more universal solution is to write a generic class where you can tell it which property of your objects to compare on. We will extend our IEqualityComparer to use reflection to extract the value of a specified property, rather than restricting our class to one property.

Here is an implementation of such a class:

C#
using System;
using System.Collections.Generic;
using System.Reflection;

public class PropertyComparer<T> : IEqualityComparer<T>
{
    private PropertyInfo _PropertyInfo;
    
    /// <summary>
    /// Creates a new instance of PropertyComparer.
    /// </summary>
    /// <param name="propertyName">The name of the property on type T 
    /// to perform the comparison on.</param>
    public PropertyComparer(string propertyName)
    {
        //store a reference to the property info object for use during the comparison
        _PropertyInfo = typeof(T).GetProperty(propertyName,
            BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
        if (_PropertyInfo == null)
        {
            throw new ArgumentException(string.Format(
                "{0} is not a property of type {1}.", propertyName, typeof(T)));
        }
    }
    
    #region IEqualityComparer<T> Members
    
    public bool Equals(T x, T y)
    {
        //get the current value of the comparison property of x and of y
        object xValue = _PropertyInfo.GetValue(x, null);
        object yValue = _PropertyInfo.GetValue(y, null);
        
        //if the xValue is null then we consider them equal if and only if yValue is null
        if (xValue == null)
            return yValue == null;
            
        //use the default comparer for whatever type the comparison property is.
        return xValue.Equals(yValue);
    }
    
    public int GetHashCode(T obj)
    {
        //get the value of the comparison property out of obj
        object propertyValue = _PropertyInfo.GetValue(obj, null);
        
        //a null property value hashes to 0
        if (propertyValue == null)
            return 0;
        else
            return propertyValue.GetHashCode();
    }
    
    #endregion
}  

Now, to get our distinct list of email addresses, we do this:

C#
IEqualityComparer<Contact> customComparer =
                   new PropertyComparer<Contact>("EmailAddress");
IEnumerable<Contact> distinctEmails = collection.Distinct(customComparer);

The best part about this solution is that it will work for any property and any type, so instead of writing a custom IEqualityComparer, we can just reuse our generic PropertyComparer.

For example, with no extra work, we can also get a distinct list of Contacts by name by doing this:

C#
IEqualityComparer<Contact> customComparer = new PropertyComparer<Contact>("Name");
IEnumerable<Contact> distinctNames = collection.Distinct(customComparer);

Enhancements

Currently, this implementation only works for public properties on a class. It would be easy to extend it to also inspect public fields, which would be a useful feature.
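As a rough idea of how that extension might look, here is a sketch of a variant that falls back to a public field when no property with the given name exists. MemberComparer, Product and MemberComparerDemo are hypothetical names used only for this example, and the lookup order (property first, then field) is an assumption rather than part of the original class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

//hypothetical variant of PropertyComparer<T> that also accepts public fields
public class MemberComparer<T> : IEqualityComparer<T>
{
    private readonly PropertyInfo _property;
    private readonly FieldInfo _field;

    public MemberComparer(string memberName)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public;

        //prefer a property; fall back to a field with the same name
        _property = typeof(T).GetProperty(memberName, flags);
        if (_property == null)
        {
            _field = typeof(T).GetField(memberName, flags);
        }
        if (_property == null && _field == null)
        {
            throw new ArgumentException(string.Format(
                "{0} is not a public property or field of type {1}.", memberName, typeof(T)));
        }
    }

    //reads the comparison value from whichever member was found
    private object GetMemberValue(T obj)
    {
        return _property != null ? _property.GetValue(obj, null) : _field.GetValue(obj);
    }

    public bool Equals(T x, T y)
    {
        //object.Equals(a, b) handles null values on either side
        return object.Equals(GetMemberValue(x), GetMemberValue(y));
    }

    public int GetHashCode(T obj)
    {
        object value = GetMemberValue(obj);
        return value == null ? 0 : value.GetHashCode();
    }
}

//hypothetical sample type that exposes a public field
public class Product
{
    public string Sku;
    public string Description { get; set; }
}

public static class MemberComparerDemo
{
    public static int CountDistinctBySku()
    {
        var products = new List<Product>
        {
            new Product { Sku = "A1", Description = "Widget" },
            new Product { Sku = "A1", Description = "Widget (duplicate)" },
            new Product { Sku = "B2", Description = "Gadget" }
        };
        return products.Distinct(new MemberComparer<Product>("Sku")).Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinctBySku()); //2
    }
}
```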

Conclusion

There is really nothing very special about this code. It is just a generic implementation of IEqualityComparer that takes a string specifying a property name in its constructor. But performing a Distinct filter on a property is something I always feel ought to be really easy, and it turns out to be sort of a pain. This class makes it a little easier, and I hope you find it useful.

History

  • 15th July, 2010: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer, Top Side Software
United States
Seth Dingwell is the Owner and Principal Architect of Top Side Software in Charlotte, NC.

One of Seth's passions is building cutting edge software with C# and whatever the latest and greatest technologies are.

When he's not developing software, Seth enjoys hiking, building wooden boats, and spending time with his wife Andrea and daughters Sienna and Annabelle.

Comments and Discussions

 
A distinct extension method that takes a lambda :-)
Daniel Richardson, 12-Jun-12 21:43
C#
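//extension methods must live in a static class; the class name here is arbitrary
public static class EnumerableExtensions
{
    public static IEnumerable<TSource> Distinct<TSource, TResult>(
        this IEnumerable<TSource> items, Func<TSource, TResult> selector)
    {
        //remembers every key seen so far
        var set = new HashSet<TResult>();
        foreach (var item in items)
        {
            //HashSet.Add returns false if the key is already present
            if (set.Add(selector(item)))
            {
                yield return item;
            }
        }
    }
}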


Usage:
C#
IEnumerable<Contact> collection = //retrieve list of Contacts here
IEnumerable<Contact> distinctEmails = collection.Distinct(c => c.EmailAddress);

