One of the useful methods provided by LINQ is Distinct(), which enables the elimination of duplicate elements from a collection or sequence. In this article, we will explore the Distinct method in C# and understand how it can be used to simplify data manipulation tasks.

LINQ Distinct Method in C#

The LINQ Distinct method is used to remove duplicate elements from a sequence. It returns a new sequence that contains only the unique elements from the original sequence. The Distinct method has two overloads:

  • Distinct(IEnumerable<T> source): This overload takes a single parameter, which is the sequence that you want to remove the duplicates from.
  • Distinct(IEnumerable<T> source, IEqualityComparer<T> comparer): This overload takes two parameters: the sequence that you want to remove the duplicates from, and an IEqualityComparer<T> object that specifies how to compare the elements in the sequence.

The IEqualityComparer<T> object is used to compare the elements in the sequence to determine whether they are duplicates. If the comparer reports two elements as equal, they are treated as duplicates: only one of them is kept in the result and the others are dropped.

If you do not specify an IEqualityComparer<T> object, then the Distinct method uses the default equality comparer for the element type (EqualityComparer<T>.Default). For strings, the default comparison is ordinal and therefore case-sensitive.

Example to Understand LINQ Distinct Method on Value Type

Here we have an integer collection that contains duplicate values. Our requirement is to remove the duplicates and return only the distinct values.

The following example shows how to get the distinct integer values from the data source using the LINQ Distinct extension method, in both method syntax and mixed syntax. C# query syntax has no operator called distinct, so we have to combine a query expression with a call to the Distinct() method to achieve the same result.
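The original listing for this example is not available here, so the following is a minimal sketch of what it describes. The class name DistinctIntDemo and the sample values are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctIntDemo
{
    // Illustrative data (an assumption, not the article's original values).
    private static readonly List<int> Numbers =
        new List<int> { 1, 2, 3, 2, 3, 4, 4, 5, 6, 3, 4, 5 };

    // Method syntax: call Distinct() directly on the sequence.
    public static List<int> UsingMethodSyntax() =>
        Numbers.Distinct().ToList();

    // Mixed syntax: write a query expression, then call Distinct() on its result.
    public static List<int> UsingMixedSyntax() =>
        (from n in Numbers select n).Distinct().ToList();

    public static void Main()
    {
        // Both lines print the six distinct values 1 through 6.
        Console.WriteLine(string.Join(", ", UsingMethodSyntax()));
        Console.WriteLine(string.Join(", ", UsingMixedSyntax()));
    }
}
```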

When Distinct is applied to strings, the default comparison is case-sensitive, so values such as "HINA" and "hina" are treated as different. If you want the comparison to be case-insensitive, use the other overload of the Distinct method, which takes an IEqualityComparer<T> as an argument. Passing StringComparer.OrdinalIgnoreCase to the LINQ Distinct method tells it to ignore case when checking for duplicates.
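As a rough illustration of the two behaviours, the sketch below runs Distinct on the same string array with and without StringComparer.OrdinalIgnoreCase. The class name and the sample names are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctStringDemo
{
    // Illustrative names; several differ only by letter case.
    private static readonly string[] Names =
        { "Priyanka", "HINA", "hina", "Anurag", "ANURAG", "Abc", "abc" };

    // Default comparer: ordinal, case-sensitive.
    public static List<string> CaseSensitive() =>
        Names.Distinct().ToList();

    // Second overload: ignore case when deciding what counts as a duplicate.
    public static List<string> CaseInsensitive() =>
        Names.Distinct(StringComparer.OrdinalIgnoreCase).ToList();

    public static void Main()
    {
        Console.WriteLine(CaseSensitive().Count);   // 7: every spelling is distinct
        Console.WriteLine(CaseInsensitive().Count); // 4: case variants collapse
    }
}
```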

 

If we go to the definition of the StringComparer class, we can see that it implements the IEqualityComparer<string> interface. That is why we can pass a StringComparer instance as the comparer argument of the Distinct method.

LINQ Distinct Operation with Complex Data Type

The LINQ Distinct Method in C# will work in a different manner with complex data types like Employee, Product, Student, etc. Let us understand this with an example. Create a class file with the name Student.cs and then copy and paste the following code into it.

 

Here we created the Student class with two properties, ID and Name. We also created the GetStudents() method, which returns a hard-coded collection of students, some of which are duplicates.
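The original Student.cs listing and its data are not reproduced here. The sketch below assumes a class of the shape described, and the seven records are hypothetical sample data in which three entries repeat an earlier student.

```csharp
using System.Collections.Generic;

public class Student
{
    public int ID { get; set; }
    public string Name { get; set; }

    // Hypothetical sample data: the last three entries duplicate
    // the ID and Name of an earlier student.
    public static List<Student> GetStudents()
    {
        return new List<Student>
        {
            new Student { ID = 101, Name = "Preety" },
            new Student { ID = 102, Name = "Sambit" },
            new Student { ID = 103, Name = "Hina" },
            new Student { ID = 104, Name = "Anurag" },
            new Student { ID = 102, Name = "Sambit" },
            new Student { ID = 103, Name = "Hina" },
            new Student { ID = 101, Name = "Preety" }
        };
    }
}
```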

Example to Understand LINQ Distinct Method with Complex Type

Now, our requirement is to fetch all the distinct names from the students collection. The following example shows how to use the LINQ Distinct method to achieve this using both method syntax and mixed syntax.
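A minimal sketch of this step follows. It repeats a small Student class so that it compiles on its own; the sample data and the class name DistinctNamesDemo are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class DistinctNamesDemo
{
    // Hypothetical sample data with repeated names.
    private static List<Student> GetStudents() => new List<Student>
    {
        new Student { ID = 101, Name = "Preety" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 104, Name = "Anurag" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 101, Name = "Preety" }
    };

    // Method syntax: project to the name (a string), then remove duplicates.
    public static List<string> MethodSyntax() =>
        GetStudents().Select(s => s.Name).Distinct().ToList();

    // Mixed syntax: query expression for the projection, Distinct() as a method call.
    public static List<string> MixedSyntax() =>
        (from s in GetStudents() select s.Name).Distinct().ToList();

    public static void Main()
    {
        Console.WriteLine(MethodSyntax().Count); // 4
        Console.WriteLine(MixedSyntax().Count);  // 4
    }
}
```

This works without any extra effort because the projected elements are strings, and string already defines value-based equality.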

 

In the previous example, we retrieved the distinct student names and it worked as expected. Now, our requirement is to select distinct students (both ID and Name) from the collection. Three of the students in our collection are duplicates of others, and each student should appear only once in the result set. Let us modify the Program class to fetch the distinct students using the LINQ Distinct method.

 

As you can see, Distinct() does not remove the duplicate students; it returns all of them. This is because Student is a class that does not override Equals() and GetHashCode(), so the default comparer used by the LINQ Distinct method only checks whether two object references are equal, not whether the individual property values of the objects match.
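The behaviour described above can be reproduced with a short sketch. The Student class and its data below are hypothetical stand-ins for the original listing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class ReferenceEqualityDemo
{
    // Hypothetical sample data: seven objects, four distinct ID/Name pairs.
    private static List<Student> GetStudents() => new List<Student>
    {
        new Student { ID = 101, Name = "Preety" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 104, Name = "Anurag" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 101, Name = "Preety" }
    };

    // Distinct() with the default comparer on a plain class.
    public static int DistinctCount() => GetStudents().Distinct().Count();

    public static void Main()
    {
        // Each 'new Student' is a separate object, so no two references are equal
        // and nothing is removed.
        Console.WriteLine(DistinctCount()); // 7
    }
}
```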

How to Solve the Above Problem?

We can solve the above problem in four different ways. They are as follows:

  1. We need to use the other overloaded version of the Distinct() method which takes the IEqualityComparer interface as an argument. So, here we need to create a class that implements the IEqualityComparer interface and then we need to pass that compare instance to the Distinct() method.
  2. In the second approach, we need to override the Equals() and GetHashCode() methods within the Student class itself.
  3. In the third approach, we need to project the required properties into a new anonymous type, which already overrides the Equals() and GetHashCode() methods
  4. By implementing the IEquatable<T> interface.

Approach 1: Implementing IEqualityComparer Interface

Create a class file with the name StudentComparer.cs, implement the IEqualityComparer<Student> interface in it, and provide implementations for the Equals and GetHashCode methods. Within the Equals method, we compare the property values and return true if they are the same, otherwise false. Before accessing the property values, we need to make sure that the objects themselves are not null. Within the GetHashCode method, we compute a hash code from the same property values, so that two students that compare as equal also produce the same hash code. Whenever we implement Equals, we also need to implement GetHashCode consistently with it.
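One possible shape for such a comparer is sketched below, together with a Main method that passes it to Distinct. The Student class and its data are hypothetical, and combining the hash codes with XOR is just one simple choice, not necessarily what the original listing used.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public class StudentComparer : IEqualityComparer<Student>
{
    public bool Equals(Student x, Student y)
    {
        // Same reference (or both null) counts as equal.
        if (ReferenceEquals(x, y)) return true;
        // Exactly one of them is null.
        if (x is null || y is null) return false;
        // Otherwise compare the property values.
        return x.ID == y.ID && x.Name == y.Name;
    }

    public int GetHashCode(Student obj)
    {
        if (obj is null) return 0;
        // Build the hash from the same properties that Equals compares.
        int nameHash = obj.Name == null ? 0 : obj.Name.GetHashCode();
        return obj.ID.GetHashCode() ^ nameHash;
    }
}

public static class ComparerDemo
{
    // Hypothetical sample data: seven objects, four distinct ID/Name pairs.
    private static List<Student> GetStudents() => new List<Student>
    {
        new Student { ID = 101, Name = "Preety" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 104, Name = "Anurag" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 101, Name = "Preety" }
    };

    public static List<Student> DistinctStudents() =>
        GetStudents().Distinct(new StudentComparer()).ToList();

    public static void Main()
    {
        foreach (var s in DistinctStudents())
            Console.WriteLine($"ID: {s.ID}, Name: {s.Name}");
    }
}
```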

 

Now we need to create an instance of the StudentComparer class and pass that instance to the Distinct method. So, modify the Main method of the Program class accordingly.

 

With the above changes in place, run the application and it should display each distinct student only once, as expected.

Approach 2: Overriding Equals() and GetHashCode() Methods within the Student Class

As we already know, by default any type in .NET is inherited from the Object class. That means the Student class is also inherited from the Object class. And, we also know that the Object class provides some virtual methods such as Equals() and GetHashCode(). Now, we need to override the Equals() and GetHashCode() methods of the Object class within the Student class. So, modify the Student class as shown below. Here, we are overriding the Equals() and GetHashCode() methods.

 

With the above changes in the Student class, now modify the Main method of the Program class as shown below. Now, we don’t need to do anything special with the Distinct Method. 
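A sketch of this approach, assuming the same hypothetical Student shape and sample data as before, might look as follows. Note that Distinct() is called without any arguments.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public int ID { get; set; }
    public string Name { get; set; }

    // Value-based equality: two students are equal if ID and Name match.
    public override bool Equals(object obj)
    {
        return obj is Student other && ID == other.ID && Name == other.Name;
    }

    // Must be consistent with Equals: equal students produce equal hash codes.
    public override int GetHashCode()
    {
        int nameHash = Name == null ? 0 : Name.GetHashCode();
        return ID.GetHashCode() ^ nameHash;
    }
}

public static class OverrideEqualsDemo
{
    // Hypothetical sample data: seven objects, four distinct ID/Name pairs.
    private static List<Student> GetStudents() => new List<Student>
    {
        new Student { ID = 101, Name = "Preety" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 104, Name = "Anurag" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 101, Name = "Preety" }
    };

    // The default comparer now picks up the overridden Equals/GetHashCode.
    public static List<Student> DistinctStudents() =>
        GetStudents().Distinct().ToList();

    public static void Main()
    {
        foreach (var s in DistinctStudents())
            Console.WriteLine($"ID: {s.ID}, Name: {s.Name}");
    }
}
```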

Now execute the program and it will display the distinct records as expected.

Approach 3: Using Anonymous Type

In this approach, we project the properties of the Student class into a new anonymous type, and Distinct then works as expected. The reason is that an anonymous type already overrides the Equals() and GetHashCode() methods of the Object class, based on the values of its properties. So, modify the Main method of the Program class to project the output to an anonymous type, using either the select clause of a query expression or the Select method.
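The projection could be sketched as shown here. The Student class is left without any equality overrides, and the data and the class name AnonymousTypeDemo are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class AnonymousTypeDemo
{
    // Hypothetical sample data: seven objects, four distinct ID/Name pairs.
    private static List<Student> GetStudents() => new List<Student>
    {
        new Student { ID = 101, Name = "Preety" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 104, Name = "Anurag" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 101, Name = "Preety" }
    };

    // Method syntax: Select projects each student to an anonymous type.
    public static int MethodSyntaxCount() =>
        GetStudents().Select(s => new { s.ID, s.Name }).Distinct().Count();

    // Mixed syntax: the select clause performs the same projection.
    public static int MixedSyntaxCount() =>
        (from s in GetStudents() select new { s.ID, s.Name }).Distinct().Count();

    public static void Main()
    {
        var distinct = GetStudents().Select(s => new { s.ID, s.Name }).Distinct();
        foreach (var s in distinct)
            Console.WriteLine($"ID: {s.ID}, Name: {s.Name}");

        Console.WriteLine(MethodSyntaxCount()); // 4
        Console.WriteLine(MixedSyntaxCount());  // 4
    }
}
```

One trade-off worth noting: the result is a sequence of anonymous objects rather than Student instances, so this approach suits cases where only the projected values are needed.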

 

In the above example, we project the ID and Name properties into an IEnumerable<'a>, that is, a sequence of an anonymous type, which already overrides the Equals and GetHashCode methods. Now run the application and you will see the distinct students in the output, as expected.

Approach 4: Implementing IEquatable<T> Interface in Student Class

In this approach, we implement the IEquatable<T> interface in the Student class. This means providing the Equals method of the IEquatable<T> interface and also overriding the GetHashCode method of the Object class. So, modify the Student class as shown below.

 

As you can see, here we have done two things. First, we implement the Equals method of the IEquatable interface and then override the GetHashCode method. With the above changes in place, now modify the Main Method of the Program class as shown below.
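The two steps described above might be sketched like this, again with a hypothetical Student shape and sample data. The default comparer used by Distinct() calls IEquatable<T>.Equals when the element type implements it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student : IEquatable<Student>
{
    public int ID { get; set; }
    public string Name { get; set; }

    // Step 1: strongly typed Equals from IEquatable<Student>.
    public bool Equals(Student other)
    {
        if (other is null) return false;
        if (ReferenceEquals(this, other)) return true;
        return ID == other.ID && Name == other.Name;
    }

    // Step 2: hash code built from the same properties that Equals compares.
    public override int GetHashCode()
    {
        int nameHash = Name == null ? 0 : Name.GetHashCode();
        return ID.GetHashCode() ^ nameHash;
    }
}

public static class EquatableDemo
{
    // Hypothetical sample data: seven objects, four distinct ID/Name pairs.
    private static List<Student> GetStudents() => new List<Student>
    {
        new Student { ID = 101, Name = "Preety" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 104, Name = "Anurag" },
        new Student { ID = 102, Name = "Sambit" },
        new Student { ID = 103, Name = "Hina" },
        new Student { ID = 101, Name = "Preety" }
    };

    // No comparer argument is needed; the default comparer uses IEquatable<Student>.
    public static List<Student> DistinctStudents() =>
        GetStudents().Distinct().ToList();

    public static void Main()
    {
        foreach (var s in DistinctStudents())
            Console.WriteLine($"ID: {s.ID}, Name: {s.Name}");
    }
}
```

The sketch follows the two steps in the text. In production code it is commonly recommended to override Equals(object) as well, so that both equality paths give the same answer.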

Difference Between IEqualityComparer<T> and IEquatable<T> in C#:

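IEqualityComparer<T> is implemented by a separate class that compares two objects of type T passed to it, whereas IEquatable<T> is implemented by the type T itself so that an instance can compare itself to another instance of T. In other words, IEqualityComparer<T> defines equality from the outside, which is useful when you cannot modify the type or need more than one notion of equality, while IEquatable<T> defines the type's own, built-in notion of equality.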
