Story of Equality in .Net - Part 2

Sunday, June 26, 2016

Introduction

In this post, we will see how smartly .Net handles equality and comparison out of the box. This means that you will be able to understand how .Net handles some of the issues that we discussed in the previous post.

If you remember from the previous post, the Object class provides four methods for equality checking. Each is designed for a different scenario, but they all serve the same purpose: checking whether two objects are equal. In this post we will focus on the following three of them:
  1. virtual Object.Equals
  2. static Object.Equals
  3. static Object.ReferenceEquals

We will start by looking in detail at the virtual Object.Equals method. This provides the most important mechanism for equality checking in .Net, since it is the means by which any type can tell what equality means for itself. We will see how out of the box this method gives you reference equality for most reference types but value equality for all value types.
We will also compare it with the static Object.Equals() method of the same name, which is more robust when the instances being compared may be null.

We will also see how the static Object.ReferenceEquals method guarantees that the equality check is done on the references of the instances, not on their values.

After reading this post, I hope that you will have a good understanding of what equality means in .Net.

Virtual Object.Equals() Method

As we discussed in the previous post, there are a number of ways to compare for equality in .Net, but the most fundamental one is the virtual Object.Equals() method defined in the System.Object type.

To see this method in action, we will create a class that represents a person. The class contains a string field holding the person’s name and a constructor that forces the caller to supply that name.

    public class Person
    {
 
        private string _name;
 
        public string Name 
        { 
            get 
            { 
                return _name; 
            } 
        }
 
        public Person(string name)
        {
            _name = name;
        }
 
        public override string ToString()
        {
            return _name;
        }
 
    }

Now we will write the Main method to use this type.

    static void Main(String[] args)
    {
        Person p1 = new Person("Ehsan Sajjad");
        Person p2 = new Person("Ahsan Sajjad");
            
        Console.WriteLine(p1.Equals(p2));
    }  


As you can see, in Main we create two instances of the Person class, passing a different value to the constructor each time, and on the next line we check them for equality using the Object.Equals method. Equals returns a Boolean: true if the two items are equal and false if they are not. We write the result of the comparison to the console to see what it outputs.
You would expect the two instances not to be equal, since “Ehsan Sajjad” and “Ahsan Sajjad” are different sequences of characters, and indeed if we run the code we see False on the console. So Equals() appears to be working correctly here, and notice that we didn’t have to do anything in our Person class definition to achieve this. The Equals method is provided by System.Object, so it is automatically available on all types, because all types ultimately derive from System.Object.
By the way, I suspect some of you are looking at this code and thinking that p1.Equals(p2) is not what we normally write to check for equality; normally we would just write p1 == p2. The point here is that we need to understand how equality works in .Net, and the starting point for that is the Equals method.

== Operator and Equals Method

If you write == in your code, you are using the C# equality operator. It is a very nice syntactic convenience that the C# language provides to make code more readable, but it is not really part of the .Net Framework. .Net has no concept of operators; it works with methods. To understand how equality works in .Net, we need to start with the things that the .Net Framework itself understands. So most of the code in this post uses only .Net methods, so that you can see how things work. That means some of the code may look unnatural, but we will discuss the == operator (the C# equality operator) in another post.

Now let’s get back to the fun part, i.e. code. We will declare another instance of Person, p3, in the Main method.

This new instance p3 is passed the same value in the constructor as p1, which is “Ehsan Sajjad”. So what do you think will happen if we compare p1 with p3 using the Equals method? Let’s try it and see:

    static void Main(String[] args)
    {
        Person p1 = new Person("Ehsan Sajjad");
        Person p2 = new Person("Ahsan Sajjad");
        Person p3 = new Person("Ehsan Sajjad");
            
        Console.WriteLine(p1.Equals(p3));
    } 


This also returns false: p1 and p3 are not considered equal. The reason is that the base Object.Equals method evaluates reference equality; its implementation tests whether two variables refer to the same instance. It is obvious to us that p1 and p3 have exactly the same value, since both instances contain the same data, but Equals does not care about that. It cares only about whether they are the same instance or different instances, so it returns false.
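
To make the difference between “same data” and “same instance” concrete, here is a minimal sketch. It reuses a trimmed-down version of the Person class from above; the variable name alias is only an illustration and is not part of the original example.

```csharp
using System;

public class Person
{
    private readonly string _name;

    public Person(string name)
    {
        _name = name;
    }

    public string Name
    {
        get { return _name; }
    }
}

public static class Program
{
    public static void Main()
    {
        Person p1 = new Person("Ehsan Sajjad");
        Person p3 = new Person("Ehsan Sajjad");
        Person alias = p1; // a second variable pointing at the same instance as p1

        // Different instances holding the same data: not equal by default.
        Console.WriteLine(p1.Equals(p3));    // False

        // Two variables referring to one instance: equal.
        Console.WriteLine(p1.Equals(alias)); // True
    }
}
```

The only way to get true from the default implementation is for both variables to refer to the very same object.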

As we discussed earlier in this post and in the previous post, Equals is a virtual method on the Object type, which means that we can override it. If we want Equals to compare the values of Person instances, we can override it and define our own implementation of how two Person instances are compared for equality. There is nothing unusual in this; Equals is a normal virtual method. We are not going to override it yet, though. If you want to stick to good coding practices, you need to do a few other things when you override Object.Equals, and we will see how to do that later. For this post we will stick to what Microsoft has given us out of the box.

Equals Method Implementation for String

There are a couple of reference types for which Microsoft has overridden the Object.Equals method in order to compare values instead of references. Probably the best known of these, and certainly the most important one to be aware of, is String. A small program demonstrates this:

    static void Main(String[] args)
    {
        string s1 = "Ehsan Sajjad";
        string s2 = string.Copy(s1);
            
        Console.WriteLine(s1.Equals((object)s2));
    }


In this program we initialize a string and store its reference in s1. Then we create another variable, s2, which contains the same value, but we initialize s2 by creating a copy of the value s1 has. The name of the string.Copy method is pretty descriptive: it creates and returns a copy of the string. Finally we use the Equals method to compare the two. (In newer versions of .Net, string.Copy is marked obsolete, but it still serves to illustrate the point.)
You can see that we explicitly cast the argument of Equals to object, which you obviously would not do in production code. We do it here to make sure that the override of Object.Equals() is called. String defines multiple Equals methods, and one of them is strongly typed to string, i.e. it takes a string as its parameter. If we didn’t cast to object, the compiler would consider the strongly typed method a better match when resolving overloads and would call that one. That is obviously better in normal programming, and both methods actually do the same thing, but we explicitly wanted to show how the Object.Equals override behaves, so we cast the parameter to object to steer the compiler away from the strongly typed overload.

If we run the above code, we will see that it returns true. The override provided by Microsoft for the string type compares the contents of the two string instances to check whether they contain exactly the same characters in the same sequence. It returns true if they do and false otherwise, even if they are not the same instance.
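
As a small sketch of the overload discussion above, the following program calls both the strongly typed overload and the Object.Equals override on the same pair of strings. It builds the second string with new string(char[]) instead of string.Copy; that substitution is my own choice to get a distinct instance without relying on string.Copy, and is not part of the original example.

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        string s1 = "Ehsan Sajjad";
        // A separate string instance containing the same characters.
        string s2 = new string(s1.ToCharArray());

        // Overload resolution picks string.Equals(string) here.
        Console.WriteLine(s1.Equals(s2));         // True

        // The cast forces the override of Object.Equals(object).
        Console.WriteLine(s1.Equals((object)s2)); // True
    }
}
```

Both calls compare the characters, so both print True even though s1 and s2 are different instances.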

There are not many Microsoft-defined reference types for which Equals has been overridden to compare values. Apart from String, two other types that you should be aware of are Delegate and Tuple; calling Equals on these also compares values. These are the exceptions: for most other reference types, Equals does a reference equality check unless the type’s author has overridden it.
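
Here is a minimal sketch of the Tuple and Delegate behavior just described. The SayHello method and the variable names are hypothetical, introduced only for this illustration.

```csharp
using System;

public static class Program
{
    // Hypothetical method used only as a delegate target.
    private static void SayHello()
    {
        Console.WriteLine("Hello");
    }

    public static void Main()
    {
        // Two separate Tuple instances with the same component values.
        Tuple<int, string> t1 = Tuple.Create(1, "one");
        Tuple<int, string> t2 = Tuple.Create(1, "one");
        Console.WriteLine(t1.Equals(t2));           // True: components are compared
        Console.WriteLine(ReferenceEquals(t1, t2)); // False: distinct instances

        // Two delegates that wrap the same method.
        Action a1 = new Action(SayHello);
        Action a2 = new Action(SayHello);
        Console.WriteLine(a1.Equals(a2));           // True: same target and method
    }
}
```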

Equals Method and Value Types

Now we will see how the Equals method works for value types. We will use the same example that we used at the start of the post (the Person class), but we will change it from a class to a struct to see the behavior for a value type:

    public struct Person
    {
 
        private string _name;
 
        public string Name 
        { 
            get 
            { 
                return _name; 
            } 
        }
 
        public Person(string name)
        {
            _name = name;
        }
 
        public override string ToString()
        {
            return _name;
        }
 
    }

What do you think will happen now if we run the same program again? As we know, a struct variable holds its data directly; there is no reference involved unless we box the value. That’s why structs are called value types, not reference types.

    static void Main(String[] args)
    {
        Person p1 = new Person("Ehsan Sajjad");
        Person p2 = new Person("Ahsan Sajjad");
        Person p3 = new Person("Ehsan Sajjad");
            
        Console.WriteLine(p1.Equals(p2));
        Console.WriteLine(p1.Equals(p3));
    }

As we know, the implementation of Object.Equals does a reference comparison for reference types, but in this case you might think that comparing references does not make sense, as a struct is a value type.

So let’s run the program and see what it prints on the console.


You can see that this time the result is different: the first comparison prints False, but the second prints True, saying that p1 and p3 are equal. That is exactly the result you would expect if you compared the values of p1 and p3, and that is what is actually happening. But if we look at the Person type definition, we have not added any code to override the Equals method of the Object class, which means there is nothing in this type to tell the framework how to compare the values of Person instances.


.Net already knows how to do that. Without any effort from us, the .Net Framework has figured out how to tell whether p1 and p3 have equal values. How is that happening? As you may already know, all struct types inherit from System.ValueType, which in turn derives from System.Object.
System.ValueType itself overrides the System.Object Equals method. The override traverses all the fields in the value type and calls Equals on each one until it either finds a field whose value is different or has traversed all the fields. If all the fields turn out to be equal, it concludes that the two value type instances are equal. In other words, value types override the Equals implementation to say that two instances are equal if every field in them has the same value, which is quite sensible. In the above example our Person type has only one field, the backing field for the Name property, which is of type string, and we already know that calling Equals on a string compares values. The results of our program bear this out. That’s how .Net provides sensible Equals behavior for value types out of the box.
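
The field-by-field idea can be sketched with reflection as follows. This is only an illustration of the concept, not the actual System.ValueType source; the FieldwiseSketch class and its FieldwiseEquals method are hypothetical names invented for this example.

```csharp
using System;
using System.Reflection;

public struct Person
{
    private string _name;

    public Person(string name)
    {
        _name = name;
    }
}

public static class FieldwiseSketch
{
    // Hypothetical helper: compares two boxed values field by field.
    public static bool FieldwiseEquals(object left, object right)
    {
        if (left == null || right == null) return false;
        if (left.GetType() != right.GetType()) return false;

        // Reflection is needed because the field list is unknown at compile time.
        FieldInfo[] fields = left.GetType().GetFields(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);

        foreach (FieldInfo field in fields)
        {
            // Stop at the first field whose values differ.
            if (!object.Equals(field.GetValue(left), field.GetValue(right)))
            {
                return false;
            }
        }
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        Person p1 = new Person("Ehsan Sajjad");
        Person p2 = new Person("Ahsan Sajjad");
        Person p3 = new Person("Ehsan Sajjad");

        Console.WriteLine(FieldwiseSketch.FieldwiseEquals(p1, p2)); // False
        Console.WriteLine(FieldwiseSketch.FieldwiseEquals(p1, p3)); // True
        Console.WriteLine(p1.Equals(p3));                           // True (built-in behavior)
    }
}
```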

Performance Overhead for Value Types

Unfortunately, this convenience provided by the .Net Framework comes at a price. The System.ValueType implementation of Equals works by using reflection. Of course it has to, if we think about it: System.ValueType is a base type and knows nothing about the types you will derive from it, so the only way to find out what fields our own type (in this case Person) has is to use reflection, which means that performance suffers.

Recommended Approach for Value Types

The recommended approach is to override the Equals method for your own value types. We will see later how to do that in a way that runs faster. In fact, you will see that Microsoft has done this for many of the built-in value types that come with the framework and that we use every day in our code.
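
As a rough preview only, one common shape for such an override is sketched below. Implementing IEquatable<Person> and overriding GetHashCode are assumptions on my part about what a complete version involves; a fuller treatment belongs to a later post, and this is not meant as the definitive implementation.

```csharp
using System;

public struct Person : IEquatable<Person>
{
    private readonly string _name;

    public Person(string name)
    {
        _name = name;
    }

    public string Name
    {
        get { return _name; }
    }

    // Strongly typed comparison: no boxing and no reflection.
    public bool Equals(Person other)
    {
        return string.Equals(_name, other._name);
    }

    // The Object.Equals override delegates to the typed version.
    public override bool Equals(object obj)
    {
        return obj is Person && Equals((Person)obj);
    }

    // Equal values must produce equal hash codes.
    public override int GetHashCode()
    {
        return _name == null ? 0 : _name.GetHashCode();
    }
}

public static class Program
{
    public static void Main()
    {
        Person p1 = new Person("Ehsan Sajjad");
        Person p2 = new Person("Ahsan Sajjad");
        Person p3 = new Person("Ehsan Sajjad");

        Console.WriteLine(p1.Equals(p2));         // False
        Console.WriteLine(p1.Equals(p3));         // True
        Console.WriteLine(p1.Equals((object)p3)); // True
    }
}
```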

Static Equals Method

There is one problem with checking for equality using the virtual Equals method: what happens if one or both of the references you want to compare is null? Let’s see what happens when we call the Equals method with null as the argument. Let’s modify the existing example for that:

    static void Main(String[] args)
    {
        Person p1 = new Person("Ehsan Sajjad");
                        
        Console.WriteLine(p1.Equals(null));
    }

If we compile and run this, it returns false, as it should. That makes perfect sense, because null is obviously not equal to a non-null instance, and it is a principle of equality in .Net that null should never evaluate as equal to a non-null value.

Now let’s turn it around and see what happens if the p1 variable is null. Here we have a problem. Suppose that instead of this hardcoded instance creation, the variable is passed in as a parameter from some client code that uses this assembly, and we don’t know whether either of the values is null.
If p1 is passed as null, executing the Equals method call will throw a NullReferenceException, because you cannot call instance methods on a null reference.
The static Equals method was designed to address this problem, so we can use it when we don’t know whether one of the objects could be null. We can use it this way:

    Console.WriteLine(object.Equals(p1,null));

Now we are good to go without worrying about whether either reference is null. You can test it by running both scenarios and you will see that it works fine; of course, it returns false if only one of the references is null.

Some of you may be wondering what would happen if both arguments passed to the static Equals method were null. You can add the following line to test that:

    Console.WriteLine(object.Equals(null,null));

You will see that it returns true in this case. In .Net null is always equal to null, so testing whether null is equal to null should always evaluate to true.
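
The null-handling cases discussed in this section can be put side by side in one small program. This is a minimal sketch that assumes Person is the class version (a reference type) from the start of the post, trimmed down to the essentials.

```csharp
using System;

public class Person
{
    private readonly string _name;

    public Person(string name)
    {
        _name = name;
    }
}

public static class Program
{
    public static void Main()
    {
        Person p1 = null;
        Person p2 = new Person("Ehsan Sajjad");

        try
        {
            // Instance call on a null reference: throws before Equals runs.
            Console.WriteLine(p1.Equals(p2));
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException");
        }

        // The static method is safe whichever side is null.
        Console.WriteLine(object.Equals(p1, p2));   // False
        Console.WriteLine(object.Equals(p2, p1));   // False
        Console.WriteLine(object.Equals(p1, null)); // True: both are null
    }
}
```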

If we dig into the implementation of the static Equals method, we find that it is very simple. The following is its logic, as found in the source code of the Object type:

    public static bool Equals(object x, object y)
    {
        if (x == y) // Reference equality only; overloaded operators are ignored
        {
            return true;
        }
        if (x == null || y == null) // Again, reference checks
        {
            return false;
        }
        return x.Equals(y); // Safe as we know x != null.
    }


It first checks whether both parameters refer to the same instance, which is what the == operator does for variables of type object. This check evaluates to true, causing the method to return true, when both parameters are null. The next if block returns false if one of the parameters is null and the other is not. Finally, if control reaches the last line, we know that both parameters point to some instance, so we just call the virtual Equals method.
This means that the static Equals method always gives the same result as the virtual method, except that it checks for null first. Because the static method calls the virtual method, if we override the virtual Equals method, our override is automatically picked up by the static method. That’s important, as we want the static and virtual methods to behave consistently.
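
To see the static method pick up an override, here is a small sketch using a hypothetical Point class that compares by value. The class, its fields, and its GetHashCode formula are all illustrative assumptions, not something from the framework or from earlier in this post.

```csharp
using System;

// Hypothetical reference type that overrides Equals to compare values.
public class Point
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public override bool Equals(object obj)
    {
        Point other = obj as Point;
        return other != null && X == other.X && Y == other.Y;
    }

    // Overridden alongside Equals so equal points share a hash code.
    public override int GetHashCode()
    {
        return X * 31 + Y;
    }
}

public static class Program
{
    public static void Main()
    {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);

        Console.WriteLine(a.Equals(b));            // True: our override
        Console.WriteLine(object.Equals(a, b));    // True: static method calls our override
        Console.WriteLine(object.Equals(a, null)); // False: handled by the null check
    }
}
```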

ReferenceEquals Method

ReferenceEquals serves a slightly different purpose from the two Equals methods discussed above. It exists for those situations where we specifically want to determine whether two variables refer to the same instance. You may be wondering why a separate method is needed when the Equals methods also check reference equality.
Those two methods do check reference equality by default, but they are not guaranteed to do so, because the virtual Equals method can be overridden to compare the values of the instances rather than the references.
So ReferenceEquals gives the same result as Equals for types that have not overridden the Equals method, for example the Person class we used above. But the results can differ for types that have overridden Equals, for example the String class.

Let’s modify the string class example that we used earlier in this post to demonstrate this:

    static void Main(String[] args)
    {
        string s1 = "Ehsan Sajjad";
        string s2 = string.Copy(s1);
            
        Console.WriteLine(s1.Equals((object)s2));
        Console.WriteLine(ReferenceEquals(s1,s2));
    }

If we run this example, we will see that the first Equals call returns true just as before, but the ReferenceEquals call returns false. Why is that?

It is telling us that the two string variables refer to different instances even though they contain the same data. Recall from earlier in this post that the String type overrides the Equals method to compare the values of two instances, not their references.

As you know, static methods cannot be overridden in C#, which means you can never change the behavior of ReferenceEquals. That makes sense, because it always needs to do a reference comparison.

Summary

  • We learned how .Net provides equality implementations for types out of the box.
  • We saw that the .Net Framework defines a few methods on the Object class, which are available on all types.
  • By default, the virtual Object.Equals method does reference equality for reference types and value equality for value types; the latter uses reflection, which is a performance overhead for value types.
  • Any type can override the Object.Equals method to change how it checks for equality; e.g. String, Delegate and Tuple do this to provide value equality, even though they are reference types.
  • Object also provides a static Equals method, which can be used when there is a chance that one or both of the parameters are null; other than that, it behaves identically to the virtual Object.Equals method.
  • There is also a static ReferenceEquals method, which provides a guaranteed way to check for reference equality.
