Story of Equality in .Net - Part 2
Introduction
In this post, we will see how smartly .Net handles equality and
comparison out of the box. This means that you will be able to understand how
.Net handles some of the issues that we discussed in the previous post.
If you remember from the previous post, the Object class in the .Net framework provides four methods for equality checking. Each one is designed for a different scenario, but they all share the same purpose: checking whether two objects are equal. In this post we will focus on the following three of the four:
We will start by looking in detail at the virtual Object.Equals method. This provides the most important mechanism for equality checking in .Net, since it is the means by which any type can tell what equality means for itself. We will see how out of the box this method gives you reference equality for most reference types but value equality for all value types.
We will also compare it with the static Object.Equals() method of the same name, which is more robust when the instances being checked for equality can be null.
We will also see how the static Object.ReferenceEquals method guarantees that the equality check is done on the references of the instances, not on their values.
After reading this post, I hope that you will have a good
understanding of what equality means in .Net.
Virtual Object.Equals() Method
As we discussed in the previous post, there are a number of ways to check equality in .Net, but the most fundamental one is the virtual Object.Equals() method defined in the System.Object type.
To see this method in action, we will create a class which represents a person. The only things our class contains are a string field holding the name of the person, a read-only property that exposes it, and a constructor that forces the caller to set the value of the name field.
    public class Person
    {
        private string _name;

        public string Name { get { return _name; } }

        public Person(string name) { _name = name; }

        public override string ToString() { return _name; }
    }

Now we will write the Main method to use this type.

    static void Main(String[] args)
    {
        Person p1 = new Person("Ehsan Sajjad");
        Person p2 = new Person("Ahsan Sajjad");

        Console.WriteLine(p1.Equals(p2));
    }
As you can see, in Main we create two instances of the Person class, passing a different value to the constructor each time, and on the next line we check them for equality using the Object.Equals method. The Equals method returns a Boolean: true if both items are equal and false if they are not. We write the result of the comparison to the console to see what it outputs.
You would expect from the code that the two instances are not equal, because “Ehsan Sajjad” and “Ahsan Sajjad” contain different sequences of characters, and of course if we run the code we will see false as the output on the console. So Equals() appears to be working correctly here, and notice that we didn’t have to do anything in our Person class definition to achieve this. The Equals method is provided by System.Object, so it is automatically available on all types, because all types ultimately derive from System.Object.
By the way, I suspect some of you are looking at this code and thinking that p1.Equals(p2) is not what we normally write to check equality; we would just write p1 == p2. The point here is that we need to understand how equality works in .Net, and the starting point for that is the Equals method.
== Operator and Equals Method
If you write == in your code, you are using the C# equality operator. It is a very nice syntactic convenience that the C# language provides to make code more readable, but it is not really part of the .Net Framework. .Net has no concept of operators; it works with methods. If we want to understand how equality works in .Net, we need to start with the things that the .Net Framework understands. So for most of the code in this post we will only be using .Net methods, so that you can see how things work. That means some of the code may look unnatural, but we will discuss == (the C# equality operator) in another post.
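To make the "operators are really methods" point concrete, here is a small sketch that uses reflection to find the static method the C# compiler calls when you write == between two strings. The class name OperatorAsMethodDemo and its helper are made up for this illustration; op_Equality is the conventional name under which an overloaded == operator is compiled.

```csharp
using System;
using System.Reflection;

public static class OperatorAsMethodDemo
{
    // Hypothetical helper: looks up the method that backs string's == operator.
    public static MethodInfo FindStringEqualityOperator()
    {
        return typeof(string).GetMethod(
            "op_Equality", new[] { typeof(string), typeof(string) });
    }

    public static void Main()
    {
        MethodInfo op = FindStringEqualityOperator();
        Console.WriteLine(op.Name);     // op_Equality
        Console.WriteLine(op.IsStatic); // True

        // Invoking the method directly does the same work as "abc" == "abc".
        object result = op.Invoke(null, new object[] { "abc", "abc" });
        Console.WriteLine(result);      // True
    }
}
```

In other words, for a type that overloads ==, the operator is just a compiler-level shorthand for an ordinary static method call.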
Now let’s get back to the fun part, i.e. the code. We will declare another instance of Person in the Main method. This new instance, p3, passes the same value to the constructor as p1, which is “Ehsan Sajjad”. What do you think will happen if we compare p1 with p3 using the Equals method? Let’s try it and see:
    static void Main(String[] args)
    {
        Person p1 = new Person("Ehsan Sajjad");
        Person p2 = new Person("Ahsan Sajjad");
        Person p3 = new Person("Ehsan Sajjad");

        Console.WriteLine(p1.Equals(p3));
    }
This also returns false: the two instances p1 and p3 are not equal. The reason is that the base Object.Equals method evaluates reference equality; its implementation tests whether two variables refer to the same instance. It is obvious to us that p1 and p3 have exactly the same value, since both instances contain the same data, but the Equals method does not care about that. It only cares whether they are the same instance or different instances, so it returns false, telling us that they are not equal.
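The following minimal sketch contrasts the two cases side by side. PersonRef is a hypothetical stand-in for the Person class above (renamed so the block compiles on its own), and the alias variable is an addition for illustration: it shows that Equals returns true only when both variables point at one and the same instance.

```csharp
using System;

// Hypothetical stand-in for the article's Person class; Equals is NOT overridden.
public class PersonRef
{
    private readonly string _name;

    public PersonRef(string name) { _name = name; }

    public string Name { get { return _name; } }
}

public static class ReferenceEqualityDemo
{
    public static void Main()
    {
        PersonRef p1 = new PersonRef("Ehsan Sajjad");
        PersonRef p3 = new PersonRef("Ehsan Sajjad"); // same data, different instance
        PersonRef alias = p1;                         // second variable, same instance

        Console.WriteLine(p1.Equals(p3));    // False: two distinct instances
        Console.WriteLine(p1.Equals(alias)); // True: both refer to one instance
    }
}
```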
As we discussed earlier in this post and in the previous post, Equals is a virtual method on the Object type, which means that we can override it. If we want the Equals method to compare the values of Person instances, we can override it and define our own implementation of how two Person instances are compared for equality. There is nothing unusual in this; Equals is a normal virtual method. We are not going to override it yet, though. If you want to stick to good coding practices, you need to do a few other things when you override Object.Equals, and we will see later how to do that. For this post we will stick to what Microsoft has given us out of the box.
Equals Method Implementation for String
There are a couple of reference types for which Microsoft has overridden the Object.Equals method in order to compare values instead of references. Probably the best known of these, and certainly the one that is most important to be aware of, is String. We will demonstrate that with a small program:

    static void Main(String[] args)
    {
        string s1 = "Ehsan Sajjad";
        string s2 = string.Copy(s1);

        Console.WriteLine(s1.Equals((object)s2));
    }
In this program we initialize a string and store its reference in s1. Then we create another variable, s2, which contains the same value, but we initialize s2 by creating a copy of the value s1 has. The name of the string.Copy method is pretty descriptive: it creates and returns a copy of the string. Then we use the Equals method to compare them.
You can see that we explicitly cast the argument of the Equals method to object. Obviously you would not want to do that in production code. We have done it here to make sure that the override of Object.Equals() is called, because string defines multiple Equals methods and one of them is strongly typed, i.e. it takes a string as its parameter. If we didn’t cast to object, the compiler would consider the strongly typed method a better match when resolving overloads and would call that one. That is obviously better in normal programming, and both methods actually do the same thing, but here we explicitly want to show how the Object.Equals override behaves, so we need the cast to tell the compiler to skip the strongly typed overload and use the override that takes object.
If we run the above code, we will see that it returns true. The override provided by Microsoft for the string type compares the contents of the two string instances: it returns true if they contain exactly the same characters in the same sequence, even if they are not the same instance, and false otherwise.
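As a small sketch of the overload resolution described above, the block below calls Equals both with and without the cast. It builds the second string with new string(char[]) rather than string.Copy, which newer .Net versions mark as obsolete; that substitution, and the StringEqualsDemo class name, are assumptions of this example and not part of the original program.

```csharp
using System;

public static class StringEqualsDemo
{
    public static void Main()
    {
        string s1 = "Ehsan Sajjad";
        // A distinct string instance holding the same characters.
        string s2 = new string(s1.ToCharArray());

        // Without the cast the compiler picks the strongly typed Equals(string) overload.
        Console.WriteLine(s1.Equals(s2));                  // True

        // With the cast it has to use the Equals(object) override instead.
        Console.WriteLine(s1.Equals((object)s2));          // True

        // The two variables really are different instances.
        Console.WriteLine(object.ReferenceEquals(s1, s2)); // False
    }
}
```

Both calls agree, which is what we want: the overload exists for convenience and speed, not to change the meaning of equality.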
There are not many Microsoft-defined reference types for which Microsoft has overridden the Equals method to compare values. Apart from String, two other types that you must be aware of are Delegate and Tuple; calling Equals on these will also compare values. These are the exceptions. Most other reference types keep the default behavior, so Equals does a reference equality check.
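Here is a short, hedged sketch of that behavior for Tuple and Delegate. The class and method names are invented for this illustration; the only framework members used are Tuple.Create, the Action delegate type, Equals and ReferenceEquals.

```csharp
using System;

public static class TupleDelegateDemo
{
    private static void Greet() { Console.WriteLine("hello"); }

    // Two separately created tuples holding the same items.
    public static bool TuplesWithSameItemsAreEqual()
    {
        Tuple<int, string> t1 = Tuple.Create(1, "Ehsan");
        Tuple<int, string> t2 = Tuple.Create(1, "Ehsan");
        return !object.ReferenceEquals(t1, t2) && t1.Equals(t2);
    }

    // Two delegates of the same type that point at the same method.
    public static bool DelegatesToSameMethodAreEqual()
    {
        Action a1 = new Action(Greet);
        Action a2 = new Action(Greet);
        return a1.Equals(a2);
    }

    public static void Main()
    {
        Console.WriteLine(TuplesWithSameItemsAreEqual());   // True
        Console.WriteLine(DelegatesToSameMethodAreEqual()); // True
    }
}
```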
Equals Method and Value Types
Now we will see how the Equals method works for value types. We will use the same example that we used at the start of the post (the Person class), but we will change it from a class to a struct to see the behavior for a value type:

    public struct Person
    {
        private string _name;

        public string Name { get { return _name; } }

        public Person(string name) { _name = name; }

        public override string ToString() { return _name; }
    }
What do you think will happen now if we run the same program again? As we know, a struct is a value type: a variable of a struct type holds the data itself rather than a reference to it, unless we box it. That is why structs are called value types and not reference types.
    static void Main(String[] args)
    {
        Person p1 = new Person("Ehsan Sajjad");
        Person p2 = new Person("Ahsan Sajjad");
        Person p3 = new Person("Ehsan Sajjad");

        Console.WriteLine(p1.Equals(p2));
        Console.WriteLine(p1.Equals(p3));
    }
We know that the base Object.Equals implementation compares references for reference types, but in this case you might think that comparing references does not make sense, because a struct is a value type.
So let’s run the program and see what it prints on the console.
This time the result is different: the first comparison still prints False, but the second prints True, telling us that p1 and p3 are equal. That is exactly the result you would expect if you compared the values of p1 and p3, and that is what actually happens here. But if we look at the Person type definition, we have not added any code to override the Equals method of the Object class, which means nothing in this type tells the framework how to compare the values of Person instances.
.Net already knows how to do that. The framework has figured out, without any effort from us, how to tell whether p1 and p3 have equal values. How is that happening? As you may already know, all struct types inherit from System.ValueType, which in turn derives from System.Object. System.ValueType itself overrides the System.Object Equals method. The override traverses all the fields in the value type and calls Equals on each one until it either finds a field whose value is different or has traversed all the fields. If all the fields turn out to be equal, it concludes that the two value type instances are equal. In other words, value types override the Equals implementation to say that two instances are equal if every field in them has the same value, which is quite sensible. In the above example our Person type has only one field, the backing field for the Name property, which is of type string, and we already know that calling Equals on a string compares values. The results of our program confirm this. That’s how .Net provides the behavior of the Equals method for value types very nicely.
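To illustrate the idea of a field-by-field comparison, here is a rough sketch. It is not the framework's actual code: FieldwiseEquals and PersonValue are hypothetical names invented for this example, and the real System.ValueType implementation differs in its details.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the article's Person struct.
public struct PersonValue
{
    private string _name;

    public PersonValue(string name) { _name = name; }

    public string Name { get { return _name; } }
}

public static class FieldwiseDemo
{
    // Illustration only: compares two boxed values field by field via reflection.
    public static bool FieldwiseEquals(object a, object b)
    {
        if (a == null || b == null) return a == b;
        if (a.GetType() != b.GetType()) return false;

        FieldInfo[] fields = a.GetType().GetFields(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);

        foreach (FieldInfo field in fields)
        {
            // Stop at the first field whose values differ.
            if (!object.Equals(field.GetValue(a), field.GetValue(b))) return false;
        }
        return true; // every field matched
    }

    public static void Main()
    {
        PersonValue p1 = new PersonValue("Ehsan Sajjad");
        PersonValue p2 = new PersonValue("Ahsan Sajjad");
        PersonValue p3 = new PersonValue("Ehsan Sajjad");

        Console.WriteLine(FieldwiseEquals(p1, p2)); // False
        Console.WriteLine(FieldwiseEquals(p1, p3)); // True
    }
}
```

Notice how much machinery is involved for a single string field: the values are boxed, the field list is discovered at run time, and each field value is read through reflection.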
Performance Overhead for Value Types
Unfortunately, this convenience provided by the .Net framework comes at a price. In the general case, the System.ValueType Equals implementation works by using reflection. Of course it has to, if we think about it: System.ValueType is a base type and knows nothing about the types you derive from it, so the only way to find out which fields our own type (in this case Person) has is to use reflection, which means that performance suffers.
Recommended Approach for Value Types
The recommended way is to override the Equals method for your own value types. We will see later how to do that in a way that runs faster. In fact, you will see that Microsoft has done this for many of the built-in value types that come with the framework and that we use every day in our code.
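As a preview, here is one possible shape such an override could take. This is only a minimal sketch under my own assumptions, not the full recommended pattern: PersonFast is a hypothetical name, and implementing IEquatable<T> alongside the override is a common convention rather than something this post has covered yet.

```csharp
using System;

// Hypothetical struct that supplies its own equality instead of relying on reflection.
public struct PersonFast : IEquatable<PersonFast>
{
    private readonly string _name;

    public PersonFast(string name) { _name = name; }

    public string Name { get { return _name; } }

    // Strongly typed comparison: no boxing and no reflection.
    public bool Equals(PersonFast other)
    {
        return string.Equals(_name, other._name);
    }

    // The Object.Equals override simply forwards to the typed method.
    public override bool Equals(object obj)
    {
        return obj is PersonFast && Equals((PersonFast)obj);
    }

    // Equal values must produce equal hash codes.
    public override int GetHashCode()
    {
        return _name == null ? 0 : _name.GetHashCode();
    }
}

public static class PersonFastDemo
{
    public static void Main()
    {
        PersonFast p1 = new PersonFast("Ehsan Sajjad");
        PersonFast p3 = new PersonFast("Ehsan Sajjad");

        Console.WriteLine(p1.Equals(p3));         // True, via Equals(PersonFast)
        Console.WriteLine(p1.Equals((object)p3)); // True, via the override
    }
}
```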
Static Equals Method
There is one problem with checking for equality using the virtual Equals method: what happens if one or both of the references you want to compare are null? Let’s see what happens when we call the Equals method with null as the argument. Let’s modify the existing example for that:
    static void Main(String[] args)
    {
        Person p1 = new Person("Ehsan Sajjad");

        Console.WriteLine(p1.Equals(null));
    }
If we compile and run this, it returns false, as it should. That makes perfect sense, because null is obviously not equal to a non-null instance, and it is a principle of equality in .Net that null should never evaluate as equal to a non-null value.
Now let’s reverse the situation: if the p1 variable is null, we have a problem. Suppose that we don’t create the instance ourselves with hardcoded code; instead the variable is passed as a parameter from some client code that uses this assembly, and we don’t know whether either of the values is null.
If p1 is passed as null, executing the Equals method call will throw a NullReferenceException, because you cannot call instance methods on a null reference.
The static Equals method was designed to address this problem. We can use it when we don’t know whether one of the objects could be null, like this:
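The failure is easy to reproduce. In the sketch below, PersonNullable and ThrowsOnNullReceiver are hypothetical names made up for this example; the helper simply reports whether calling the virtual Equals on its first argument throws.

```csharp
using System;

// Hypothetical stand-in for the article's Person class.
public class PersonNullable
{
    private readonly string _name;

    public PersonNullable(string name) { _name = name; }

    public string Name { get { return _name; } }
}

public static class NullReceiverDemo
{
    // Returns true if calling the virtual Equals on p1 throws NullReferenceException.
    public static bool ThrowsOnNullReceiver(PersonNullable p1, PersonNullable p2)
    {
        try
        {
            p1.Equals(p2);
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        PersonNullable someone = new PersonNullable("Ehsan Sajjad");

        Console.WriteLine(ThrowsOnNullReceiver(someone, null)); // False: null argument is fine
        Console.WriteLine(ThrowsOnNullReceiver(null, someone)); // True: null receiver throws
    }
}
```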
Console.WriteLine(object.Equals(p1,null));
Now we are good to go without worrying about whether either of the references is null. You can test it by running both scenarios and you will see that it works fine, and of course it returns false if only one of the reference variables is null.
Some of you may be wondering what would happen if both of the arguments passed to the static Equals method are null. You can add the following line to test that:
Console.WriteLine(object.Equals(null,null));
You will see that it returns true in this case. In .Net null is always equal to null, so testing whether null is equal to null should always evaluate to true.
If we dig into the implementation of the static Equals method, we will find that it is very simple. This is its logic, as you can see in the source code of the Object type:
    public static bool Equals(object x, object y)
    {
        if (x == y) // Reference equality only; overloaded operators are ignored
        {
            return true;
        }
        if (x == null || y == null) // Again, reference checks
        {
            return false;
        }
        return x.Equals(y); // Safe as we know x != null.
    }
It first checks whether both parameters refer to the same instance, which is what the == operator does for operands of type object. This check also evaluates to true, causing the method to return true, when both parameters are null. The next if block returns false if one of the parameters is null and the other one is not. Finally, if control reaches the last return statement, we know that both parameters point to some instance, so we just call the virtual Equals method.
This means that the static Equals method will always give the same result as the virtual method, except that it checks for null first. Because the static method calls the virtual method, if we override the virtual Equals method, our override will automatically be picked up by the static method. That’s important, as we want the static and virtual methods to behave consistently.
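The sketch below shows the static method picking up an override. CaseInsensitiveName is a hypothetical type invented for this illustration; its override treats names that differ only in letter case as equal, so we can tell that our own code, and not the default reference check, produced the result.

```csharp
using System;

// Hypothetical type whose Equals override ignores letter case.
public class CaseInsensitiveName
{
    private readonly string _value;

    public CaseInsensitiveName(string value) { _value = value; }

    public override bool Equals(object obj)
    {
        CaseInsensitiveName other = obj as CaseInsensitiveName;
        return other != null &&
               string.Equals(_value, other._value, StringComparison.OrdinalIgnoreCase);
    }

    public override int GetHashCode()
    {
        return StringComparer.OrdinalIgnoreCase.GetHashCode(_value ?? string.Empty);
    }
}

public static class StaticEqualsDemo
{
    public static void Main()
    {
        CaseInsensitiveName a = new CaseInsensitiveName("ehsan");
        CaseInsensitiveName b = new CaseInsensitiveName("EHSAN");

        Console.WriteLine(object.Equals(a, b));       // True: our override was called
        Console.WriteLine(object.Equals(a, null));    // False: null check, no exception
        Console.WriteLine(object.Equals(null, a));    // False: null check, no exception
        Console.WriteLine(object.Equals(null, null)); // True: null equals null
    }
}
```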
ReferenceEquals Method
ReferenceEquals serves a slightly different purpose from the two Equals methods which we have discussed above. It exists for those situations where we specifically want to determine whether two variables refer to the same instance. You may be wondering why we need a separate method when the Equals methods also check reference equality.
Those two methods do check reference equality by default, but they are not guaranteed to do so, because the virtual Equals method can be overridden to compare the values of the instances rather than their references.
So ReferenceEquals gives the same result as Equals for types that have not overridden the Equals method, such as the Person class we used above. For types that have overridden the Equals method, such as the String class, the results can be different.
Let’s modify the string class example that we used earlier
in this post to demonstrate this:
    static void Main(String[] args)
    {
        string s1 = "Ehsan Sajjad";
        string s2 = string.Copy(s1);

        Console.WriteLine(s1.Equals((object)s2));
        Console.WriteLine(ReferenceEquals(s1,s2));
    }
If we run this example, we will see that the first Equals call returns true just as before, but the ReferenceEquals call returns false. Why is that?
It is telling us that the two string variables are different instances even though they contain the same data. Recall from earlier that the String type overrides the Equals method to compare the values of two instances, not their references.
In C#, static methods cannot be overridden, which means you can never change the behavior of the ReferenceEquals method. That makes sense, because it always needs to do a reference comparison.
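To see that guarantee in action, here is a small sketch with a deliberately extreme type. AlwaysEqual is a hypothetical class whose Equals override claims equality with everything; even so, ReferenceEquals still tells distinct instances apart.

```csharp
using System;

// Hypothetical type whose Equals override says it is equal to anything.
public class AlwaysEqual
{
    public override bool Equals(object obj) { return true; }

    public override int GetHashCode() { return 0; }
}

public static class ReferenceEqualsDemo
{
    public static void Main()
    {
        AlwaysEqual a = new AlwaysEqual();
        AlwaysEqual b = new AlwaysEqual();

        Console.WriteLine(a.Equals(b));                  // True: the override decides
        Console.WriteLine(object.Equals(a, b));          // True: static Equals calls the override
        Console.WriteLine(object.ReferenceEquals(a, b)); // False: two different instances
        Console.WriteLine(object.ReferenceEquals(a, a)); // True: same instance
    }
}
```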
Summary
- We learned how .Net provides equality implementations for types out of the box.
- We saw that the .Net framework defines a few methods on the Object class which are available for all types.
- By default, the virtual Object.Equals method does reference equality for reference types and value equality for value types. For value types it uses reflection, which is a performance overhead.
- Any type can override the Object.Equals method to change the logic of how it checks for equality, e.g. String, Delegate and Tuple do this to provide value equality even though they are reference types.
- Object also provides a static Equals method which can be used when there is a chance that one or both of the parameters are null. Other than that, it behaves identically to the virtual Object.Equals method.
- There is also a static ReferenceEquals method which provides a guaranteed way to check for reference equality.