References
posted on: Apr 19, 2019

Since I started learning to program, one of the most confusing topics has always been how to pass data without unintended side effects. Sometimes we want the data to be immutable; sometimes we do want to modify it within a method. Essential C#, by Mark Michaelis with Eric Lippert, does a wonderful job of explaining this topic in depth. This entry summarizes my understanding of how it works in C#.

The most fundamental way to pass data in a program is to call a method and pass the data as arguments. Let’s begin with a simple method, which takes two numbers and returns their sum.

public int Add(int x, int y)
{
  return x + y;
}

int sum = Add(3, 5);

In this example, Add() is a method which takes x and y as parameters, and returns the sum of x and y. We call this method by passing 3 and 5 as arguments and store the returned value in sum.

So far, so good.

In C#, arguments are passed by value by default, which means that the value of the argument is copied into the target parameter. In our example, 3 is copied to x, and 5 is copied to y.
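To make the copying concrete, here is a minimal sketch. The class PassByValueDemo and the method AddTen are hypothetical names made up for this illustration. The method changes its parameter, but because the parameter is only a copy, the caller's variable keeps its value.

```csharp
using System;

public static class PassByValueDemo
{
    // Hypothetical helper: 'number' is a copy of whatever the caller passed.
    public static int AddTen(int number)
    {
        number += 10;   // modifies only the local copy
        return number;
    }

    public static void Main()
    {
        int original = 5;
        int result = AddTen(original);

        Console.WriteLine(original); // prints 5
        Console.WriteLine(result);   // prints 15
    }
}
```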

Even when we pass an object (a reference type) to a method, it is passed by value. For a reference type, however, the value that gets copied is the reference: the memory address, or location, where the data associated with the object is stored.

For example, the following method takes an Employee object as a parameter. Any changes made to the object through employee remain visible after the method returns.

public void Promote(Employee employee) 
{
    employee.Salary += 10000;
}

// Initialize an employee with a base salary of 5000
Employee raj = new Employee(5000); 
Console.WriteLine(raj.Salary); // prints 5000

Promote(raj);

Console.WriteLine(raj.Salary); // prints 15000

When we pass raj as an argument to Promote(Employee employee), the reference held in raj is copied into the employee parameter. Both variables now refer to the same Employee object, so the method can modify that object's Salary through its own copy of the reference. Hence, the change made to Salary is visible outside of Promote().

Here is an illustration of how this works step-by-step. When Promote() returns, raj still holds a reference to the same Employee object in memory, whose Salary has been modified by Promote().



Now, because the reference was passed by value, reassigning the parameter inside Promote() is not reflected in the original variable. For example, if we set employee to null inside Promote(), raj will still point to the original employee.

public void Promote(Employee employee)
{
    employee = null;
}

// Initialize an employee with a base salary of 5000
Employee raj = new Employee(5000); 
Console.WriteLine(raj.Salary); // prints 5000

Promote(raj);

Console.WriteLine(raj.Salary); // still prints 5000

This can be confusing. If we are passing a reference, shouldn’t changes made to the reference be visible outside the method? It turns out that the method receives its own copy of the caller’s reference, so reassigning the parameter changes only that copy and leaves the caller’s variable untouched.

It does not matter whether the argument is a value type or a reference type. What matters is whether the called method can write a value into the caller’s original variable.

If a reference type variable is passed by value, the reference itself is copied from the caller to the method parameter. As a result, the target method cannot update the caller variable’s value but it may update the data referred to by the reference.

The above scenario can be illustrated as follows:



Now, if we do want changes made to the reference itself to be visible to the caller, C# provides the ref keyword, which allows us to pass the variable by reference.

Consider the following example, which is similar to the previous one, with a small change. Both the parameter in the declaration of Promote() and the argument at the call site are prefixed with the keyword ref. If we assign a new Employee object to the parameter, the caller’s variable changes as well.

public void Promote(ref Employee employee)
{
    employee = new Employee(6000);
}

// Initialize an employee with a base salary of 5000
Employee raj = new Employee(5000); 
Console.WriteLine(raj.Salary); // prints 5000

Promote(ref raj);

Console.WriteLine(raj.Salary); // prints 6000

When the called method specifies a parameter as ref, it indicates that an argument is passed by reference, not by value. This also means that the parameter essentially becomes an alias for the argument. In other words, any operation on the parameter is made on the argument.

For example, if the caller passes a local variable expression or an array element access expression, and the called method replaces the object to which the ref parameter refers, then the caller’s local variable or the array element now refers to the new object when the method returns.
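The aliasing behaviour can be sketched with a small generic swap. RefAliasDemo and Swap are hypothetical names used only for this illustration. The same method works on local variables of a value type and on elements of an array of a reference type, because in each case the ref parameters are aliases for the caller's storage locations.

```csharp
using System;

public static class RefAliasDemo
{
    // Hypothetical helper: 'first' and 'second' alias the caller's variables,
    // so assigning to them writes directly into those variables.
    public static void Swap<T>(ref T first, ref T second)
    {
        T temp = first;
        first = second;
        second = temp;
    }

    public static void Main()
    {
        // Local variables of a value type.
        int a = 1;
        int b = 2;
        Swap(ref a, ref b);
        Console.WriteLine($"{a}, {b}"); // prints 2, 1

        // Array elements of a reference type.
        string[] names = { "Raj", "Mia", "Lee" };
        Swap(ref names[0], ref names[2]);
        Console.WriteLine(string.Join(", ", names)); // prints Lee, Mia, Raj
    }
}
```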

To use ref, the following prerequisites must be met:

  • The argument must be a variable.
  • The variable must be initialized before being passed.

Sometimes we want to pass a variable to a method before initializing it, so that the method can assign it. For this, C# provides another keyword, aptly named out. An out parameter works just like a ref parameter: in both cases the argument is passed by reference. The only difference is which requirements the C# compiler enforces. They are as follows:

  • Variables passed as out arguments do not have to be initialized before the method call.
  • However, the method must assign the out parameter before it returns.
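Both rules can be seen in a short sketch. OutDemo and TryDivide are hypothetical names invented for this example. The caller declares result without initializing it, and the method assigns the out parameter on every path before returning.

```csharp
using System;

public static class OutDemo
{
    // Hypothetical helper: the compiler requires 'quotient' to be assigned
    // on every path before the method returns.
    public static bool TryDivide(int dividend, int divisor, out int quotient)
    {
        if (divisor == 0)
        {
            quotient = 0;   // still has to be assigned on the failure path
            return false;
        }

        quotient = dividend / divisor;
        return true;
    }

    public static void Main()
    {
        int result; // declared, but not initialized

        if (TryDivide(10, 2, out result))
        {
            Console.WriteLine(result); // prints 5
        }

        bool succeeded = TryDivide(1, 0, out result);
        Console.WriteLine(succeeded); // prints False
    }
}
```

Replacing out with ref in this sketch would make the first call fail to compile, since result has not been initialized at that point.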