Lambda Expressions: A C# 3.0 Language Enhancement

logicchild

13 Jul 2009 | CPOL | 12 min read

An article that describes the C# 3.0 Lambda Expression feature

Introduction

If you plan on using LINQ, you will need to understand lambda expressions. A lambda expression is a shorthand for an anonymous method: an unnamed method written in place of a delegate instance. Anonymous methods are a C# 2.0 feature that C# 3.0 lambda expressions have largely subsumed. The C# compiler converts a lambda expression to either:

A delegate instance
An expression tree, of type Expression<TDelegate>, representing the code inside the lambda expression in a traversable object model. This allows the lambda expression to be interpreted later at runtime.
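
As a small sketch of the two conversion targets listed above, the same lambda text can be assigned to a delegate type or to an Expression<TDelegate>. The class and field names here are illustrative only, and Func<int, int> is a generic delegate type supplied by the framework:

```csharp
using System;
using System.Linq.Expressions;

public static class ConversionTargets
{
    // Converted by the compiler into a delegate instance (executable code).
    public static readonly Func<int, int> AsDelegate = x => x * x;

    // Converted by the compiler into an object model describing the code.
    public static readonly Expression<Func<int, int>> AsTree = x => x * x;

    public static void Main()
    {
        // A delegate can be invoked directly.
        Console.WriteLine(AsDelegate(3));

        // A tree can be inspected as data; the root of this body is a multiplication node.
        Console.WriteLine(AsTree.Body.NodeType);

        // A tree must be compiled into a delegate before it can be invoked.
        Func<int, int> compiled = AsTree.Compile();
        Console.WriteLine(compiled(3));
    }
}
```

Both invocations produce 9. The rest of this article deals only with the first form, the delegate instance.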

This article will bring lambda expressions into sharper focus by first covering delegates and anonymous methods, and then writing code that uses lambda expressions to build query expressions. It will not cover expression trees. It is essential, however, to first understand (or get a tune-up in) delegates. And while this article is meant for the beginner, it does assume knowledge of generics. So to start, let’s just say that a delegate dynamically wires up a method caller to its target method. A delegate type defines a protocol to which the caller and target will conform, comprising a list of parameter types and a return type. So what does that mean? Let’s start by defining, and then examining, delegates.

Delegates

Delegates are special types that represent a strongly typed method signature. Delegate types derive from the special System.MulticastDelegate type, which itself derives from System.Delegate (which in turn derives from System.Object). Delegates are therefore reference types, not value types. A delegate can be instantiated and formed over any target method and instance combination where the method matches the delegate’s signature. In C#, a new delegate type is created using the delegate keyword:

public delegate void MyDelegate(int x, int y);

This says that we create a new delegate type, called MyDelegate, which can be constructed over methods that return void and accept two arguments, each typed as int. A delegate instance literally acts as a delegate for the caller: the caller invokes the delegate, and then the delegate calls the target method. Again, a delegate type declaration is preceded by the keyword delegate. If this still seems abstract, consider the code examples below:

using System;
class App {
private delegate string GrabString();
public static void Main(string[]  args) {
int x = 40;
GrabString MyStringMethod = new GrabString(x.ToString);
Console.WriteLine("String is " + MyStringMethod());
   }
}

The output is ‘String is 40’. With MyStringMethod initialized to x.ToString, the last statement is equivalent to saying:

Console.WriteLine("String is " + x.ToString());

In this code, you instantiate a delegate of type GrabString, and you initialize it so that it refers to the ToString() method of the variable x. Delegates in C# always syntactically take a one-parameter constructor, the parameter being the method to which the delegate will refer. This method must match the signature with which you originally defined the delegate, and its return type must match as well. Notice that because int.ToString() is an instance method (as opposed to a static method), you need to specify the instance (x) as well as the name of the method to initialize the delegate. Now consider these other examples:

using System;
delegate int Transformer (int x);
// to create a delegate instance, you can assign a method to a delegate variable
public class Program {
public static void Main() {
 Transformer t = Square;
 // the above creates a delegate instance
 int result = t(3); // this invokes the delegate
 Console.WriteLine(result);
 }
private static int Square (int x ) { return x * x;  }
}

The output is 9. Invoking a delegate is just like invoking a method (since the delegate’s purpose is merely to provide a level of indirection):

t(3);

This statement:

Transformer t = Square;

is shorthand for:

Transformer t = new Transformer(Square);

On that note, consider the code below:

  using System;
  public delegate void MyDelegate(int x, int y);
  public sealed class Program {
  static void PrintPair(int a, int b)
  {
      Console.WriteLine("a = {0}", a);
      Console.WriteLine("b = {0}", b);
  }
   public static void Main()
  {
      // Implied 'new MyDelegate(PrintPair)' (PrintPair is static, so there is no 'this'):
      MyDelegate del = PrintPair;
      // Implied 'Invoke':
      del(10, 20);
  }
}

The output of the code above is:

a = 10
b = 20

Anonymous Delegates

Anonymous delegates are a feature of the C# 2.0 language, not of the Common Language Runtime itself. The code above used named methods, methods that we would call out to. Since delegates permit you to pass methods as arguments to other methods, it is often preferable to write the block of code inline rather than set up another method by hand. Anonymous delegates permit you to do this. Consider a method that takes a delegate and applies it a number of times:

delegate int IntIntDelegate(int x);
void TransformUpTo(IntIntDelegate d, int max)
{
    // Apply the delegate to every integer from 0 through max.
    for (int i = 0; i <= max; i++)
        Console.WriteLine(d(i));
}

If we wanted to pass a function to TransformUpTo that squared the input, we'd have to first write an entirely separate method over which we'd form a delegate. But anonymous delegates accomplish the same thing by an anonymous method, which is defined inline. In other words, in the actual line of code where it is referenced or needed, we avoid calling out to a named method. Rather, we take the body of the method and place it in the line of code where we actually make the function call:

TransformUpTo(delegate(int x) { return x * x; }, 10);
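
The fragments above can be assembled into a complete program along the following lines. This is a sketch rather than the author's listing: the class name TransformDemo is made up, and TransformUpTo here returns its results in a list instead of printing them, so that the values can be examined by the caller:

```csharp
using System;
using System.Collections.Generic;

public static class TransformDemo
{
    public delegate int IntIntDelegate(int x);

    // Applies d to every integer from 0 through max (inclusive)
    // and collects the results in order.
    public static List<int> TransformUpTo(IntIntDelegate d, int max)
    {
        List<int> results = new List<int>();
        for (int i = 0; i <= max; i++)
            results.Add(d(i));
        return results;
    }

    public static void Main()
    {
        // The squaring logic is written inline as an anonymous method;
        // no separate named method is declared for it.
        List<int> squares = TransformUpTo(delegate(int x) { return x * x; }, 4);

        foreach (int n in squares)
            Console.WriteLine(n);   // 0, 1, 4, 9, 16 on separate lines
    }
}
```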

In other words, an anonymous method is a block of code supplied where a delegate instance is expected. The example below should make clear the distinction between using named methods and using anonymous methods. The example defines a MathsOperations class that has a couple of static methods to perform two operations on doubles. Then you use delegates to call these methods. The math class looks like this:

class MathsOperations
  {
    public static double MultiplyByTwo(double value)
    {
      return value*2;
    }

    public static double Square(double value)
    {
      return value*value;
    }
  }

You call up these methods like this:

using System;
delegate double DoubleOp(double x);
class MainEntryPoint
  {
    static void Main()
    {
      DoubleOp [] operations = 
            {
              new DoubleOp(MathsOperations.MultiplyByTwo),
              new DoubleOp(MathsOperations.Square)
            };

      for (int i=0 ; i < operations.Length ; i++)
      {
        Console.WriteLine("Using operations[{0}]:", i);
        ProcessAndDisplayNumber(operations[i], 2.0);
        ProcessAndDisplayNumber(operations[i], 7.94);
        ProcessAndDisplayNumber(operations[i], 1.414);
        Console.WriteLine();
      }
      Console.ReadLine();
    }

    static void ProcessAndDisplayNumber(DoubleOp action, double value)
    {
      double result = action(value);
      Console.WriteLine("Value is {0}, result of operation is {1}", value, result);
    }
  }

In this code, you instantiate an array of DoubleOp delegates. Each element of the array is initialized to refer to a different operation implemented by the MathsOperations class. Then, you loop through the array, applying each operation to three different values. This illustrates one way of using delegates: you can group methods together in an array so that you can call several methods in a loop. The key lines in this code are the ones in which you actually pass each delegate to the ProcessAndDisplayNumber() method, for example:

ProcessAndDisplayNumber(operations[i], 2.0);

Here you are passing in the name of the delegate but without any parameters. Given that operations[i] is a delegate, syntactically:

operations[i] means the delegate (that is, the method represented by the delegate)
operations[i](2.0) means actually call this method, passing in the value in parentheses

The ProcessAndDisplayNumber() method is defined to take a delegate as its first parameter:

static void ProcessAndDisplayNumber(DoubleOp action, double value)

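Then, inside this method, the following call invokes whichever method the delegate instance wraps and stores its return value in result: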

double result = action(value);

Running this code outputs the following:

Using operations[0]:
Value is 2, result of operation is 4
Value is 7.94, result of operation is 15.88
Value is 1.414, result of operation is 2.828

Using operations[1]:
Value is 2, result of operation is 4
Value is 7.94, result of operation is 63.0436
Value is 1.414, result of operation is 1.999396

Now if anonymous methods were used, the code would look like this, but yield the same result:

using System;
using System.Collections.Generic;
using System.Text;
public sealed class Program
  {
    delegate double DoubleOp(double x);

   public static void Main(string[] args)
    {
      DoubleOp multByTwo = delegate(double val) {return val * 2;};
      DoubleOp square = delegate(double val) { return val * val; };

      DoubleOp [] operations = {multByTwo, square};

      for (int i=0 ; i < operations.Length ; i++)
      {
        Console.WriteLine("Using operations[{0}]:", i);
        ProcessAndDisplayNumber(operations[i], 2.0);
        ProcessAndDisplayNumber(operations[i], 7.94);
        ProcessAndDisplayNumber(operations[i], 1.414);
        Console.WriteLine();
      }

    }

    static void ProcessAndDisplayNumber(DoubleOp action, double value)
    {
      double result = action(value);
      Console.WriteLine(
         "Value is {0}, result of operation is {1}", value, result);
    }
  }

Because anonymous methods are used in this example, the first class, MathsOperations, can be eliminated completely. We defined each unnamed method inline: the body of the method is placed in the statement that creates the delegate instance. We did not have to declare a named method and then refer to it. More to the point, each anonymous method becomes an instance of a delegate whose parameter list and return type it matches.

A lambda expression (C# 3.0) is likewise an unnamed method written in place of a delegate instance. As stated earlier, the compiler converts the lambda expression to either a delegate instance or an expression tree of type Expression<TDelegate>. In the following example, square is assigned the lambda expression x => x * x:

using System;
delegate int Transformer ( int i );
 public sealed class Program {
 public static void Main() {
 Transformer square = x => x * x;
 Console.WriteLine( square(3));
   }
}

The output is 9, obviously. The above example could also be rewritten by turning the lambda expression into a named method and then calling that method through the delegate. This is essentially the translation the compiler performs for you when you assign a lambda expression to a delegate:

using System;
delegate int Transformer ( int i );
 public sealed class Program {
 public static void Main() {
 Transformer square = Square;
Console.WriteLine(square(3));
 }
private static int Square ( int x ) { return x * x; }
}

Notice that the lambda syntax is similar to, but shorter than, that of the earlier example that used an anonymous delegate, or unnamed method. Also, recall that anonymous delegates (C# 2.0) are a language feature of C#, but not of the Common Language Runtime. This is an important distinction among the language features that came with .NET 2.0 and C# 3.0. Partial classes, generics, and anonymous methods were all language enhancements that came with the introduction of the .NET 2.0 programming platform. But, unlike generics, which required changes to the runtime itself, anonymous methods need no runtime support. All of the work happens at the level of the compiler, and the same is true of lambda expressions that are converted to delegates.

A Word about LINQ

LINQ, or Language Integrated Query, is a set of C# 3.0 language enhancements and .NET Framework features for writing structured, type-safe queries over local object collections and remote data sources. As anyone familiar with C# knows, collections start where arrays leave off. They are classes used for grouping and managing related objects, and they permit you to store, look up, and iterate over those objects. The basic units of data in LINQ are sequences and elements. A sequence is any object that implements the generic IEnumerable<T> interface, and an element is each item in the sequence. Consider this example:

using System;
using System.Collections.Generic;
using System.Linq;
public class Example {
public static void Main() {
 string [] names = { "JumboDee", "Butler", "Mitch" };
 IEnumerable<string> filteredNames = names.Where ( n => n.Length >= 6);
 foreach (string name in filteredNames)
 Console.Write(name + "|");
   }
 }

So we began by declaring and initializing an array of strings, each of which represents a name. The general concept of an enumerator is that of a type whose sole purpose is to advance through and read another collection’s contents, so enumerators do not provide write capabilities. In generics, IEnumerable<T> represents a type whose contents can be enumerated. If something is being enumerated, or read through one item at a time, then we can say that it is a traversal process. We can also say that it is an inquiry, or a query. A query is an expression that transforms sequences with query operators. The simplest query comprises one sequence and one operator. In the example above, we applied the Where operator (System.Linq.Enumerable.Where) to an array of strings to extract those whose length is at least 6 characters. With this filter, the output is:

JumboDee|Butler|

Most query operators accept lambda expressions as an argument. The lambda expression helps guide and shape the query. In our example, the lambda expression was:

n => n.Length >= 6

The lambda expression n => n.Length >= 6 illustrates that the input argument corresponds to an input element. In our example, n represents each name in the array and is of type string; the compiler infers that type from the string array on which Where is called. The Where operator requires that the lambda expression return a bool value, which, if true, indicates that the element should be included in the output sequence. An expression that returns a bool value is called a predicate. Here is the signature, or prototype, of Where:

public static IEnumerable<TSource> Where<TSource>
(this IEnumerable<TSource> source, Func<TSource,bool>  predicate)
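
To make the role of the predicate parameter concrete, here is a simplified stand-in with the same shape as that signature. It is a sketch for illustration, not the framework's source code; the names WhereSketch and MyWhere are hypothetical, and argument checking is omitted:

```csharp
using System;
using System.Collections.Generic;

public static class WhereSketch
{
    // Hypothetical stand-in for Enumerable.Where: calls the predicate once
    // per element and yields only the elements for which it returns true.
    public static IEnumerable<TSource> MyWhere<TSource>(
        this IEnumerable<TSource> source, Func<TSource, bool> predicate)
    {
        foreach (TSource element in source)
        {
            if (predicate(element))
                yield return element;
        }
    }

    public static void Main()
    {
        string[] names = { "JumboDee", "Butler", "Mitch" };

        // The lambda is converted to a Func<string, bool> delegate instance.
        foreach (string name in names.MyWhere(n => n.Length >= 6))
            Console.Write(name + "|");   // JumboDee|Butler|
    }
}
```

Seen this way, a query operator is the same pattern as the earlier delegate examples: a method that accepts a delegate and applies it to each element in turn.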

The following query retrieves all names that contain the letter “a” (with the sample array above, none of the names do, so the result is an empty sequence):

IEnumerable<string> filteredNames = names.Where (n => n.Contains ("a"));

Thus the purpose of the lambda expression depends on the particular query operator. With the Where operator, it indicates whether an element should be included in the output sequence. With the OrderBy operator, the lambda expression maps each element in the input sequence to its sorting key. With the Select operator, the lambda expression determines how each element in the input sequence is transformed before being fed into the output sequence. This, in turn, means that a lambda expression in a query operator always works on individual elements in the input sequence, not on the sequence as a whole.
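
The three roles described above can be put side by side using the same names array. The class and method names below are illustrative only; Where, OrderBy, Select, and ToArray are the standard operators from System.Linq:

```csharp
using System;
using System.Linq;

public static class OperatorRoles
{
    public static readonly string[] Names = { "JumboDee", "Butler", "Mitch" };

    // Where: the lambda is a predicate that decides which elements are kept.
    public static string[] LongNames()
    {
        return Names.Where(n => n.Length >= 6).ToArray();
    }

    // OrderBy: the lambda maps each element to its sorting key.
    public static string[] ByLength()
    {
        return Names.OrderBy(n => n.Length).ToArray();
    }

    // Select: the lambda transforms each element into its output form.
    public static string[] UpperCased()
    {
        return Names.Select(n => n.ToUpperInvariant()).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("|", LongNames()));   // JumboDee|Butler
        Console.WriteLine(string.Join("|", ByLength()));    // Mitch|Butler|JumboDee
        Console.WriteLine(string.Join("|", UpperCased()));  // JUMBODEE|BUTLER|MITCH
    }
}
```

In each case the lambda receives one element at a time; only the meaning of its return value changes from operator to operator.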

A Move From C# 2.0 to C# 3.0

At this point, we will review the concepts illustrated so far and weld them together. Below is some basic code. We declare a delegate type and then define a method, PerformOperationOnStringArray, that takes an instance of that delegate as an input parameter. This method accepts a string array and a delegate instance (some test to perform on a string); for each string in the array, it applies the test and keeps the strings that pass. Next, we create a method, StartsWithA, whose signature matches the delegate declaration. Nothing attaches the two together except the matching signature. However, when this method is passed as an argument to PerformOperationOnStringArray, the compiler wraps it in an instance of the delegate. Now we can use PerformOperationOnStringArray, which knows how to loop through an array of strings (passed in as the first parameter) and apply an algorithm to each one (the method that encapsulates the algorithm is passed in as the second parameter):

using System;
using System.Collections.Generic;
using System.Text;
public sealed class Program
    {
        
        public delegate bool FunctionForString(string s);
      
        public static string[] PerformOperationOnStringArray
			(string[] myStrings, FunctionForString myFunction)
        {
            System.Collections.ArrayList myList = new System.Collections.ArrayList();
            foreach (string s in myStrings)
            {
                if (myFunction(s))
                {
                    myList.Add(s);
                }
            }

            return (string[])myList.ToArray(typeof(string));
        }
        
        public static bool StartsWithA(string s)
        {
            return s.StartsWith("A");
        }
        
        static void Main(string[] args)
        {
            string[] myStrings = { "Adam", "Alan", "Bob", "Steve", "Jim", "Alberto" };

            string[] stringsA = PerformOperationOnStringArray(myStrings, StartsWithA);
            
            foreach (string s in stringsA)
                Console.WriteLine(s);

            Console.ReadLine();
        }
    }

The compiler, not the CLR, converts the method name StartsWithA into a FunctionForString instance at the call site. We compile and execute the code, which outputs:

Adam
Alan
Alberto

Again, with the delegate keyword we are defining a function type, just as we would define a class: the declaration fixes the parameter list and the return type. The method PerformOperationOnStringArray() is passed a string array and a function to apply to each string in it, and it returns a string array. It iterates through each item in the string array and calls the myFunction delegate that was passed in, keeping the strings for which the call returns true. Inside this method we know nothing about what myFunction does; all we know is that it is of type FunctionForString, which takes a string and returns a bool.

Now notice the other method declaration: StartsWithA(string s). Nothing connects this method to the delegate declaration except that both take a string and both return a Boolean value. In Main, we first create an array of strings that represent names. We then call PerformOperationOnStringArray, passing in the names and the StartsWithA method, and store the returned array in stringsA. The StartsWithA method is executed against each item in the array, and every string that begins with an A appears on the console. But how do we improve this code? By using anonymous delegates:

using System;
using System.Collections.Generic;
using System.Text;
public sealed class Program
    {
        
        public delegate bool FunctionForString(string s);

        public static string[] PerformOperationOnStringArray
		(string[] myStrings, FunctionForString operation)
        {
            System.Collections.ArrayList myList = new System.Collections.ArrayList();
            foreach (string s in myStrings)
            {
                if (operation(s))
                {
                    myList.Add(s);
                }
            }

            return (string[])myList.ToArray(typeof(string));
        }

       public static void Main(string[] args)
        {
            string[] myStrings = { "Adam", "Alan", "Bob", "Steve", "Jim", "Alberto" };

            string[] stringsA = PerformOperationOnStringArray
		(myStrings, delegate(string s) { return s.StartsWith("A"); });

            foreach (string s in stringsA)
                Console.WriteLine(s);

            Console.ReadLine();
        }
   }

The result is the same, but without having to define the separate named method StartsWithA. Now we will use a lambda expression to subsume the anonymous method:

using System;
using System.Collections.Generic;
using System.Text;

 public sealed class Program
    {

       public delegate bool FunctionForString(string s);


       public static string[] PerformOperationOnStringArray
		(string[] myStrings, FunctionForString operation)
        {
            System.Collections.ArrayList myList = new System.Collections.ArrayList();
            foreach (string s in myStrings)
            {
                if (operation(s))
                {
                    myList.Add(s);
                }
            }
            return (string[])myList.ToArray(typeof(string));
        }


       public static void Main(string[] args)
        {
            string[] myStrings = { "Adam", "Alan", "Bob", "Steve", "Jim", "Alberto" };

            string[] stringsA = PerformOperationOnStringArray
				(myStrings, (s => s.StartsWith("A")));

            foreach (string s in stringsA)
                Console.WriteLine(s);

            Console.ReadLine();

        }
    }

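The results are the same, and we are still not using LINQ; we are just trying to understand the basics of the lambda expression in C# 3.0. Early on in the study of lambda expressions, however, we run into generics. In the next version, the delegate is declared with a type parameter T, so the same delegate type can describe a test over items of any type, not only strings. This moves us from thinking about one particular collection toward the sequences and elements of LINQ: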

using System;
using System.Collections.Generic;
using System.Text;
public sealed class Program
    {
   public delegate bool FunctionForAnything<T>(T item);
   public static string[] PerformOperationOnStringArray
	(string[] myStrings, FunctionForAnything<string> operation)
        {
            System.Collections.ArrayList myList = new System.Collections.ArrayList();
            foreach (string s in myStrings)
            {
                if (operation(s))
                {
                    myList.Add(s);
                }
            }
            return (string[])myList.ToArray(typeof(string));
        }
public  static void Main(string[] args)
        {
            string[] myStrings = { "Adam", "Alan", "Bob", "Steve", "Jim", "Alberto" };
            string[] stringsA = PerformOperationOnStringArray
				(myStrings, (s => s.StartsWith("A")));
            foreach (string s in stringsA)
                Console.WriteLine(s);
            Console.ReadLine();
        }
    }

Guess what the result is.
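
The listing above makes the delegate generic, but the method that uses it still works only on string arrays. A possible next step, sketched here under names that are not from the article (GenericFilter, PerformOperation), is to make the method generic as well, so that it accepts any IEnumerable<T>:

```csharp
using System;
using System.Collections.Generic;

public static class GenericFilter
{
    public delegate bool FunctionForAnything<T>(T item);

    // Hypothetical generic counterpart of PerformOperationOnStringArray:
    // keeps the items of any sequence for which 'operation' returns true.
    public static List<T> PerformOperation<T>(
        IEnumerable<T> items, FunctionForAnything<T> operation)
    {
        List<T> kept = new List<T>();
        foreach (T item in items)
        {
            if (operation(item))
                kept.Add(item);
        }
        return kept;
    }

    public static void Main()
    {
        string[] myStrings = { "Adam", "Alan", "Bob", "Steve", "Jim", "Alberto" };
        int[] numbers = { 1, 2, 3, 4, 5, 6 };

        // T is inferred as string for the first call and as int for the second.
        List<string> stringsA = PerformOperation(myStrings, s => s.StartsWith("A"));
        List<int> evens = PerformOperation(numbers, n => n % 2 == 0);

        foreach (string s in stringsA)
            Console.WriteLine(s);   // Adam, Alan, Alberto
        foreach (int n in evens)
            Console.WriteLine(n);   // 2, 4, 6
    }
}
```

FunctionForAnything<T> has the same shape as the framework's Func<T, bool>, and PerformOperation plays much the same role as the Where operator shown earlier, which is why an understanding of delegates and lambdas carries over directly to LINQ.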

Those who study lambda expressions first will be able to use LINQ more effectively. LINQ plays a significant role in .NET 3.5 and Visual Studio 2008, and it has a direct impact on how SQL and XML data are used from code, because both are data-centric. LINQ enables the developer to query any collection that implements IEnumerable<T>, whether an array, a list, or an XML DOM, as well as remote data sources, such as tables in SQL Server. The generic IEnumerable<T> interface is the foundation on which these queries are built. Recall that IEnumerable<T> represents a type whose contents can be enumerated, while IEnumerator<T> is the type responsible for performing the actual enumeration. Finally, anyone who will use LINQ must understand lambda expressions.

History

13th July, 2009: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

logicchild

Software Developer Monroe Community

United States
