LINQ: Deferred Execution

This is part of a short series on the basics of LINQ:

In the first rant post of this short series on LINQ, I explained the motivation behind writing this series in the first place, which can be summarised as:

People don't know LINQ and that impacts my ability to make use of it; I should try to fix that.

To start, I'm going to explain what I believe is the most important concept in LINQ; deferred execution.

So, what is deferred execution?

Deferred execution code is not executed until the result is needed; the execution is put off (deferred) until later. By doing this we can combine a series of actions without actually executing any of them, then execute them at the time we need a result. This allows us to limit the execution of computationally expensive operations until we absolutely need them.

That's my description, here is one from an MSDN tutorial on LINQ-to-XML that perhaps puts it more clearly:

Deferred execution means that the evaluation of an expression is delayed until its realized value is actually required. Deferred execution can greatly improve performance when you have to manipulate large data collections, especially in programs that contain a series of chained queries or manipulations. In the best case, deferred execution enables only a single iteration through the source collection.

Some may be surprised to know that deferred execution was not new when LINQ arrived, it had already been around for quite some time in the form of iterator methods. In fact, it is iterator methods that give LINQ its deferred execution. Before we look at an iterator method, let's look at an example of immediate execution. For this example, we will give ourselves the task of taking a collection of people and outputting a collection of unique last names for all those born before 1980.

When this method is called, it iterates over the entire collection and then returns its result, which also can then be iterated. If the collection of people were huge and we only cared about the first five names, this would be incredibly slow. To turn this into deferred execution, we can write it like this:

Now, none of the code in this method is executed until the first time something calls MoveNext() on the returned enumerable. This means we could take the first five names without processing the entire collection of people, giving us a potentially enormous performance gain. If each item in the generated collection were computationally expensive to produce, without lazy evaluation, that expense would be multiplied by the total number of items in the collection on every call to the generator method; however, with lazy evaluation, the consumer of the collection gets to decide how many items are computed and therefore, how much work gets done. This ability to defer and control computationally expensive operations is the power of deferred execution.

However, not every deferred action necessarily has low overhead. Deferred execution actually comes in two flavours; eager evaluation and lazy evaluation (the example above is an example of lazy evaluation)1. Every action in a deferred execution chain uses either lazy or eager evaluation. Though lazy evaluation is preferred, sometimes it is not possible to evaluate one item at a time, such as when sorting. Eagerly evaluated deferred execution allows us to at least defer the effort until we want it done.

An eagerly evaluated version of the iterator method we have been looking at might look like this:

In this example, because it is still an iterator method (it returns its results using yield return), none of the code is executed until the very first time the MoveNext() method is called on the returned enumerable, and therefore, the execution is deferred. When MoveNext() gets called for the first time, the entire collection of data is processed at once and then the results are output one by one as needed. The difference between this and the immediate execution equivalent we first looked at is that in this version, no work is done until a result is demanded.

Allowing the consumer of a collection to control how much work is done rather than work being dictated by the collection generator allows us to manage data more efficiently by building chains of operations and then processing the result in one go when needed. Lazy evaluation gives us the additional ability to spread the effort across each call to MoveNext(). The key to writing good LINQ is understanding which actions are immediate, which are deferred and lazily evaluated, which are deferred and eagerly evaluated, and why it matters. We will take a look at that next time.


  1. Quite often, people use 'deferred execution' and 'lazy evaluation' interchangeably, but they are not actually synonymous, nor is 'immediate execution' synonymous with 'eager evaluation'. 

7 thoughts on “LINQ: Deferred Execution”

  1. I just read a blog post about HashSet and this seemed a nice fit to make your eager deferred execution method lazy. Nice post.

    1. FTFY, sorry for the annoying code formatting in comments. I'll look at improving that somehow. In the meantime, you can use <pre> for multiline code examples instead of <code> and it works.

      Thanks for your comment. You are correct, this can be written as a lazily evaluated method; my example was merely to demonstrate eager evaluation in deferred execution (there's also a lazy evaluation implementation in the post). As for being able to convert eager evaluation into lazy evaluation, that would only be possible in some situations, not all. For example, sorting cannot be implemented with lazy evaluation.

  2. Great explanation of the different evaluation types of deferred execution: lazy vs. eager. Thanks!

Leave a Reply

Your email address will not be published. Required fields are marked *