LINQ Performance Pitfall - Deferred Execution

Beware Pitfalls... / Daniel C. Griliopoulos, CC-BY-NC-SA

When using LINQ, queries can bloat to dozens of lines. My personal style is to take these queries and break them apart into smaller units of logic. To each unit of logic, I append a call to ToArray. @yosit asked me why I did it, and I answered that I was avoiding a possible pitfall. Here's what I meant.

Take the following code for instance:

static void Main(string[] args)
{
    int[] arr = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    var filter = from n in arr
                 where VeryLongOperation(n)
                 select n;

    var cartesian = from n in filter
                    from m in filter
                    select 2 * n + 3 * m;

    var result = cartesian.ToArray();
}

Imagine VeryLongOperation only accepts numbers up to 3 and prints a running count of its calls to the debug output. It looks as though the very long operation will run only once per number, so you'd expect just nine calls, but here's the actual output you get in the debug window once you run this code:

Operation #1
Operation #2
Operation #3
...
Operation #36

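This is caused by LINQ's deferred execution: a query does no work until it is enumerated, and every enumeration of filter runs the where clause over the whole array again. The outer loop enumerates filter once, which costs 9 calls. Three of those numbers pass, and each of them starts an inner loop that enumerates filter from scratch, costing 9 more calls each (27 together). Put another way, you have 3 calls that each cause a 9-call inner loop and 6 that don't cause inner loop calls. 3 + 27 + 6 = 36.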
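
The count can be reproduced with a small sketch. The post doesn't show the body of VeryLongOperation, so the version below (a call counter plus an n <= 3 test) is an assumption, as are the DeferredDemo and CountCallsDeferred names. The second query is written in method syntax, which is what the compiler turns the two from clauses into, to make the repeated enumeration visible.

```csharp
using System;
using System.Linq;

static class DeferredDemo
{
    // Hypothetical call counter, standing in for the debug output in the post.
    static int calls;

    // Assumed body: the post only says it accepts numbers up to 3.
    static bool VeryLongOperation(int n)
    {
        calls++;
        return n <= 3;
    }

    // Runs the two queries from the post and returns the number of
    // VeryLongOperation calls they caused.
    public static int CountCallsDeferred()
    {
        calls = 0;
        int[] arr = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

        // Nothing runs here: 'filter' only describes the query.
        var filter = from n in arr
                     where VeryLongOperation(n)
                     select n;

        // Method-syntax form of "from n in filter from m in filter select ...".
        // The lambda 'n => filter' returns the same lazy query for every outer
        // item, so each outer item triggers a fresh pass over 'arr'.
        var cartesian = filter.SelectMany(n => filter, (n, m) => 2 * n + 3 * m);

        cartesian.ToArray();   // the only line that actually enumerates
        return calls;
    }

    static void Main()
    {
        // 9 calls for the outer pass + 3 inner passes of 9 calls each
        Console.WriteLine(CountCallsDeferred());   // prints 36
    }
}
```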

Let's make a slight alteration to the original filter query:

var filter = (from n in arr
              where VeryLongOperation(n)
              select n).ToArray();

This forces LINQ to execute the query right away, exactly once. Sure, there's a slight overhead of creating a new array, but it's a plain fixed-size array that both loops then read directly, so that mitigates the problem a bit. Now the debug window looks like this:

Operation #1
...
Operation #9
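
To check that the snapshot changes only the amount of work and not the answer, here is a rough comparison of the two variants. As before, the body of VeryLongOperation and the call counter are assumptions, and MaterializedDemo and Run are made-up names for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MaterializedDemo
{
    static int calls;   // hypothetical call counter, not part of the post

    // Assumed behaviour, as described in the post: accept numbers up to 3.
    static bool VeryLongOperation(int n)
    {
        calls++;
        return n <= 3;
    }

    // Builds the cartesian query on top of either the lazy filter or its
    // ToArray() snapshot, and reports the call count and the results.
    public static (int Calls, int[] Values) Run(bool materialize)
    {
        calls = 0;
        int[] arr = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

        var lazy = from n in arr
                   where VeryLongOperation(n)
                   select n;

        // ToArray() runs the filter once, right here; later loops read the array.
        IEnumerable<int> filter = materialize ? lazy.ToArray() : lazy;

        var cartesian = from n in filter
                        from m in filter
                        select 2 * n + 3 * m;

        int[] values = cartesian.ToArray();
        return (calls, values);
    }

    static void Main()
    {
        var deferred = Run(materialize: false);
        var snapshot = Run(materialize: true);

        Console.WriteLine(deferred.Calls);   // prints 36
        Console.WriteLine(snapshot.Calls);   // prints 9
        // Both variants produce the same values in the same order.
        Console.WriteLine(deferred.Values.SequenceEqual(snapshot.Values));   // prints True
    }
}
```

The saving grows with the number of items that pass the filter, since each of them would otherwise pay for a full pass over the source.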

This is a neat trick and is one of the first things that I look for when reviewing code with multiple LINQ statements.

4 Comments

  • Note that this is only true for LINQ statements that run in memory.
    If the query is being sent to a DB or another location for execution, it would probably be much better to let that location handle it.

  • Totally agree. This is always true only for LINQ to Objects, and every other flavor of LINQ should be examined on its own merits.

  • This is very true of LINQ in all incarnations (Object, XML, and SQL). Even LINQ to SQL table calls perform deferred execution.

    I think this is going to be one of the most difficult things for people new to LINQ to get their heads around. We are going to see a lot of problems in code before people start thinking about deferred execution.

  • Philip,

    It is true that deferred execution is the default in LINQ, but it's not a rule. Also, as Oren (Ayende) said above, a database may optimize multiple statements sent to it better than you can by sprinkling in .ToArray() calls.
