Linq: Beware of the 'Access to modified closure' demon

If you're using Linq and Resharper, you've probably seen the warning Resharper shows when you use a foreach loop in which you use the loop variable in a Linq extension method (be it on IQueryable<T> or IEnumerable<T>). In case you don't know what it is or what damage it can do if you ignore the issue, I'll give you a database oriented query (so on IQueryable<T>, using LLBLGen Pro's Linq provider) which creates a dynamic Where clause based on input, the typical scenario you should be careful with when it comes to this particular problem.

var customers = from o in metaData.Order
		join c in metaData.Customer on o.CustomerId equals c.CustomerId into oc
		from x in oc.DefaultIfEmpty()
		select new { CustomerId = x.CustomerId, CompanyName = x.CompanyName, Country = x.Country };

string searchTerms = "U A";
var searchCriteria = searchTerms.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
foreach(var search in searchCriteria)
{
    customers = customers.Where(p => p.Country.Contains(search));
}
var ids = (from c in customers select c.CustomerId).ToArray();

The above code snippet has the demon embedded into itself, likely without you noticing it. Can you spot it? (Ok, I already gave it away a bit with the foreach loop hint).

The problem with the above query is that it will produce a WHERE clause in the SQL query with two LIKE statements which both filter on %A%. How's that possible? The cause is in the 'Access to modified closure' problem: search is a local variable. The first time the foreach is ran, search will have the value "U". The .Where() extension method will add a MethodCall expression with a call to Where(lambda) with inside the lambda among other things a ConstantExpression referring to the local variable search for the value. And that's precisely the problem: when the foreach loop is looping again, search will get another value: namely "A". As there are no more values, the loop ends and the query is executed.

Well, executed is more complex than it sounds: first, the expression tree has to be converted into SQL. When the linq provider runs into the two .Where() extension method calls, it evaluates the argument, which is a LambdaExpression which contains a ConstantExpression which refers to... a local variable called search. It can't do anything else but reading that variable, which has the value ... "A" for both, as it reads the same variable. So it's not storing the constant value search has when the call to Where and Contains is made, it's storing a reference to the local variable.

How to fix this? It's pretty straight forward: create a new local variable:

foreach(var search in searchCriteria)
{
	var searchTerm = search;
	customers = customers.Where(p => p.Country.Contains(searchTerm));
}

With each iteration, it creates a new local variable, and thus each Contains call will refer to a different variable and thus the SQL query will contain the two LIKE predicates the way it should, one with %U% and one with %A%.

This subtle issue pops up with Linq to Objects as well, so beware when you pass the foreach loop variable to a Linq extension method: if the query doesn't run at that same spot, you likely will run into this problem and will have an obscure bug to track down.

Happy hunting

13 Comments

  • I've been bitten by this many, many times :)

  • Thank god we have Resharper :-)

  • @patrick: good question. In the initial query, the lambda in the Contains call refers to:
    LinqTester.Program+c__DisplayClass2.search

    c__DisplayClass2 is a class created by the compiler to add local variables to.

    When I add the local variable to the loop,
    It refers to LinqTester.Program+c__DisplayClass2.searchTerm, so that doesn't really make a difference at that level. I think at runtime, the local variable searchTerm is re-created every time the loop loops because it's inside a scope that's re-created. Not something you can optimize away, as it has to be a new scope every time. The loop variable of the foreach loop is the same for every iteration, as it's part of the scope the foreach statement / query is part of.

    Though I agree, if you think about it: the memory location of searchTerm can be re-used for every iteration, which could cause a problem as well, unless it's solved in the CLR. I'm not that into IL nor do I know the spec from front to back, so I can't give you a solid answer on that one.

  • Thanks for the info.

  • The funny thing is that if you check the C# 1.2 spec, the variable was scoped **inside** the loop, so it would have worked. It changed going to C# 2.0.

    I keep asking Mads etc if they'd consider changing the spec back to have it outside the loop, but no joy so far. I was hopeful they might fix it for 4.0, but I guess not.

    @Patrick - this can't and won't be optimized away; it changes the entire meaning.

  • Thanks Marc for the additional background info about the local var.

  • Resharper detects this and emits a warning when it detects a mutable variable in a closure.

  • @Think: that's what I mentioned in the first line of the post.

  • I don't understand this statement:
    "As there are no more values, the loop ends and the query is executed."

    This must be something I don't understand about Linq...why isn't the query executed during every loop interation? Why is it that the loop ends and then the query is executed? If after the Linq statement, you did:
    customers.Count, wouldn't that only work if the query was executed inside the loop?

  • @Mike: that's because of the 'deferred execution' of a linq query: it's executed when it's enumerated. So when the query returns an enumerable, it's executed when it's enumerated, otherwise it's executed directly. Yes, this sucks, but that's how MS designed it, unfortunately.

  • @Frans

    Ok, maybe I'm missing something else about the code then. I know that the query execution is deferred until the 'customers' is enumerated, but doesn't:
    foreach(var search in searchCriteria)
    {
    customers = customers.Where(p => p.Country.Contains(search));
    }

    just keep resetting the customers reference? It doesn't look like it is appending to a list of customers there.

  • @Mike: It doesn't reset the reference, as Where returns a new IQueryable :) so after an iteration of that loop, 'customers' _was_ an IQueryable and still _is_ an IQueryable however, the contents of that object has changed: it now contains a call to Where() with as input the filterlambda and the original customers IQueryable. It's a fluent interface :)

  • It really sucks to be using vb.net. I hate it when the compiler warns me about this exact gotcha, so I never make it :) Then all I get is understanding from the post, no pain relief!

Comments have been disabled for this content.