Deferred execution pitfall(s) in LINQ

Say you have this query in LINQ to SQL:

// C#
int id = 10254;
var q = from o in nw.Orders
        where o.OrderID == id
        select o;

// some other code
id++;

foreach(var o in q)
{
    // process o.
}

Which order is fetched: 10254 or 10255? That's right, 10255! The 'id' used in the query is added as a member access node to the expression tree. As LINQ expression trees are converted to SQL when they're executed, the value of 'id' is evaluated at that time and used inside the query.
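
The same behavior can be reproduced without a database. The sketch below is only an approximation of the scenario above: an in-memory array exposed through AsQueryable() stands in for nw.Orders, and the names DeferredDemo and Run are made up for the example. The mechanism differs from LINQ to SQL in that the tree is compiled rather than translated to SQL, but in both cases 'id' is read when the query executes, not when it is defined.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Defines the query with id = 10254, increments id, then executes the query.
    public static List<int> Run()
    {
        // Assumption: an in-memory stand-in for nw.Orders, holding order ids only.
        IQueryable<int> orderIds = new[] { 10254, 10255, 10256 }.AsQueryable();

        int id = 10254;
        var q = from o in orderIds
                where o == id
                select o;

        // Changes the captured variable before the query has run.
        id++;

        // The query executes here and reads the current value of id.
        return q.ToList();
    }

    public static void Main()
    {
        foreach (int o in Run())
        {
            Console.WriteLine(o);   // prints 10255
        }
    }
}
```

If the value at definition time is what you want, copying 'id' into a second local that is never modified afterwards, and using that copy in the query, avoids the surprise.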

So I then wondered: what if 'id' is a local variable in a method which creates the query, while the query is executed outside that method? That would mean the query holds a reference to a member which, in theory, doesn't exist anymore, as 'id' is allocated in the stack frame of the method.

// C#
IQueryable q = Foo();
foreach(var o in q)
{
// process o
}

//...
private IQueryable Foo()
{
    NorthwindDataContext nw = new NorthwindDataContext();
    int id = 10254;
    var q = from o in nw.Orders
            where o.OrderID == id
            select o;

    return q;
}

As 'id' is inaccessible from outside the method Foo, I can't increase it. However, the query executes normally and fetches order 10254. To me, this is a little odd, as 'id' shouldn't be known at the time q is executed: Foo's stack frame has been cleaned up by then. We've seen in our previous test that it isn't the value of 'id' that is inserted into the query, but a reference to the variable. This is either a low-level CLR trick, or something fishy is going on with stack frames which are already cleaned up. Am I overlooking something here, or is this indeed strange? (The expression tree refers to a weird type, c__displayClass0.)
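
The odd type name is a hint: when a local is used inside a lambda or query expression, the C# compiler moves it into a field of a generated class, so it lives on the heap and not in Foo's stack frame. The sketch below writes that class out by hand to show the idea. It is an approximation, not the compiler's actual output: DisplayClass and ClosureSketch are made-up names, and an in-memory array via AsQueryable() again stands in for nw.Orders.

```csharp
using System;
using System.Linq;

public static class ClosureSketch
{
    // Hypothetical hand-written stand-in for the compiler-generated class.
    private sealed class DisplayClass
    {
        public int id;   // the "local" lives in this heap object
    }

    // Mirrors Foo(): builds the query and returns it without executing it.
    public static IQueryable<int> Foo()
    {
        // Assumption: an in-memory stand-in for nw.Orders, holding order ids only.
        IQueryable<int> orderIds = new[] { 10254, 10255 }.AsQueryable();

        var locals = new DisplayClass();
        locals.id = 10254;

        // The query reads locals.id, so it keeps 'locals' reachable
        // after Foo has returned.
        return orderIds.Where(o => o == locals.id);
    }

    public static void Main()
    {
        // Foo's stack frame is gone by the time the query executes here.
        foreach (int o in Foo())
        {
            Console.WriteLine(o);   // prints 10254
        }
    }
}
```

Because the query holds a reference to the heap object, the garbage collector keeps it alive for as long as the query itself is reachable, which is why nothing dangles.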

7 Comments

  • I expect that what happens is that, since something holds a reference to id, it isn't cleaned up. id is not on the stack in the call to this function; rather, it is on the heap (at least that is how I remember internal variables). More interesting, I would think, is what happens if the query uses a value-type parameter of the function. That parameter should be declared on the stack and would really go away when the function ends (the internal variable would go away when it loses scope and the garbage collector picks it up), but I might be wrong about the function parameter because I haven't read that part of the spec yet.

  • Miha: thanks for the link, I'll look into it ! :)

    Bill: the id is an int, so it should be declared on the stack, not on the heap. That's why I was a little surprised that this works, as it looks like a typical C/C++ dangling-pointer scenario: a char buffer allocated on the stack is returned to the caller.

  • I would assume it does something similar to what happens when you assign a local variable to a reference-type property on a non-local object. For example...

    public void mymethod()
    {
        int i = 5;
        globalObject.ReferenceProperty = i;
    }

    In this case the CLR boxes i and puts it on the heap so that "ReferenceProperty" has something to point to. ("ReferenceProperty" would be of a type such as "Object" here.)

  • Even more interesting, I would think, is how managed C++ would handle this (there I was pretty sure an int would be declared on the stack, but now I am not so sure). Is LINQ even available in VS2008 managed C++? I don't think I have read anything about it.

  • Luke: that explains it indeed. (btw, I always have to look up what 'closures' means in the context of the material, it is IMHO a term which has too many meanings). Makes sense that it does it this way, it would otherwise be a big problem :)

  • IMHO MS should kill LINQ before it kills us all. This stuff is retarded.

    from o in nw.Orders? WTF does that mean anyway?

    I'm sure I can expect to fix all the BS this is going to cause in UI classes.

  • Chris: What is the problem? The syntax makes more sense than the SQL dialect: you start with a from and end with a select.
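
Bill's question about a value-type parameter can be checked the same way. A parameter used inside a lambda is captured just like a local: the compiler copies it into the generated closure object, so it also outlives the call. A minimal sketch, with the same assumptions as before (made-up names ParameterCapture and OrdersWithId, and an in-memory array via AsQueryable() in place of nw.Orders):

```csharp
using System;
using System.Linq;

public static class ParameterCapture
{
    // 'id' is a value-type parameter used inside the query's lambda.
    public static IQueryable<int> OrdersWithId(int id)
    {
        // Assumption: an in-memory stand-in for nw.Orders, holding order ids only.
        IQueryable<int> orderIds = new[] { 10254, 10255 }.AsQueryable();

        // Nothing executes here; the query only records a reference to 'id'.
        return orderIds.Where(o => o == id);
    }

    public static void Main()
    {
        // The method has already returned when the query is enumerated.
        foreach (int o in OrdersWithId(10255))
        {
            Console.WriteLine(o);   // prints 10255
        }
    }
}
```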
