hits counter

LINQ To SQL Tips & Tricks: String Operations

LINQ With C# (Portuguese)

LINQ brought developers a very user friendly and domain independent style of writing queries.

The fact that the way queries are written is domain independent doesn’t mean that any query will compile the same way or even run the same way. You’ll always need to know how the provider will behave.

LINQ To Objects, for example, will compile queries as a Func<> delegate and the query methods will return IEnumerable(T) implementations.

On the other hand, LINQ To SQL will compile queries as an Expression<Func<>> (which is, in fact, an expression tree) instance and the query methods will return IQueryable(T) implementations.

Because LINQ To SQL queries are compiled to an expression tree, that allows the provider to treat the query elements as it sees fit.

In this case, this means that all operations that can be done on the database will be done on the database and the developer must be aware of this when she/he is writing the queries.

Lets take an example using the AdventureWorks database (if you don’t have it, you can download it from here).

I want to build a list of salutation for every employee that has the SalariedFlag set, in the form of:

[Mr.|Mrs.|Miss] <first name> <middle name> <last name>

But there’s also one detail about the data in the database: FirstName, MiddleName and LastName may have trailing spaces and I don’t want them.

This is a simple query like this:

var q1 = from e in context.Employees
where e.SalariedFlag
select
((e.Gender == 'F') ? ((e.MaritalStatus == 'S') ? "Miss" : "Mrs.") : "Mr.") + " " +
e.Person.FirstName.Trim() +
(e.Person.MiddleName == null || e.Person.MiddleName.Trim().Length == 0 ? " " : " " + e.Person.MiddleName.Trim() + " ") +
e.Person.LastName.Trim();

and it will be executed as:

SELECT ((((
    (CASE
        WHEN UNICODE([t0].[Gender]) = @p0 THEN
            (CASE
                WHEN UNICODE([t0].[MaritalStatus]) = @p1 THEN @p2
                ELSE @p3
             END)
        ELSE CONVERT(NVarChar(4),@p4)
     END)) + @p5) + LTRIM(RTRIM([t1].[FirstName]))) + (
    (CASE
        WHEN ([t1].[MiddleName] IS NULL) OR (LEN(LTRIM(RTRIM([t1].[MiddleName]))) = @p6) THEN CONVERT(NVarChar(MAX),@p7)
        ELSE (@p8 + LTRIM(RTRIM([t1].[MiddleName]))) + @p9
     END))) + LTRIM(RTRIM([t1].[LastName])) AS [value]
FROM [HumanResources].[Employee] AS [t0]
INNER JOIN [Person].[Person] AS [t1] ON [t1].[BusinessEntityID] = [t0].[BusinessEntityID]
WHERE [t0].[SalariedFlag] = 1
-- @p0: Input Int (Size = 0; Prec = 0; Scale = 0) [70]
-- @p1: Input Int (Size = 0; Prec = 0; Scale = 0) [83]
-- @p2: Input NVarChar (Size = 4; Prec = 0; Scale = 0) [Miss]
-- @p3: Input NVarChar (Size = 4; Prec = 0; Scale = 0) [Mrs.]
-- @p4: Input NVarChar (Size = 3; Prec = 0; Scale = 0) [Mr.]
-- @p5: Input NVarChar (Size = 1; Prec = 0; Scale = 0) [ ]
-- @p6: Input Int (Size = 0; Prec = 0; Scale = 0) [0]
-- @p7: Input NVarChar (Size = 1; Prec = 0; Scale = 0) [ ]
-- @p8: Input NVarChar (Size = 1; Prec = 0; Scale = 0) [ ]
-- @p9: Input NVarChar (Size = 1; Prec = 0; Scale = 0) [ ]
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 3.5.30729.4926

If you notice the query, there are a lot of text operations going on for each row.

Depending on the number of rows or database load this can prove to be very bad. The result might even be just a timeout.

So, how do we force the string operations to occur on the client instead of the database?

Only IQueryable<T> will be translated to T-SQL. So, all we need to do is change the type of the enumerator being iterated.

One way to do this is using the the AsEnumerable method of the Enumerable class.

The query would now be written as:

var q2 = from e in context.Employees.Where(e => e.SalariedFlag).AsEnumerable()
         select
            ((e.Gender == 'F') ? ((e.MaritalStatus == 'S') ? "Miss" : "Mrs.") : "Mr.") + " " + e.Person.FirstName.Trim() +
            (e.Person.MiddleName == null || e.Person.MiddleName.Trim().Length == 0 ? " " : " " + e.Person.MiddleName.Trim() + " ") +
            e.Person.LastName.Trim();

and it will be executed as:

SELECT
    [t0].[BusinessEntityID],
    [t0].[LoginID],
    [t0].[NationalIDNumber],
    [t0].[JobTitle],
    [t0].[MaritalStatus],
    [t0].[BirthDate],
    [t0].[Gender],
    [t0].[HireDate],
    [t0].[SalariedFlag],
    [t0].[VacationHours],
    [t0].[SickLeaveHours],
    [t0].[CurrentFlag],
    [t0].[rowguid],
    [t0].[ModifiedDate],
    [t1].[BusinessEntityID] AS [BusinessEntityID2],
    [t1].[PersonType],
    [t1].[NameStyle],
    [t1].[Title],
    [t1].[FirstName],
    [t1].[MiddleName],
    [t1].[LastName],
    [t1].[Suffix],
    [t1].[EmailPromotion],
    [t1].[AdditionalContactInfo],
    [t1].[Demographics],
    [t1].[rowguid] AS [rowguid2],
    [t1].[ModifiedDate] AS [ModifiedDate2]
FROM [HumanResources].[Employee] AS [t0]
INNER JOIN [Person].[Person] AS [t1] ON [t1].[BusinessEntityID] = [t0].[BusinessEntityID]
WHERE [t0].[SalariedFlag] = 1
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 3.5.30729.4926

As you can notice, text operations are no longer done on the database, but all the columns of both tables are being returned. And this is still a bad thing because we are using network bandwidth with data that won’t be used.

The way to choose the columns that will be retrieved from the database is by selecting only the ones wanted in the select statement. But because we still want string operations the be done on the client, we’ll need to project the desired columns into an intermediary object. Since we won’t need this object outside the query, we’ll use an anonymous type.

The query would now be written as:

var q3 = from n in
             (
                 from e in context.Employees
                 where e.SalariedFlag
                 select new
                 {
                     Gender = e.Gender,
                     MaritalStatus = e.MaritalStatus,
                     FirstName = e.Person.FirstName,
                     MiddleName = e.Person.MiddleName,
                     LastName = e.Person.LastName
                 }
             ).AsEnumerable()
         select ((n.Gender == 'F') ? ((n.MaritalStatus == 'S') ? "Miss" : "Mrs.") : "Mr.") + " " + n.FirstName.Trim()
         + (n.MiddleName == null || n.MiddleName.Trim().Length == 0 ? " " : " " + n.MiddleName.Trim() + " ")
         + n.LastName.Trim();

and it will be executed as:

SELECT
    [t0].[Gender],
    [t0].[MaritalStatus],
    [t1].[FirstName],
    [t1].[MiddleName],
    [t1].[LastName]
FROM [HumanResources].[Employee] AS [t0]
INNER JOIN [Person].[Person] AS [t1] ON [t1].[BusinessEntityID] = [t0].[BusinessEntityID]
WHERE [t0].[SalariedFlag] = 1
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 3.5.30729.4926

Notice the call to Enumerable.As Enumerable to translate the LINQ To SQL query into a LINQ To Objects query.

And, to end this long blog post, if you don’t use any string operations on the query, they, obviously, won’t be translated to T-SQL:

var q4 = from e in context.Employees
         where e.SalariedFlag
         select BuildSalutation(e.Gender, e.MaritalStatus, e.Person.FirstName, e.Person.MiddleName, e.Person.LastName);

where BuildSalutation is implemented as:

private static object BuildSalutation(char gender, char maritalStatus, string firstName, string middleName, string lastName)
{
    return ((gender == 'F') ? ((maritalStatus == 'S') ? "Miss" : "Mrs.") : "Mr.") + " "
        + firstName.Trim()
        + (middleName == null || middleName.Trim().Length == 0 ? " " : " " + middleName.Trim() + " ")
        + lastName.Trim();
}

and it will be executed as:

SELECT
    [t0].[Gender] AS [gender],
    [t0].[MaritalStatus] AS [maritalStatus],
    [t1].[FirstName] AS [firstName],
    [t1].[MiddleName] AS [middleName],
    [t1].[LastName] AS [lastName]
FROM [HumanResources].[Employee] AS [t0]
INNER JOIN [Person].[Person] AS [t1] ON [t1].[BusinessEntityID] = [t0].[BusinessEntityID]
WHERE [t0].[SalariedFlag] = 1
-- Context: SqlProvider(Sql2008) Model: AttributedMetaModel Build: 3.5.30729.4926

Have you noticed that this T-SQL query is pretty much the same in the previous example?

If you are still reading this, I hope you now aware of how you write your LINQ To SQL queries affect the generated T-SQL.

No Comments