ASPInsiders Summit 2006 - C# 3.0 and LINQ

C# 3.0 (not to be confused with the confusingly-named .NET Framework 3.0, which includes C# 2.0, not C# 3.0) was the most exciting thing discussed at the ASPInsiders Summit. When I first learned a bit about LINQ at the 2005 summit, I didn't really get what was so great about taking some mangled SQL syntax and duct-taping it onto the language. Given a few hours for Anders Hejlsberg, the lead architect of C#, to explain LINQ, how it came to be, how it works behind the scenes, and why it's a Good Thing, I changed my mind. He and Scott Guthrie sold me on LINQ (at least, as much as I could be until it's released and I can play with it first-hand).

Most of the big changes in C# 3.0 were driven by LINQ, which is why I'm talking about them in conjunction. But this doesn't mean they're LINQ-specific; I can think of a ton of cases independent of LINQ where this stuff will be useful. It reminds me of when "XML web services" was repeated in every lecture, article, blog post, knowledge base article, and whitepaper from Microsoft that talked about the .NET Framework when it first came out, as if that was the main thing anyone would use the framework for. Though I've done a ton of .NET programming, I think maybe 1% of it has had anything to do with web services. Thankfully, only a few seem to have been fooled by that and adoption took off anyway.

But back to C# 3.0's new features. These are explained in more depth and with more examples on the LINQ Project site, so check out the overview there for more information--no need for me to duplicate all that.

Local variable type inference (or, "No, it's OK, the var keyword isn't evil anymore"). Former ASP programmers like me instinctively shudder when we see the keyword "var", and recoil a bit when we hear it's being introduced to C#. But this has nothing to do with late binding or loose typing--it's just something to allow programmers to be a little lazier. The basic idea is that if you have a long type like Dictionary<int, MyCoolButLongType> you're newing up on the right, you don't have to type all that on the left, since the compiler already knows what type it is.

int i = 5;
Dictionary<int, Order> orders = new Dictionary<int, Order>();

is 100% equivalent to:

var i = 5;
var orders = new Dictionary<int, Order>();

The variable orders still has the same type of Dictionary<int, Order> it does in the first example. It'll still behave the same as it did, you'll still get the same Visual Studio IntelliSense you did, and the IL will be the same. Var is not variant here--it's "I'm lazy so let the compiler figure it out."
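A quick way to convince yourself of this is the small sketch below. It assumes a string value type in place of Order so that it compiles on its own; the VarDemo class and its helper methods are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class VarDemo
{
    // Hypothetical helper: returns the runtime type of a var-declared int.
    public static Type TypeOfInferredInt()
    {
        var i = 5; // inferred as int at compile time
        return i.GetType();
    }

    // Hypothetical helper: returns the runtime type of a var-declared dictionary.
    public static Type TypeOfInferredDictionary()
    {
        var orders = new Dictionary<int, string>(); // inferred as Dictionary<int, string>
        return orders.GetType();
    }

    public static void Main()
    {
        Console.WriteLine(TypeOfInferredInt() == typeof(int)); // prints "True"
        Console.WriteLine(TypeOfInferredDictionary() == typeof(Dictionary<int, string>)); // prints "True"

        // var i = 5; i = "five";  // would not compile: i is an int, not a variant
    }
}
```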

Extension methods. This allows you to "add" your own methods to types you can't otherwise change yourself, e.g. in the .NET Framework or third-party libraries. So if you've always wanted a string.FooBar() method, just write it as an extension method, include its namespace, and you've got your string.FooBar() method. Everyone's got a pet peeve this can address--some method they wish the framework provided, a string or numeric operation. Now it's yours, whatever you want, however you want it to work.
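Here's a minimal sketch of what that string.FooBar() method might look like. The MyExtensions namespace, the StringExtensions class, and the behavior (reversing the string) are all made up for illustration; the only part the language dictates is the static class and the "this" modifier on the first parameter.

```csharp
using System;

namespace MyExtensions // hypothetical namespace; callers need a using directive for it
{
    public static class StringExtensions
    {
        // "this string value" is what makes FooBar callable as value.FooBar().
        public static string FooBar(this string value)
        {
            // Illustrative behavior only: return the characters in reverse order.
            char[] chars = value.ToCharArray();
            Array.Reverse(chars);
            return new string(chars);
        }
    }

    public static class Program
    {
        public static void Main()
        {
            // Looks like an instance method, but compiles to StringExtensions.FooBar("hello").
            Console.WriteLine("hello".FooBar()); // prints "olleh"
        }
    }
}
```

Since the call is just a static method call in disguise, the extension method only sees the public members of string; it can't reach into private state.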

Lambda expressions. This is another feature for laziness/concision, making it easier to write a function inline than you could with C# 2.0's anonymous methods.

List<Customer> customers = GetCustomerList();
List<Customer> locals = customers.FindAll(c => c.State == "KY");

reads as "Find all c such that c.State equals Kentucky". The compiler infers that the return type of this lambda expression is boolean, due to the comparison operator ==.
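For comparison, here is a sketch of the same filter written both ways: once with a C# 2.0 anonymous method and once with the lambda. The Customer class and the sample data are assumptions made up so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Customer type with just the fields this example needs.
public class Customer
{
    public string Name;
    public string State;

    public Customer(string name, string state)
    {
        Name = name;
        State = state;
    }
}

public static class LambdaDemo
{
    // C# 2.0 style: an anonymous method with an explicit parameter type and return.
    public static List<Customer> FindLocalsOldStyle(List<Customer> customers)
    {
        return customers.FindAll(delegate(Customer c) { return c.State == "KY"; });
    }

    // C# 3.0 style: the parameter type and the bool return type are inferred.
    public static List<Customer> FindLocals(List<Customer> customers)
    {
        return customers.FindAll(c => c.State == "KY");
    }

    public static List<Customer> SampleCustomers()
    {
        List<Customer> customers = new List<Customer>();
        customers.Add(new Customer("Ann", "KY"));
        customers.Add(new Customer("Bob", "WA"));
        customers.Add(new Customer("Cy", "KY"));
        return customers;
    }

    public static void Main()
    {
        List<Customer> customers = SampleCustomers();
        Console.WriteLine(FindLocalsOldStyle(customers).Count); // prints "2"
        Console.WriteLine(FindLocals(customers).Count);         // prints "2"
    }
}
```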

Object initializers. This is one of the (few) good things about VBA (Visual Basic for Applications), the version of VB included with the Microsoft Office products and a platform on which I unfortunately found myself doing a lot of programming several years ago. Using the new keyword to create an instance of an object, you can now specify initial values for public properties and fields.

Person value = new Person { Name = "Chris Smith", Age = 31 };

creates a new Person instance, setting its Person.Name and Person.Age properties.
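As a rough sketch, the initializer is shorthand for creating the object and then assigning each member. The Person class below is an assumption (the post only names its Name and Age properties), and the two helper methods are made up to show the long and short forms side by side.

```csharp
using System;

// Hypothetical Person type with the two properties used in the post.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class InitializerDemo
{
    // C# 3.0 object initializer.
    public static Person WithInitializer()
    {
        return new Person { Name = "Chris Smith", Age = 31 };
    }

    // Roughly what the initializer stands for: construct, then assign each member.
    public static Person WithoutInitializer()
    {
        Person temp = new Person();
        temp.Name = "Chris Smith";
        temp.Age = 31;
        return temp;
    }

    public static void Main()
    {
        Person a = WithInitializer();
        Person b = WithoutInitializer();
        Console.WriteLine(a.Name == b.Name && a.Age == b.Age); // prints "True"
    }
}
```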

Query expressions. This is what LINQ looks like on the surface. An example statement using a query expression on the right side:

var customerQuery = from c in customers
                    where c.State == "WA"
                    select new {c.Name, c.Phone};

where customers is a custom collection, like List<Customer>. One of the biggest questions I had is why in the world they took the SQL SELECT..FROM..WHERE syntax we all know and "love" and jumbled it up. Fortunately, Anders addressed this, saying Microsoft considered implementing LINQ syntax in the same order as SQL, but really, SQL's order doesn't make sense. When you execute a SQL query, it starts with FROM and WHERE (table/index scans), then goes to SELECT, just like these query expressions in C#. C# doing it this way both helps get IntelliSense (the editor knows what c is before you start typing the where and select clauses) and is more forward-looking (what makes sense if we were designing this from scratch, and giving ourselves the option to expand it later) than backward-looking (how's it been done before; what are people used to). It looks weird the first time to those of us used to seeing SQL, but it makes sense after you get past that hump.

Expression trees. The .NET Framework (and C#) will have expression trees built in. So for example, you can assign a lambda expression to an Expression<...> variable, and the compiler will build a tree of Expression objects behind the scenes to represent it, which code can then inspect or translate at runtime. Anyone who's written their own search or query syntax can imagine how powerful this could be.
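One way to poke at an expression tree is the sketch below. It assumes the System.Linq.Expressions namespace (the name may differ in preview builds) and builds the tree by assigning a lambda to an Expression<Func<int, bool>> variable; the ExpressionDemo class and its helpers are made up for illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionDemo
{
    // The same lambda text, captured as data (a tree) instead of as compiled code.
    public static Expression<Func<int, bool>> BuildTree()
    {
        return n => n > 5;
    }

    // Hypothetical helper: describes the root of the lambda body, e.g. "n GreaterThan 5".
    public static string Describe(Expression<Func<int, bool>> tree)
    {
        BinaryExpression body = (BinaryExpression)tree.Body;
        return body.Left + " " + body.NodeType + " " + body.Right;
    }

    public static void Main()
    {
        Expression<Func<int, bool>> tree = BuildTree();
        Console.WriteLine(Describe(tree)); // prints "n GreaterThan 5"

        // A tree can still be turned into runnable code when needed.
        Func<int, bool> compiled = tree.Compile();
        Console.WriteLine(compiled(7)); // prints "True"
    }
}
```

A provider that talks to a database would walk a tree like this and translate it to SQL rather than compiling it.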

So look again at the above query expression. The C# compiler rewrites it as:

var customerQuery = customers.Where(c => c.State == "WA").Select(c => new {c.Name, c.Phone});

Now you can see how all this begins to tie together: A query expression gets expanded into a more verbose and less readable statement using extension methods (List<Customer> doesn't have a native Where method) and lambda expressions. The lambda expressions can be captured as expression trees when a provider needs to inspect them (for an in-memory List<Customer> they simply compile to delegates), and the Select lambda returns an anonymous type. We reference the results of the query expression with a local variable using type inference (we can't write out the result type ourselves because it involves an anonymous type, and with type inference it doesn't matter--nevertheless it's still strongly typed). It's the circle of life!
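To see the pieces working together, here is a small sketch that runs the query both ways against an in-memory list. It assumes the System.Linq namespace for the Where and Select extension methods (the name may differ in preview builds); the Customer class, the sample data, and the helper methods are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer type with the members the query touches.
public class Customer
{
    public string Name;
    public string Phone;
    public string State;
}

public static class QueryDemo
{
    public static List<Customer> SampleCustomers()
    {
        return new List<Customer>
        {
            new Customer { Name = "Ann", Phone = "555-0100", State = "WA" },
            new Customer { Name = "Bob", Phone = "555-0101", State = "KY" },
            new Customer { Name = "Cy",  Phone = "555-0102", State = "WA" }
        };
    }

    // Query expression form.
    public static List<string> NamesViaQuery(List<Customer> customers)
    {
        var customerQuery = from c in customers
                            where c.State == "WA"
                            select new { c.Name, c.Phone };

        List<string> names = new List<string>();
        foreach (var item in customerQuery) // item is an anonymous type, still strongly typed
        {
            names.Add(item.Name);
        }
        return names;
    }

    // The extension-method form the query expression is rewritten into.
    public static List<string> NamesViaMethods(List<Customer> customers)
    {
        var customerQuery = customers.Where(c => c.State == "WA")
                                     .Select(c => new { c.Name, c.Phone });

        List<string> names = new List<string>();
        foreach (var item in customerQuery)
        {
            names.Add(item.Name);
        }
        return names;
    }

    public static void Main()
    {
        List<Customer> customers = SampleCustomers();
        Console.WriteLine(string.Join(", ", NamesViaQuery(customers).ToArray()));   // prints "Ann, Cy"
        Console.WriteLine(string.Join(", ", NamesViaMethods(customers).ToArray())); // prints "Ann, Cy"
    }
}
```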

Hopefully I'm explaining this clearly enough that some of you can understand why at this point I was getting very impressed. Keep in mind all this compiler magic doesn't come with any efficiency/performance hit at runtime.

In case this syntax isn't enough to sell you on LINQ, here are a few more objective reasons. With just ADO.NET today, you have queries as quoted strings, parameters loosely bound, and results loosely typed, and the compiler can't help check code; LINQ helps in that classes describe data, the query is a natural part of the language instead of a string, and the compiler provides IntelliSense and compile-time checking--it's entirely strongly typed in C# and VB. Another reason is that LINQ can be used to query against objects, DataSets, SQL data sources, and XML out of the box, using an extensible provider-driven model like those you find in .NET 2.0 today. And it has paging built in, not specific to SQL Server.
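As a rough sketch of the paging point, assuming it refers to the Skip and Take standard query operators (that is my reading, not something spelled out in the session), a page of results from any sequence could be pulled like this. The PagingDemo class and GetPage helper are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingDemo
{
    // Hypothetical helper: returns one zero-based page from any sequence.
    public static List<T> GetPage<T>(IEnumerable<T> source, int pageIndex, int pageSize)
    {
        return source.Skip(pageIndex * pageSize) // pass over the earlier pages
                     .Take(pageSize)             // keep at most one page of items
                     .ToList();
    }

    public static void Main()
    {
        IEnumerable<int> numbers = Enumerable.Range(1, 10); // 1 through 10
        List<int> secondPage = GetPage(numbers, 1, 3);
        Console.WriteLine(string.Join(", ", secondPage)); // prints "4, 5, 6"
    }
}
```

The same two calls work against an in-memory list here; a database-backed provider would be expected to translate them into whatever paging its store supports.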

Again, check out the LINQ Project Overview if you want more information, including a more detailed look at the features I mentioned and a few I skipped over.
