The Linq between C# and C++
C# is the hot new language, unless you’re Don Box, in which case it’s probably Ruby, but let’s pretend it’s C# for the purpose of this discussion.
:)
A future version of C# will allow you to write the following:
int[] numbers = { 10, 0, 9, 1, 8, 2, 7, 3, 6, 4, 5 };
var query = from n in numbers
            where n < 5
            orderby n
            select n;
foreach (var n in query)
{
    Console.WriteLine(n);
}
Running this code results in the following output:
0
1
2
3
4
If you dream in SQL then this code might make you drool. If you dream in C++ or C# then this might make you cringe. Whatever your persuasion, this code is intriguing and has the potential to dramatically simplify certain coding patterns, such as creating and consuming data in XML documents or relational databases.
Can this work? Is it type-safe? And where does it leave C++? Let’s take a look.
What on earth is ‘var’?
var is the C# rendition of the compromise certain strongly-typed languages are making to appease the onslaught of loosely-typed languages. In a future version of C# you will be able to declare a variable and leave the compiler to infer its type based on its initializer expression. Consider the following example:
Dictionary<int, string> dictionary = new Dictionary<int, string>();
Dictionary<int, string>.KeyCollection.Enumerator enumerator = dictionary.Keys.GetEnumerator();
There is a whole lot of type information that seems redundant to humans, yet compilers appear to require it. Well, no longer. Using the C# var keyword, the code can be simplified considerably (while the resulting IL remains the same):
var dictionary = new Dictionary<int, string>();
var enumerator = dictionary.Keys.GetEnumerator();
The ISO C++ committee has been moving in the same direction and approved a type deduction system for C++ that works in much the same way, with the obligatory syntactic sugar that we C++ developers love to hate. The new (old) auto keyword indicates that the type of the variable should be deduced from the initializer expression. Consider the following C++ equivalent to the previous C# example:
Dictionary<int, String^> dictionary;
Dictionary<int, String^>::KeyCollection::Enumerator enumerator = dictionary.Keys->GetEnumerator();
Using the proposed auto keyword, it can be simplified as follows:
Dictionary<int, String^> dictionary;
auto enumerator = dictionary.Keys->GetEnumerator();
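To see the same idea outside of .NET, here is a minimal sketch in standard C++, assuming a compiler that supports auto type deduction as it was standardized in C++11. It uses std::map in place of Dictionary, and the function names keys_explicit and keys_deduced are made up for illustration. Both functions do the same thing; only the spelling of the iterator type differs.

```cpp
#include <map>
#include <string>
#include <vector>

// Collects the keys of a map, spelling out the iterator type in full.
std::vector<int> keys_explicit(const std::map<int, std::string>& dictionary)
{
    std::vector<int> keys;
    for (std::map<int, std::string>::const_iterator it = dictionary.begin();
         it != dictionary.end(); ++it)
    {
        keys.push_back(it->first);
    }
    return keys;
}

// The same loop, letting the compiler deduce the iterator type
// from the initializer expression dictionary.begin().
std::vector<int> keys_deduced(const std::map<int, std::string>& dictionary)
{
    std::vector<int> keys;
    for (auto it = dictionary.begin(); it != dictionary.end(); ++it)
    {
        keys.push_back(it->first);
    }
    return keys;
}
```

As with var in C#, the deduced variable is still statically typed: it has exactly the type the explicit version names, so nothing changes at runtime.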
So what is LINQ?
LINQ stands for Language-Integrated Query. Which language? Well, any language that purports to target the future of the .NET Framework. Much of the attention around LINQ focuses on C#, being the poster child for the .NET Framework, but there is nothing stopping other languages from providing the language bindings necessary to integrate query facilities into the language.
To understand what LINQ really is in relation to C# we need to look under the covers. Here is the query declaration again:
var query = from n in numbers
            where n < 5
            orderby n
            select n;
We have already discussed what var is for, but for the sake of this discussion let’s keep things explicit:
IEnumerable<int> query = from n in numbers
                         where n < 5
                         orderby n
                         select n;
C# uses patterns, not unlike the way C++ templates work, to translate query expressions into method calls. Because of this, the query expression is suitably type-safe and is not simply an expression evaluated at runtime, as is the case with many loosely-typed languages. The query expression above can be rewritten using method calls, and this is essentially what the compiler does on your behalf. (The trailing select n returns each element unchanged, so it needs no method call of its own.)
IEnumerable<int> _subset = Sequence.Where<int>(numbers,
    n => n < 5);
IEnumerable<int> query = Sequence.OrderBy<int, int>(_subset,
    n => n);
This now looks a lot more like C#, but there is still the matter of the parameter expressions. These are known as C# lambda expressions, which provide a more concise syntax for writing anonymous methods. The code can in turn be rewritten using anonymous methods as follows:
IEnumerable<int> _subset = Sequence.Where<int>(numbers,
    delegate(int n) { return n < 5; });
IEnumerable<int> query = Sequence.OrderBy<int, int>(_subset,
    delegate(int n) { return n; });
Of course, anonymous methods are just shorthand for named methods:
var _subset = Sequence.Where<int>(numbers,
    ConstraintFunction);
var query = Sequence.OrderBy<int, int>(_subset,
    SelectFunction);
.
.
.
static bool ConstraintFunction(int n)
{
    return n < 5;
}
static int SelectFunction(int n)
{
    return n;
}
So, as you can see, query expressions are much like “for each” statements, where the compiler takes a simple expression and produces the more verbose imperative code on your behalf. Writing the query expression is simply more concise and to the point:
var query = from n in numbers
            where n < 5
            orderby n
            select n;
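The step from lambda expression to named method has a close analogue in standard C++. The following is a small sketch, assuming a C++11 compiler with lambda support. The names constraint_function, count_with_named_function and count_with_lambda are invented for this illustration, and std::count_if simply stands in for any algorithm that accepts a predicate.

```cpp
#include <algorithm>
#include <vector>

// Named predicate, the counterpart of ConstraintFunction in the C# example.
bool constraint_function(int n)
{
    return n < 5;
}

// Counts the matching elements by passing the named function.
int count_with_named_function(const std::vector<int>& numbers)
{
    return static_cast<int>(
        std::count_if(numbers.begin(), numbers.end(), constraint_function));
}

// Counts the matching elements by passing an equivalent lambda expression.
// The compiler turns the lambda into an unnamed function object for us.
int count_with_lambda(const std::vector<int>& numbers)
{
    return static_cast<int>(
        std::count_if(numbers.begin(), numbers.end(),
                      [](int n) { return n < 5; }));
}
```

Called with the numbers from the original example, both functions give the same answer, because the lambda is only a shorter way of writing the predicate.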
Where does this leave C++?
Let’s start with what you can do today. Today you can already use the System.Query assembly, on which LINQ is based, and write the equivalent code as follows:
IEnumerable<int>^ _subset = Sequence::Where<int>(safe_cast<IEnumerable<int>^>(numbers),
    gcnew Func<int, bool>(ConstraintFunction));
IEnumerable<int>^ query = Sequence::OrderBy<int, int>(_subset,
    gcnew Func<int, int>(SelectFunction));
.
.
.
bool ConstraintFunction(int n)
{
    return n < 5;
}
int SelectFunction(int n)
{
    return n;
}
But I know what you’re saying, that syntactic sugar is just sprinkled on way too thick, and I agree. In the future we will hopefully be able to use automatic type inference and lambda expressions (as a language feature, not a library feature) to simplify constructs such as these. I hope the Visual C++ team continues the efforts they started with the Visual C++ 2005 release and pioneers modern language features in the Visual C++ compiler.
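To give a feel for what type inference and lambdas buy you, here is a rough sketch in standard C++ (C++11 or later), not C++/CLI. The helpers where and order_by are hypothetical stand-ins that I wrote to mirror Sequence.Where and Sequence.OrderBy; they are not part of System.Query or any real library. They also run eagerly over std::vector, whereas LINQ queries are evaluated lazily, so treat this as an illustration of the syntax only.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical helper mirroring Sequence.Where:
// returns the elements for which the predicate holds.
template <typename T, typename Predicate>
std::vector<T> where(const std::vector<T>& source, Predicate predicate)
{
    std::vector<T> result;
    std::copy_if(source.begin(), source.end(),
                 std::back_inserter(result), predicate);
    return result;
}

// Hypothetical helper mirroring Sequence.OrderBy:
// returns a copy of the source sorted by the selected key.
template <typename T, typename KeySelector>
std::vector<T> order_by(std::vector<T> source, KeySelector key)
{
    std::stable_sort(source.begin(), source.end(),
                     [&key](const T& left, const T& right)
                     {
                         return key(left) < key(right);
                     });
    return source;
}

// The query from this article: numbers less than 5, in ascending order.
std::vector<int> small_numbers_in_order(const std::vector<int>& numbers)
{
    auto subset = where(numbers, [](int n) { return n < 5; });
    auto query = order_by(subset, [](int n) { return n; });
    return query;
}
```

Passing the original array of numbers to small_numbers_in_order yields 0, 1, 2, 3, 4, the same sequence the C# query prints. There are no delegates to construct and no named helper functions to declare, which is exactly the kind of simplification I am hoping for.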
© 2006 Kenny Kerr