Archives

Archives / 2014 / February
  • LINQ query optimization by query rewriting using a custom IQueryable LINQ provider

    In my SkyLinq open source project so far, I tried to make LINQ better and created several extension methods (for example, see GroupBy and TopK) that are more memory efficient than the standard methods in some scenarios. However, I do not necessarily want people to bind into these extensions methods directly for the following reasons:

    1. Good ideas evolve. New methods may come and existing methods may change.
    2. There is a parallel set of methods in IEnumerable<T>, IQueryable<T> and to some degree, in IObservable<T>. Using custom LINQ extension breaks the parallelism.

    A better approach would be to write the LINQ queries normally using the standard library methods and then optimize the queries using a custom query provider. Query rewriting is not a new idea. SQL server has been doing this for years. I would like to bring this idea into LINQ.

    If we examine the IQueryable<T> interface, we will find there is a parallel set of LINQ methods that accepts Expression<Func> instead of <Func>.

    image

    There is a little C# compiler trick here. If C# see a method that accepts Expression<Func>, it would create Expressions of Lamda instead of compiled Lambda expression. At run time, these expressions are passed to an IQueryable implementation and then passed to the underlying IQueryProvider implementation. The query provider is responsible for executing the expression tree and return the results. This is how the magic of LINQ to SQL and Entity Frameworks works.

    The .net framework already has a class called EnumerableQuery<T>. It is a query provider than turns IQueryable class into LINQ to objects queries. In this work, I am going one step further by creating an optimizing query provider.

    A common perception of writing a custom LINQ provider is that it requires a lot of code. The reason is that even for the most trivial provider we need to visit the every possible expressions in the System.Linq.Expressions namespace. (Note that we only need to handle expressions as of .net framework 3.5 but do not need to handle the new expressions added as part of dynamic language runtime in .net 4.x). There are reusable framework such as IQToolkit project that makes it easier to create a custom LINQ provider.

    In contrast, creating an optimizing query provider is fairly easy. The ExpressionVistor class already has the framework that I need. I only needed to create a subclass of ExpressionVisitor called SkyLinqRewriter to override a few methods. The query rewriter basically replaces all call to the Queryable class with equivalent calls to the Enumerable class and rewrite queries for optimization when an opportunity presents.

    It is fairly easy to consume the optimizing query provider. All we need is to call AsSkyLinqQueryable() to convert IEnumerable<T> to IQueryable<T> and remaining code can stay intact:

    image

    An end-to-end example can be found at https://skylinq.codeplex.com/SourceControl/latest#SkyLinq.Example/LinqToW3SVCLogExample.cs.

    To conclude this post, I would recommend that we always code to IQueryable<T> instead of IEnumerable<T>. This way, the code has the flexibility to be optimized by an optimizing query provider, or be converted from pull based query to push based query using Reactive Extension without rewriting any code.

  • Expression tree visualizer for Visual Studio 2013

    Expression tree visualizer, as the name indicates, is a Visual Studio visualizer for visualizing expression trees. It is a must if you work with expressions frequently. Expression Tree Visualizer is a Visual Studio 2008 sample. There is a Visual Studio 2010 port available on codeplex. If you want to use it with a later version of Visual Studio, there is not one available. Fortunately, porting it to another version of Visual Studio is fairly simple:

    1. Download the original source code from http://exprtreevisualizer.codeplex.com/.
    2. Replace the existing reference to Microsoft.VisualStudio.DebuggerVisualizers assembly to the version in the Visual Studio you want to work with. For Visual Studio 2013, I found it in C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\ReferenceAssemblies\v2.0\Microsoft.VisualStudio.DebuggerVisualizers.dll on my computer.
    3. Compile the ExpressionTreeVisualizer project and copy ExpressionTreeVisualizer.dll to the visualizer directory. To make it usable by one user, just copy it to My Documents\VisualStudioVersion\Visualizers. To make it usable by all users of a machine, copy it to VisualStudioInstallPath\Common7\Packages\Debugger\Visualizers.

    If you do not want to walk through the process, I have one readily available at http://weblogs.asp.net/blogs/lichen/ExpressionTreeVisualizer.Vs2013.zip. Just do not sue me if it does not work as expected.