Archives / 2013
  • Review the WPF ICommand interface

    I have not worked with WPF for a while. Recently, I needed to make enhancements to a WPF project that I did 20 months ago. It is time to refresh my memory on several things unique to WPF. The first thing that came to my mind is WPF Commanding.

    In the old days, when we created a VB 1-6 or Windows Forms application, we just double-clicked a button in the designer to create an event handler. With WPF, we can still do it the old way, but there is a new way: we bind the Command property of the control to a property of the view model that implements ICommand. So what is ICommand?

    The ICommand interface has 3 members: Execute, CanExecute and CanExecuteChanged.


    Why the complications? That is because WPF Commanding allows us to enable or disable command sources, such as buttons and menu items, without resorting to spaghetti code. The basic workflow is simple:

    1. Whenever we change a state in the view model that might affect the CanExecute state of the command, we raise the command's CanExecuteChanged event. CanExecuteChanged is an event rather than a method; with Prism's DelegateCommand, we raise it by calling this.MyCommand.RaiseCanExecuteChanged().
    2. This will trigger the view to call the CanExecute method of the command object to determine whether the command source should be enabled.
    3. When the command source is enabled and clicked, the Execute method of the command object is called; this is where the event handler would be implemented.
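
    The workflow above can be sketched without any UI. This is a minimal sketch, not Prism's code: SimpleCommand and OrderViewModel are hypothetical names invented for illustration, and only the ICommand interface comes from the framework. The test harness plays the role of the view by subscribing to CanExecuteChanged and calling CanExecute.

```csharp
using System;
using System.Windows.Input;

// Hypothetical, stripped-down command class (an assumption, not Prism's DelegateCommand).
public sealed class SimpleCommand : ICommand
{
    private readonly Action<object> _execute;
    private readonly Func<object, bool> _canExecute;

    public SimpleCommand(Action<object> execute, Func<object, bool> canExecute = null)
    {
        if (execute == null) throw new ArgumentNullException(nameof(execute));
        _execute = execute;
        _canExecute = canExecute;
    }

    // The view subscribes to this event and re-queries CanExecute when it fires.
    public event EventHandler CanExecuteChanged;

    // With no canExecute callback, the command is always enabled.
    public bool CanExecute(object parameter)
    {
        return _canExecute == null || _canExecute(parameter);
    }

    public void Execute(object parameter)
    {
        _execute(parameter);
    }

    // Step 1 of the workflow: the view model calls this after changing state.
    public void RaiseCanExecuteChanged()
    {
        var handler = CanExecuteChanged;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

// Hypothetical view model: Save is only enabled while there are unsaved changes.
public sealed class OrderViewModel
{
    private bool _isDirty;

    public OrderViewModel()
    {
        SaveCommand = new SimpleCommand(
            _ => { SaveCount++; IsDirty = false; },  // step 3: the "event handler"
            _ => IsDirty);                           // step 2: enabled or not
    }

    public SimpleCommand SaveCommand { get; private set; }

    public int SaveCount { get; private set; }

    public bool IsDirty
    {
        get { return _isDirty; }
        set { _isDirty = value; SaveCommand.RaiseCanExecuteChanged(); }
    }
}
```

    In XAML, a button would bind to it with Command="{Binding SaveCommand}"; the object parameter is where a CommandParameter value would arrive.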

    The Prism project has a nice implementation of ICommand called DelegateCommand. Its declaration is fairly simple:


    All we need is to instantiate it and give it an executeMethod callback and, optionally, a canExecuteMethod callback.

    Notice that both the CanExecute and Execute methods accept one argument? We can use this mechanism to bind multiple controls to a single command on the view model, and use the CommandParameter of each control to tell the controls apart.

  • Custom Web API Message Formatter

    ASP.NET Web API supports multiple message formats. The framework uses the “Accept” request header from the client to determine the message format. If you visit a Web API URL with Internet Explorer, you will get a JSON message by default; if you use Chrome, which sends “application/xml;” in the Accept header, you will get an XML message.

    Message format is an extensibility point in Web API. There is a pretty good example on the ASP.NET web site. Recently, when I worked on my SkyLinq open source cloud analytics project, I needed a way to expose data to the client in a very lean manner. I wanted it to be leaner than both XML and JSON, so I used CSV. The example provided on the ASP.NET website is actually a CSV example. The article was last updated on March 8, 2012. Since ASP.NET is advancing rapidly, I found that I needed to make the following modifications to make the example work correctly with Web API 2.0.

    1. CanReadType is now a public function instead of a protected function.
    2. The signature of WriteToStream has changed. The current signature is


    3. When adding the media formatter to the configuration, I have to insert my custom formatter at the first position of the formatters collection. This is because the JSON formatter actually recognizes the “application/csv” header and sends JSON back instead. Example below:


  • Thoughts on programming by composition

    Programming by composition is about putting together a large, complex program from small pieces. In the past, when we talked about composition in the object-oriented world, we declared interfaces. We then passed objects of classes that implement those interfaces to any place that accepts them. However, there are some limitations in this approach:

    1. Interfaces live in namespaces. Except for a few well-known interfaces, most interfaces live in their own private namespaces, so components have to be explicitly aware of each other to compose.
    2. Interfaces with multiple members are often too big a unit for composition. We often desire smaller composition units.

    Composition is much easier in the functional world because the basic units are functions (or lambdas). Functions do not have namespaces, and functions are the smallest units possible. As long as the function signatures match, we can fit them together. In addition, some functional languages rely on a remarkably small number of types, for example, Cons in Scheme. Is there a middle ground where we can do the same in C#, which is an object-oriented language with some functional features?
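
    To make the point about matching signatures concrete, here is a minimal sketch of function composition in C#. The Then extension method is a hypothetical helper written for this illustration; it is not part of the .NET base class library.

```csharp
using System;

public static class FunctionComposition
{
    // Hypothetical helper: chains two functions whenever the output type
    // of the first matches the input type of the second.
    public static Func<TA, TC> Then<TA, TB, TC>(this Func<TA, TB> first, Func<TB, TC> second)
    {
        if (first == null) throw new ArgumentNullException(nameof(first));
        if (second == null) throw new ArgumentNullException(nameof(second));
        return x => second(first(x));
    }
}
```

    Neither function needs to know about the other, and no shared interface or namespace is involved; the only contract is the signature.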

    As the Linq library has demonstrated, types implementing IEnumerable<T> can be easily composed in a fluent style. In addition, my blog on Linq to graph shows that some more complicated data structures can be flattened to IEnumerable<T> and thus consumed by Linq. This takes care of the cases where T is a simple type or a well-known type. For more complicated data types, a type that implements IDictionary<string, object> can store arbitrarily complex data structures, as demonstrated in scripting languages like JavaScript. Many data sources, such as a record from a comma-separated file or an IDataReader, are naturally a Dictionary<string, object>. Many program outputs, such as JSON and XML, also map naturally to IDictionary. The remaining question is whether we can use IDictionary at the heart of our programs.

    Unfortunately, C# code to access a dictionary is ugly; there is currently no syntactic sugar to make it pretty (although VB.NET has some). A combination of the C# dynamic keyword and DynamicObject is the closest thing to making IDictionary access pretty, but DynamicObject carries a huge overhead. So the solution that I propose is to create a strongly-typed wrapper over the underlying dictionary when strong typing is needed. This is not a new idea; the strongly-typed dataset already uses it. We can in fact use C# dynamic to duck-type against strongly-typed wrappers from any namespace. The duck-typing code generated by the C# compiler is far more efficient than DynamicObject. It is fairly easy to generate strongly-typed wrapper code from metadata and templates, and I intend to demonstrate that in later blog posts.
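
    A minimal sketch of the wrapper idea follows. LogRecord and Reports are hypothetical names made up for this illustration (they are not the SkyLinq classes), and the field names are only examples of what a log record might contain.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical strongly-typed wrapper over a dictionary-based record.
public sealed class LogRecord
{
    private readonly IDictionary<string, object> _fields;

    public LogRecord(IDictionary<string, object> fields)
    {
        if (fields == null) throw new ArgumentNullException(nameof(fields));
        _fields = fields;
    }

    // Each property hides one string-keyed lookup and one conversion.
    public string UriStem
    {
        get { return (string)_fields["cs-uri-stem"]; }
    }

    public int Status
    {
        get { return Convert.ToInt32(_fields["sc-status"]); }
    }

    // The raw dictionary stays available for generic, composition-style code.
    public IDictionary<string, object> Fields
    {
        get { return _fields; }
    }
}

public static class Reports
{
    // Duck typing with dynamic: works with any public type that exposes
    // a numeric Status property, whatever namespace it comes from.
    public static bool IsError(dynamic record)
    {
        return record.Status >= 400;
    }
}
```

    The wrapper adds no storage of its own, so generic code can keep passing the dictionary around while typed code reads properties.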

    For an example of an IDictionary implementation over a delimited file record, see the Record class in SkyLinq. For an example of a strongly-typed wrapper, see the W3SVCLogRecord class. I do not have many composition examples yet, but I promise that as the SkyLinq project unfolds, the entire system will be built on composition.

  • Be aware of the memory implication of GroupBy in LINQ

    LINQ functions generally have a very small memory footprint. LINQ functions use lazy evaluation whenever possible. An item is not evaluated until MoveNext() and Current of the enumerator are called, so LINQ only needs memory to hold one element at a time.

    GroupBy is an exception. An IEnumerable<T> is often grouped into one IEnumerable<T> per group before it is processed further. LINQ often has no choice but to accumulate each group in a List<T>-like structure, so the entire IEnumerable<T> ends up loaded in memory at once.

    If we are only interested in the aggregated results of each group, we can optimize GroupBy with the following implementation:


    We use a dictionary to hold the accumulated value for each group. Each element from the IEnumerable<T> is immediately consumed by the accumulator as it is read in. This way, we only need enough memory to hold one accumulated value per group, which is far less than the entire IEnumerable<T> if each group has a large number of elements.
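
    The dictionary-based approach described above might look like the following. This is a sketch under assumptions, not the SkyLinq code: the GroupAggregate name and its signature are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class GroupAggregateExtensions
{
    // Hypothetical helper: folds each element into a per-key accumulator
    // instead of buffering the elements of every group.
    public static IEnumerable<KeyValuePair<TKey, TAccumulate>> GroupAggregate<TSource, TKey, TAccumulate>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        TAccumulate seed,
        Func<TAccumulate, TSource, TAccumulate> accumulate)
    {
        var totals = new Dictionary<TKey, TAccumulate>();
        foreach (var item in source)
        {
            TKey key = keySelector(item);
            TAccumulate current;
            // A key seen for the first time starts from the seed value.
            if (!totals.TryGetValue(key, out current))
            {
                current = seed;
            }
            totals[key] = accumulate(current, item);
        }
        // Memory use is proportional to the number of groups, not elements.
        return totals;
    }
}
```

    Unlike most LINQ operators, this sketch consumes the whole source before returning, but it never holds more than one accumulator per key.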

    If the IEnumerable<T> is ordered by the key order, we can further optimize the code. We would need memory to hold only one group at a time.
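
    For input that is already ordered by key, a streaming variant could look like the sketch below. The OrderedGroupAggregate name and signature are assumptions made for illustration. It emits each group's result as soon as the key changes, so only one accumulator is alive at any time.

```csharp
using System;
using System.Collections.Generic;

public static class OrderedGroupAggregateExtensions
{
    // Hypothetical helper: assumes equal keys are adjacent in the source.
    public static IEnumerable<KeyValuePair<TKey, TAccumulate>> OrderedGroupAggregate<TSource, TKey, TAccumulate>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        TAccumulate seed,
        Func<TAccumulate, TSource, TAccumulate> accumulate)
    {
        var comparer = EqualityComparer<TKey>.Default;
        bool hasGroup = false;
        TKey currentKey = default(TKey);
        TAccumulate total = seed;

        foreach (var item in source)
        {
            TKey key = keySelector(item);
            if (hasGroup && !comparer.Equals(key, currentKey))
            {
                // The key changed, so the previous group is complete.
                yield return new KeyValuePair<TKey, TAccumulate>(currentKey, total);
                total = seed;
            }
            currentKey = key;
            hasGroup = true;
            total = accumulate(total, item);
        }

        // Emit the final group, if the source was not empty.
        if (hasGroup)
        {
            yield return new KeyValuePair<TKey, TAccumulate>(currentKey, total);
        }
    }
}
```

    If the source is not actually ordered by key, the same key is reported more than once, so the ordering is a precondition the caller must guarantee.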

    As usual, all LINQ samples are available in the SkyLinq project on CodePlex.

  • LINQ to Graph

    As promised in my previous post, I now provide some details about an example I gave during my talk at SoCal Code Camp titled “LINQ to Objects A-Z”. In this post, I will discuss LINQ to Graph.

    LINQ to Objects has a large number of functions to query IEnumerable<T>. What if we are dealing with a hierarchy? It turns out LINQ can flatten one-level-down hierarchies with one of the SelectMany methods. For deeper hierarchies, we need to do either a depth-first search (DFS) or a breadth-first search (BFS). In this post, I will provide generic implementations of BFS and DFS. These functions return an IEnumerable<T> that can subsequently be processed with LINQ.

    With DFS, we need to recursively place exposed nodes on a stack and search each node until nothing is left on the stack. With BFS, we use a queue instead of a stack to store unsearched nodes. The implementation of DFS is relatively simple, as recursive function calls already provide a stack implicitly. The following code snippets show the DFS implementation.


    The DFS function accepts three arguments. The first argument is the starting node. The second argument, getInners, is a lambda function that returns the child inner nodes of a parent node. The third argument, getLeafs, is a lambda function that returns the leaf nodes of a parent node.
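
    A recursive DFS with those three arguments could be sketched as follows. This is an illustration rather than the SkyLinq source: the GraphSearch class name, the two type parameters, and the choice to yield a node's leaves before descending into its inner nodes are all assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class GraphSearch
{
    // Hypothetical signature: inner nodes and leaves may have different types.
    public static IEnumerable<TLeaf> DFS<TNode, TLeaf>(
        TNode start,
        Func<TNode, IEnumerable<TNode>> getInners,
        Func<TNode, IEnumerable<TLeaf>> getLeafs)
    {
        // Yield the leaves that hang directly off this node.
        foreach (var leaf in getLeafs(start))
        {
            yield return leaf;
        }

        // Recurse into each inner node; the call stack is the implicit DFS stack.
        foreach (var inner in getInners(start))
        {
            foreach (var leaf in DFS(inner, getInners, getLeafs))
            {
                yield return leaf;
            }
        }
    }
}
```

    Because the method is an iterator, nothing is visited until the result is enumerated, so it composes with Where, Take, and the other LINQ operators.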

    The following code snippets demonstrate using DFS to search a directory for files recursively. We pass the path of the starting directory as the first argument. We use Directory.EnumerateDirectories to implement the getInners function and Directory.EnumerateFiles to implement the getLeafs function.


    The implementation of BFS is longer. We need to implement a queue explicitly. Two helper functions are defined to help manage the queue. The BFS function can be used in place of DFS in the recursive problem; only the order of the outputs will change.
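
    For comparison, here is a minimal BFS sketch with the same three arguments. It is an illustration under assumptions, not the SkyLinq source: the GraphSearchBfs name and signature are invented, and it uses the built-in Queue<T> directly instead of the two helper functions mentioned above.

```csharp
using System;
using System.Collections.Generic;

public static class GraphSearchBfs
{
    // Hypothetical signature mirroring the DFS arguments described in the post.
    public static IEnumerable<TLeaf> BFS<TNode, TLeaf>(
        TNode start,
        Func<TNode, IEnumerable<TNode>> getInners,
        Func<TNode, IEnumerable<TLeaf>> getLeafs)
    {
        // The explicit queue holds inner nodes that have not been searched yet.
        var pending = new Queue<TNode>();
        pending.Enqueue(start);

        while (pending.Count > 0)
        {
            TNode node = pending.Dequeue();

            foreach (var leaf in getLeafs(node))
            {
                yield return leaf;
            }

            // Children go to the back of the queue, so a whole level is
            // finished before the next level starts.
            foreach (var inner in getInners(node))
            {
                pending.Enqueue(inner);
            }
        }
    }
}
```

    For the directory search, a call might look like GraphSearchBfs.BFS<string, string>(startPath, Directory.EnumerateDirectories, Directory.EnumerateFiles), where startPath is whatever directory you want to start from.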


    The source code for this post is available online. In future posts, I will continue discussing applications of LINQ to combinatorial problems.

  • Is it time for cloud-based ASP.NET IDE? (round 2)

    8 months ago, I asked whether it is time for a cloud-based ASP.NET IDE. I have long been dreaming of being able to create web applications on the spot while talking to users. I was able to do that 20 years ago with VB3. Today, the closest thing I can do with a web application is with a CMS like Orchard. To work on a live website, we need an editor that can access the live site. We also need a tool to show the link between the html in the browser and the code that generates the html.

    A lot has happened in the past 8 months. For the cloud-based editor, first I saw Scott Hanselman’s blog about Microsoft’s own cloud editor. Then this editor appeared in Visual Studio Online.

    For the tools that link html with source code, I first saw the very impressive shape tracing tool in Orchard. Then we saw the Browser Link and remote debugging features in Visual Studio 2013.

    So whether the IDE itself is in the cloud or not, the new VS2013 features, together with the Azure feature of deploying directly from a repository, bring us ever closer to being able to work on a live web application in front of a customer.

  • Gave a talk at SoCal Code Camp at USC today titled “Linq to Objects A-Z”

    I gave a talk at SoCal Code Camp on Linq to Objects. With careful categorization of Linq functions, I was able to cover the entire set of Linq functions in only 35 minutes. I was able to spend the rest of the time on demos.

    In my first demo, I showed that I was able to write a top 20 URL type of query using 4 lines of library code and 9 lines of Linq code, without tools like Log Parser. I also demonstrated that I only need to change 2 lines of code to go from querying a single log file to querying a whole directory of log files. It would be just as simple to run the query against multiple servers in parallel.

    In my second demo, I discussed how to turn graph depth-first search (DFS) and breadth-first search (BFS) into a Linq-queryable problem. The class LinqToGraph contains the only DFS and BFS code I ever have to write; the rest can be done in the lambdas passed to the DFS or BFS calls.

    In future posts, I will provide more detailed explanations of the code.


    Link to Powerpoint slides.

    Link to demos.

  • Delegate performance of Roslyn Sept 2012 CTP is impressive

    I wanted to dynamically compile some delegates using Roslyn. I came across this article by Piotr Sowa. The article shows that the delegate compiled with the Roslyn CTP was not very fast. Since the article was written using the Roslyn June 2012 CTP, I decided to give the Sept 2012 CTP a try.

    There are significant changes in the Roslyn Sept 2012 CTP, both in the C# syntax supported and in the API. I found Anoop Madhusudanan’s article, which has an example of the new API. With that, I was able to put together a comparison. In my test, the Roslyn-compiled delegate is as fast as the C# (VS 2012) compiled delegate. See the source code below and give it a try.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Diagnostics;
    using Roslyn.Compilers;
    using Roslyn.Scripting.CSharp;
    using Roslyn.Scripting;

    namespace RoslynTest
    {
        class Program
        {
            public Func<int, int, int> del;

            static void Main(string[] args)
            {
                Stopwatch stopWatch = new Stopwatch();
                Program p = new Program();
                p.SetupDel(); //Comment out this line and uncomment the next line to compare
                //p.SetupScript();
                //Time only the execution of the delegate
                stopWatch.Start();
                int result = DoWork(p.del);
                stopWatch.Stop();
                Console.WriteLine("Time elapsed {0}", stopWatch.ElapsedMilliseconds);
            }

            //Delegate compiled by the regular C# compiler
            private void SetupDel()
            {
                del = (s, i) => ++s;
            }

            //Delegate compiled at runtime by the Roslyn script engine
            private void SetupScript()
            {
                //Create the script engine
                //Script engine constructor parameters got changed
                var engine = new ScriptEngine();
                //Let us use engine's AddReference for adding the required assemblies
                new[]
                {
                    typeof (Console).Assembly,
                    typeof (Program).Assembly,
                    typeof (IEnumerable<>).Assembly,
                    typeof (IQueryable).Assembly
                }.ToList().ForEach(asm => engine.AddReference(asm));
                //Now, you need to create a session using engine's CreateSession method,
                //which can be seeded with a host object
                var session = engine.CreateSession();
                var submission = session.CompileSubmission<Func<int, int, int>>("new Func<int, int, int>((s, i) => ++s)");
                del = submission.Execute();
            }

            //Invoke the delegate about one million times through Aggregate
            private static int DoWork(Func<int, int, int> del)
            {
                int result = Enumerable.Range(1, 1000000).Aggregate(del);
                return result;
            }
        }
    }


    Since Roslyn Sept 2012 CTP is already over a year old, I cannot wait to see a new version coming out.

  • Gave 3 presentations at SoCal Code Camp (UCSD) today

    I gave 3 presentations at SoCal Code Camp today at University of California, San Diego.

    The first two presentations were co-presented with our summer intern Christopher Chen. Click the links below to download the PowerPoint slides.

    Creating an Orchard website on Azure in 60 minutes

    Customizing Orchard Websites without limit

    My source code can be found on GitHub:

    Lastly, several people asked whether we can create a mailing list to keep in touch. I created a LinkedIn group called SoCal Orchard SIG. You are welcome to join!

    My last talk was:

    Machine learning made simple

  • ASP.NET ReportViewer “report execution has expired or cannot be found” error when using session state service or SQL Server session state

    We encountered an error like:

    [ReportServerException: The report execution x5pl2245iwvvq055khsxzlj5 has expired or cannot be found. (rsExecutionNotFound)]
       Microsoft.Reporting.WebForms.ServerReportSoapProxy.OnSoapException(SoapException e) +72
       Microsoft.Reporting.WebForms.Internal.Soap.ReportingServices2005.Execution.ProxyMethodInvocation.Execute(RSExecutionConnection connection, ProxyMethod`1 initialMethod, ProxyMethod`1 retryMethod) +428
       Microsoft.Reporting.WebForms.Internal.Soap.ReportingServices2005.Execution.RSExecutionConnection.GetExecutionInfo() +133
       Microsoft.Reporting.WebForms.ServerReport.EnsureExecutionSession() +197
       Microsoft.Reporting.WebForms.ServerReport.LoadViewState(Object viewStateObj) +256
       Microsoft.Reporting.WebForms.ServerReport..ctor(SerializationInfo info, StreamingContext context) +355

    [TargetInvocationException: Exception has been thrown by the target of an invocation.]
       System.RuntimeMethodHandle._SerializationInvoke(Object target, SignatureStruct& declaringTypeSig, SerializationInfo info, StreamingContext context) +0
       System.Reflection.RuntimeConstructorInfo.SerializationInvoke(Object target, SerializationInfo info, StreamingContext context) +108
       System.Runtime.Serialization.ObjectManager.CompleteISerializableObject(Object obj, SerializationInfo info, StreamingContext context) +273
       System.Runtime.Serialization.ObjectManager.FixupSpecialObject(ObjectHolder holder) +49
       System.Runtime.Serialization.ObjectManager.DoFixups() +223
       System.Runtime.Serialization.Formatters.Binary.ObjectReader.Deserialize(HeaderHandler handler, __BinaryParser serParser, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage) +188
       System.Runtime.Serialization.Formatters.Binary.BinaryFormatter.Deserialize(Stream serializationStream, HeaderHandler handler, Boolean fCheck, Boolean isCrossAppDomain, IMethodCallMessage methodCallMessage) +203
       System.Web.Util.AltSerialization.ReadValueFromStream(BinaryReader reader) +788
       System.Web.SessionState.SessionStateItemCollection.ReadValueFromStreamWithAssert() +55
       System.Web.SessionState.SessionStateItemCollection.DeserializeItem(String name, Boolean check) +281
       System.Web.SessionState.SessionStateItemCollection.DeserializeItem(Int32 index) +110
       System.Web.SessionState.SessionStateItemCollection.get_Item(Int32 index) +17
       System.Web.SessionState.HttpSessionStateContainer.get_Item(Int32 index) +13
       System.Web.Util.AspCompatApplicationStep.OnPageStartSessionObjects() +71
       System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint) +2065

    This error occurs long after the report viewer page has closed. It occurs on any page in the application, rendering the entire application unusable until the user gets a new session.

    The cause of the problem is that the ReportViewer uses session state. When a page retrieves the session from any out-of-process session store, the session variable of type Microsoft.Reporting.WebForms.ReportHierarchy is deserialized from the session storage. The deserialization can cause the object to connect to the report server when the report is no longer available.

    The solution is simple but not pretty. We need to clean up the session variable when the report viewer page is closed. One way is to add some JavaScript to the page to handle the window.onunload event. In the event handler, call a web service to clean up the session variable. The name of the session variable appears to be randomly generated, so we need to loop through the session variables to find one of the type Microsoft.Reporting.WebForms.ReportHierarchy. Microsoft has implemented pinging between the report viewer and the report server to keep the report alive on the server while the report viewer is up; I hope they will go one step further and take care of this problem.

  • Running an intern program

    This year I am running an unpaid internship program for high school students. I work for a small company. We have ideas for a few side projects but never have time to do them. So we are experimenting by making them intern projects. In return, we give these interns guidance to learn, personal attention, and opportunities to work on real-world projects.

    A few years ago, I blogged about the idea of teaching kids to write applications with no more than 6 hours of training. This time, I was able to reduce the instruction time to 4 hours and immediately put them on real-world projects. When they encounter problems, I combine directions and pointers to various materials on W3Schools, Udacity, Codecademy and YouTube, and I encourage them to look for solutions with search engines. Now entering the third week, I am more than encouraged and feel accomplished. Our most senior intern, Christopher Chen, is a recent high school graduate and is heading to UC Berkeley to study computer science after the summer. He previously had only one year of Java experience through the AP computer science course and had no web development experience. Only 12 days into his internship, he has already gained advanced CSS skills, with a deeper understanding than more than half of the “senior” developers that I have ever worked with. I put him on a project to migrate an existing website to the Orchard content management system (CMS), which is new to me as well. We were able to teach each other and quickly gain advanced Orchard skills such as creating custom themes and modules. It felt very much like the relationship between a professor and a graduate student. On the other hand, I fully expect that I will lose him next summer to companies like Google, Facebook or Microsoft.

    As a side note, Christopher and I will give a two-part Orchard presentation together at the next SoCal Code Camp at UC San Diego, July 27-28. The first part, “creating an Orchard website on Azure in 60 minutes”, is an introductory lecture in which we will discuss how to create a website using Orchard without writing code. The second part, “customizing Orchard websites without limit”, is an advanced lecture in which we will discuss custom theme and module development with WebMatrix and Visual Studio.

  • Fun with Orchard, Webmatrix, Git and Azure

    Why Orchard?

    Our company wants to convert an old html site to a content management system. We considered WordPress and Orchard and picked Orchard. This post strongly influenced us. It is ASP.NET MVC vs. PHP. We are a .NET shop. We thought it would be easier to do custom development with Orchard.

    Why Webmatrix?

    There are a few reasons we used Webmatrix:

    1. Webmatrix works with Orchard very well. Many Orchard training materials use Webmatrix.
    2. Webmatrix is free. We have interns here working on Orchard. We do not have to consume a Visual Studio license.
    3. Webmatrix has an excellent story working with both Git and Azure.

    Why Git?

    We would like to have a version control system. Git is a free and open source distributed version control system. There are several free Git hosts, such as GitHub, Codeplex, and BitBucket. So we picked Git.

    Why Azure?

    We would like to have a website that our entire team can see. An Azure web site is an excellent option for us. It is very easy to host Orchard on an Azure website, either with Sql Compact Edition or with Azure SQL Database. Azure also integrates with Git very well.

    So how do they work together?

    Although it is possible to edit an Azure-hosted web site directly with Webmatrix, we hosted our source code in Git because we need a source control system.

    It is very easy to work with Git from Webmatrix. One can use git with an existing site or open a site directly from git.

    The Sql Server migration solution in Webmatrix is among the easiest ways to migrate a Sql CE database to Azure SQL Database.

    It is also fairly easy to set up an Azure website to pull source code from git directly and build it. Each time we push a changeset to git, Azure is notified and automatically pulls and builds the website. For Orchard, the “build” is actually no more than an xcopy. We just need to embed a .deployment file in our source code. See David Haden’s post for more details.

  • The history.back and refresh problem

    It is fairly common in applications that we have a list page/details page pairs. When we need to add a new record or edit an existing record, we click on the “Add New” link or the “Edit” link from the list page to go to the details page. When we finish adding or editing, we go back to the list page and expect to see the information updated.

    At first glance, a developer can use history.back if no post-back occurred or use response.redirect if a post-back occurred. However, what if the list page has a filter or is at a page number other than the first? How do we return to where we were on the list page?

    If we search Bing for “history.back and refresh”, we can see a number of discussions. However, there are no good solutions. It is difficult to have a JavaScript-only solution because JavaScript state lives only as long as the page. So we need a bit of assistance from the server, but we do not want to rely on lingering session variables because we would leave many of them behind if the application has many list/detail pairs. The following is one possible solution to the problem:

    1. Change links to the detail page on the list page to post-backs. For ASP.NET Web Forms, this can be easily done by changing links to LinkButtons.
    2. Upon posting back to the list page, capture the current filter criteria and page number. Save the information to a session variable and do a response.redirect to the details page.
    3. From the details page, save the filter data from the list page into a hidden control or use ViewState if using ASP.NET Web Form. Remove the session variable.
    4. Use post-back in the details page for both “Save” and “Cancel”. Save the filter data into the session again and do a response.redirect to the list page.
    5. From the list page, check the existence of a session variable for filter. If it exists, populate the filter and page number and remove it from the session. We have thus returned to where we were on the list page and have left no trace of session variables behind.
  • Controlling the display of ASP.NET validator controls

    Recently, I worked on an old Web Forms project and used the validation controls. I had previously used some features of the validation controls, but I encountered more scenarios this time. I found that these controls are fairly flexible and can be configured to work in many different ways:

    Validator Controls

    The most important properties that control the display are Text, ErrorMessage and Display properties.

    What is the difference between the Text and ErrorMessage properties? According to MSDN:

    If the Text property is set, that text will override the text specified in the ErrorMessage property and appear in the validation control. However, the text specified by the ErrorMessage property always appears in the ValidationSummary control.

    One scenario is that you might set Text to “*” and ErrorMessage to the description of the error. You will see an “*” next to the control to validate and a detailed message in the validation summary section.

    The Display property takes 3 values:

    Display Behavior    Description
    None                The validation message is never displayed inline.
    Static              Space for the validation message is allocated in the page layout.
    Dynamic             Space for the validation message is dynamically added to the page if validation fails.

    To show error only in the validation summary section, set the Display property of the validator controls to None.

    Validation Summary Controls

    The most important properties that control the display are HeaderText, DisplayMode, ShowSummary and ShowMessageBox properties.

    If we turn ShowSummary off and ShowMessageBox on, the summary will be displayed as a message box. Note that EnableClientScript must be left set to true to use ShowMessageBox.

    Suppose we only want to display a “*” next to each control and a generic message in the summary. We only need to set the Text of each validator control to “*”, set ErrorMessage to blank, and put the generic message in the HeaderText of the ValidationSummary control.

    The ASP.NET validator controls have a good combination of properties to handle all the scenarios that I encountered in the project.


    We are planning to update our very old web application so that it works better with modern web browsers. I had heard of a site-scanning service in the past and decided to give it a try. The site has a scanner that can scan a URL entered by a user. Unfortunately, our site requires log-in before getting to the page we want to scan. Fortunately, the site also offers a downloadable scanner that we can use in our local development environment.

    I downloaded the scanner. It is packaged as a zip file. I unzipped the package and opened the readme file. The tool requires node.js to work. Fortunately, it is fairly easy to set up the tool by following the instructions. Firstly, download and install node.js; accept all defaults. The setup will add Node.js to the path. Although Node.js is known for hosting asynchronous web applications, it is actually just a JavaScript execution environment. Next, I opened the command prompt, changed to the directory where I unzipped the tool and ran “node lib/service.js”. A node-hosted web server then listens on port 1337, so we just need to open a web browser and point it to http://localhost:1337/. We can then enter the URL of a web page to scan. The scanner submits the data to the site and a report is generated. The report is very good; it offers many suggestions and links that help us improve the page.

  • Is it time for cloud-based ASP.NET IDE?

    We already have a standard IDE in Visual Studio. We also have a very innovative IDE in WebMatrix that can work with node.js and php in addition to ASP.NET. Why do we need another IDE? Well, let me first talk about the feasibility and then throw in some ideas on what we could do with it.

    I have been aware of CodeMirror for a long time. Recently, I also noticed ACE, on which the Cloud9 IDE is based. I was impressed that both CodeMirror and ACE supported TypeScript soon after it was released. I was also impressed that the Cloud9 IDE can debug Node.js code. All these projects have benefited from the rapid improvement of the JavaScript engines in modern browsers, and I think that the era of very functional web-based IDEs has arrived.

    So what can we do with a cloud-based IDE? Firstly, as Cloud9 has demonstrated, the code lives in the cloud. Secondly, people can collaborate. Thirdly, I think it is possible to improve the experience beyond what we have seen before, so let me elaborate below.

    In Visual Studio, when we want to see what the ASP.NET page looks like, we switch to the designer mode, but that is not really close to what we see at runtime. In a cloud-based IDE, we can see the code and how it renders side-by-side, as JsFiddle has demonstrated. ASP.NET does not execute the code on the website directly; it first compiles the code into the “Temporary ASP.NET Files” directory and executes the assemblies from there. That makes it feasible to edit and run the code at the same time.

    Visual Studio has been heavy on ORM (i.e., Entity Framework) and lighter on code generation. I have been favoring code generation over ORM. This is because with ORM we are tweaking a black box to generate the desired SQL. With code generation, we can see exactly what we get and tweak the templates when necessary. Visual Studio has limited runtime information; it gets its information either from parsing the code or from metadata stored in XML files. With a cloud-based IDE that is working with running code, it is possible to get richer runtime information and generate code (or scaffold) under a much wider range of scenarios. So how do we map the server-side code to html in the browser? That is where ideas like source maps could help, and we already have very good tools for querying the DOM.

    So I believe all the technical pieces for a cloud-based ASP.NET IDE are available.

  • Lessons from the ASP Classic Compiler project

    I have not done much with my ASP Classic Compiler project for over a year now. The lack of additional work is due to both economic and technical reasons. I will try to document the reasons here, both for myself and for would-be open source developers.

    How did the project get started?

    I joined my current employer at the end of 2008. The company had a large amount of ASP Classic code in its core product. The company was in talks with a major client about customization, so there was a period when the development team had a fairly light load. I had a chance to put a lot of thought into how to convert the ASP Classic code to ASP.NET. One of the ideas was to compile ASP Classic code into .NET IL so that it could be executed within the ASP.NET runtime.

    I do not have a formal education in computer science; my Ph.D. is in physics. I had been very fond of writing parsers by following the book “Writing Compilers & Interpreters” since 1996. I wrote a VBA-like interpreter and a mini web browser, all in Visual Basic, using the knowledge acquired. Nevertheless, a full-featured compiler is still a major undertaking. I spent several months of my spare time working on the parser and compiler, and studied the theory behind them. I was able to implement all the VBScript 3.0 features, and the compiler was able to execute several Microsoft best-practice applications. The code runs about twice as fast as ASP Classic. I had lots of fun experimenting with ideas such as using StringBuilder for string concatenation to drastically improve the speed of rendering. I was awarded Microsoft ASP/ASP.NET MVP in 2010. In 2011, I donated the project to the community by opening the source code after consulting with my employer.

    Since then, I have faced challenges on two fronts: adding VBScript 5.0 support and solving a large number of compatibility issues in the real world. After working for a few more months, I found it difficult to sustain the project.

    Lesson 1: the economic and business issues

    1. Sustainability of an open source project requires business arrangements. I am an employee. Technical activities outside of work are welcome as long as they benefit the employer. However, an excessive amount of activity unrelated to work could become a loyalty issue.

    2. ASP Classic is a shrinking platform. So ASP Classic Compiler would not be a sound investment for most businesses, except for Microsoft, which has an interest in moving customers from older to newer platforms painlessly.

    Lesson 2: the technical issues

    1. VBScript is a terrible language without a formal specification or test suites. The best “spec” outside of the users’ guide is probably Eric Lippert’s blog. The language is not getting much maintenance and it is losing favor. As a result, few other developers with compiler knowledge would have an interest in VBScript.

    2. Like the IronPython and Rhino projects, I started by writing a compiler. I was thrilled that my first implementation, without much optimization, was about twice as fast as VBScript in my benchmark test. However, I ran into many obstacles with compatibility and with adding the debugging feature. If I were to do it again, I would probably implement it as an interpreter with an incremental compiler. The interpreter would have a smaller start-up overhead. I could then have a background thread compile the heavily used code. The delayed compilation would allow me to employ more sophisticated data-flow analysis and construct larger call-site blocks. The current naïve implementation on the DLR results in too many very small call-sites, so type-checking at call-sites is a significant overhead.

    In summary, I had lots of fun with the project. I significantly improved my theoretical knowledge of algorithms in general, and of parsers and compilers in particular. I had the honor of being awarded Microsoft ASP.NET/IIS MVP for the past 3 years and enjoyed my private copy of the MSDN subscription at home. After a working prototype, the remaining work to turn it into production-quality software is more a business problem than anything else. I am actively contemplating some new forward-looking ideas and hope to have some results to share soon.

  • ASP.NET and Open Source

    I just came back from the Microsoft MVP Summit 2013. I was surprised and excited to learn that there are a large number of open source projects from both inside and outside of Microsoft. There is also strong support for open source frameworks in Visual Studio. I am glad to see that the Microsoft ASP.NET team has done a great job supporting open source and that the community is going strong.

    Open source projects from the ASP.NET team

    Those who are interested in the open source projects from the ASP.NET team should first visit the portal to the many ASP.NET features whose source code Microsoft has opened.

    Next, readers should visit the site that is the home of the latest source code of MVC, Web API and Web Pages.

    The following are several open source projects that have been incorporated into Visual Studio:

    The following projects are considered experimental:

    Projects from outside of Microsoft

    If you think there are no needs for another framework since ASP.NET MVC is already great, you would be surprised to find out some open source alternatives have actually attracted many followers:

    Open Source Client-Side MVC or MVVM frameworks supported by Microsoft Visual Studio

    Microsoft ASP.NET and Web Tools 2012.2 comes with a single page application (SPA) project template that uses KnockoutJS. However, Mads Kristensen also built project templates for several other highly popular JavaScript SPA libraries and frameworks: Breeze, EmberJS, DurandalJS and Hot Towel. Visit for download links.

    Don’t forget to install Mads’ Web Essentials extension for Visual Studio 2012. You will be pleasantly surprised by the number of features that this extension adds to Visual Studio 2012.

    Final Notes

    I gathered these links for myself and others. If I missed any links, please post a comment or send me a note. I will be glad to update this page.

  • Build a dual-monitor stand for under $20

    Like many software developers, I use a laptop as my primary computer. At home, I like to connect it to an external monitor and use the laptop screen as the secondary monitor. I like to bring the laptop screen as close to my eyes as possible. I got an idea when I visited IKEA with my wife yesterday.

    IKEA carries a wide variety of legs and boards. I bought a box of four 4” legs for $10 (the website shows $14 but I got mine in the store for $10) and a 14” by 46” pine wood board for $7.50. After a simple assembly, I got my sturdy dual-monitor stand.

    Dual-monitor stand

    I place both my wireless keyboard and mouse partially underneath the stand. That allows me to bring the laptop closer. The space underneath the stand can be used as temporary storage space. This simple idea has worked pretty well for me.

  • Importing a large file into a SQL Server database employing some statistics

    I imported a 2.5 GB file into our SQL Server today. What made this import challenging is that the SQL Server Import and Export Wizard failed to read the data. The file is tab-delimited with CR/LF as the record separator. A line is truncated if the rest of the line is empty, resulting in a variable number of columns in each record. Also, there is a memo field that contains CR/LF, so it is not reliable to use CR/LF to break the records.

    I had to write my own import utility. The utility does the import in two phases: analysis and import. In the first phase, the utility reads the entire file to determine the number of columns, the data type of each column, its width and whether it allows null. In order to address the CR/LF problem, the utility uses the following algorithm:

    1. Read a line to determine the number of columns in the line.

    2. Peek at the next line and determine the number of columns in it. If the sum of the two numbers is less than the expected column count, the second line is a candidate for merging into the first line. Repeat this step.
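
    The merging rule above can be sketched as follows. This is a minimal Python illustration, not the original utility: the function name `merge_short_lines`, the `expected_columns` parameter and the choice to rejoin the pieces with a newline are all assumptions made for the example. It merges every candidate greedily, so a genuinely short record that follows a split record could be merged by mistake.

```python
def merge_short_lines(lines, expected_columns, delimiter="\t"):
    """Greedily merge physical lines that a CR/LF inside a field split apart.

    `lines` are physical lines without their trailing CR/LF.
    `expected_columns` is assumed to be known (hypothetical parameter).
    """
    records = []
    i = 0
    while i < len(lines):
        current = lines[i]
        i += 1
        # Peek at the next line; keep merging while the two column counts
        # add up to less than the expected column count.
        while i < len(lines):
            current_count = len(current.split(delimiter))
            next_count = len(lines[i].split(delimiter))
            if current_count + next_count >= expected_columns:
                break
            # Assumption: restore the embedded line break as "\n".
            current = current + "\n" + lines[i]
            i += 1
        records.append(current)
    return records


if __name__ == "__main__":
    sample = [
        "1\tAlice\tfirst half of memo",   # 3 columns, memo was split here
        "second half",                    # 1 column, continuation of the memo
        "2\tBob\tnote\tx\ty\tz",          # a full 6-column record
    ]
    for record in merge_short_lines(sample, expected_columns=6):
        print(repr(record))
```

    Because the rule only identifies candidates, the result still needs to be checked against the schema, which is what the import phase does.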

    When determining the column type, I first initialize each column with a narrow type and then widen the column when necessary. I widen the columns in the following order: bool -> integral -> floating -> varchar. Because I am not completely sure that I merged the lines correctly, I relied on probability rather than the first occurrence to widen the field. This allows me to run the analysis phase in only one pass. The drawback is that I do not have 100% confidence in the extracted schema; data that does not fit the schema would have to be rejected in the import phase. The analysis phase took about 10 minutes on a fairly slow (by today’s standards) dual-core Pentium machine.
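
    One way to read "probability rather than the first occurrence" is to count, per column, how many values need each type and to widen only when more than a small fraction of values demand it. The Python sketch below illustrates that idea. The `tolerance` threshold, the set of tokens accepted as bool and the function names are assumptions, not details of the original utility.

```python
from collections import Counter

# Widening order described above, from narrowest to widest.
WIDENING_ORDER = ["bool", "integral", "floating", "varchar"]


def narrowest_type(value):
    """Return the narrowest type in WIDENING_ORDER that can hold `value`."""
    text = value.strip().lower()
    if text in ("0", "1", "true", "false"):  # assumed bool tokens
        return "bool"
    try:
        int(text)
        return "integral"
    except ValueError:
        pass
    try:
        float(text)
        return "floating"
    except ValueError:
        return "varchar"


def infer_column_type(values, tolerance=0.001):
    """Pick the narrowest type that fits all but a `tolerance` fraction of values.

    Values that need a wider type than the one chosen are treated as
    probable garbage and are left to be rejected at import time.
    """
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return WIDENING_ORDER[0]  # nothing forced the column to widen
    counts = Counter(narrowest_type(v) for v in non_empty)
    required = len(non_empty) * (1 - tolerance)
    fitting = 0
    for type_name in WIDENING_ORDER:
        # Every value whose narrowest type is at or below this one fits.
        fitting += counts[type_name]
        if fitting >= required:
            return type_name
    return WIDENING_ORDER[-1]
```

    With `tolerance=0`, this degenerates to the deterministic approach in which a single stray value widens the whole column.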

    In the import phase, I did line merging similar to the analysis phase. The only difference is that, with statistics from the first phase, I was able to determine whether a short line (i.e. a line that has a small number of columns) should merge with the previous line or the next line. The criterion is that the fields split from the merged line have to satisfy the schema. I imported over 8 million records and had to reject only 1 record. I visually inspected the rejected record and the data is indeed bad. I used SqlBulkCopy to load the data in batches of 1,000 records. It took about 1 hour and 30 minutes to import the data over the WAN into our SQL Server in the cloud.
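
    The schema criterion can be illustrated with a short Python sketch. It is only an approximation of the idea: the type names, the rule that empty fields always pass, and the helper names `value_fits`, `satisfies_schema` and `choose_merge` are assumptions introduced for the example.

```python
def value_fits(value, type_name):
    """Check whether a single field is acceptable for a column type."""
    if value == "" or type_name == "varchar":
        return True  # assumption: empty fields are treated as null
    if type_name == "bool":
        return value.lower() in ("0", "1", "true", "false")
    parser = int if type_name == "integral" else float
    try:
        parser(value)
        return True
    except ValueError:
        return False


def satisfies_schema(line, schema, delimiter="\t"):
    """Check the fields split from `line` against the list of column types."""
    fields = line.split(delimiter)
    # Trailing empty columns may have been truncated, so fewer fields are
    # allowed, but never more than the schema has columns.
    if len(fields) > len(schema):
        return False
    return all(value_fits(f, t) for f, t in zip(fields, schema))


def choose_merge(previous, short, following, schema):
    """Decide where a short line belongs: 'previous', 'next' or 'reject'."""
    if satisfies_schema(previous + "\n" + short, schema):
        return "previous"
    if satisfies_schema(short + "\n" + following, schema):
        return "next"
    return "reject"


if __name__ == "__main__":
    schema = ["integral", "varchar", "varchar", "floating"]
    print(choose_merge("1\tAlice\tmemo starts",
                       "memo ends\t3.5",
                       "2\tBob\tshort memo\t1.0",
                       schema))
```

    A line that fits neither neighbor is rejected rather than forcing the schema wider, which matches the single rejected record mentioned above.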

    In conclusion, rather than determining the table schema fully deterministically, I employed a little bit of statistics in the process. I was able to determine the schema faster (with one pass) and with higher fidelity (rejecting bogus data rather than accepting it by widening the schema). After all, statistics is an important part of machine learning today, and it allows us to inject a little bit of intelligence into the ETL process.