April 2010 - Posts

jQueryTagCloud

Minifiers

JavaScript minifiers are popular these days: Closure Compiler, YUI Compressor, and Microsoft Ajax Minifier, to name a few. Using one is essential for any site that uses more than a little script and cares about performance.

Each tool, of course, has advantages and disadvantages, but they all do a pretty good job. The results vary only slightly in the grand scheme of things, not enough for me to say you should always use one over another. Use whatever fits your environment best.

Tag Clouds

Anyway, it got me thinking. After crunching a script through one of these bad boys, what’s left?

The first thing I did was take jQuery 1.4.2 (the minified version) and push it into the tag cloud creator, Wordle. BAM! Beautiful, isn’t it?

It’s not that surprising that two of the longest keywords in JavaScript also happen to use up the most space: return, and function. What does it matter? The word function appears in jQuery 404 times, adding up to 3,232 bytes. That’s about 4.5% of the size of the library! return appears 385 times, adding up to 2,310 bytes, or 3.2%. So there you go – return and function make up a total of almost 8% of the size of jQuery!
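The arithmetic behind those figures is just occurrences times keyword length. Here is a minimal sketch of it in JavaScript. The counts come from the paragraph above; the total script size is an assumption (the post never states it), picked at roughly 72,000 bytes because that is consistent with the quoted percentages.

```javascript
// Counts quoted in the post for jQuery 1.4.2 (minified).
const keywords = [
  { token: "function", count: 404 },
  { token: "return", count: 385 },
];

// ASSUMPTION: hypothetical total size, not stated in the post. It is only
// used to turn byte counts into approximate percentages.
const assumedScriptBytes = 72000;

// Bytes used by a token = occurrences x token length (one byte per ASCII char).
const rows = keywords.map(({ token, count }) => {
  const bytes = count * token.length;
  const percent = (bytes / assumedScriptBytes) * 100;
  return { token, count, bytes, percent };
});

// Combined size of both keywords.
const totalBytes = rows.reduce((sum, row) => sum + row.bytes, 0);

for (const row of rows) {
  console.log(
    `${row.token}: ${row.count} x ${row.token.length} = ${row.bytes} bytes ` +
      `(about ${row.percent.toFixed(1)}%)`
  );
}
console.log(`combined: ${totalBytes} bytes`);
```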

Really makes me wish JavaScript had C# style llamas – err, lambdas.

Tokenizing

There are some problems here, though. No tag cloud generator I could find was intended to be run on code, so they ignore things like operators and brackets, even though tokens like semicolons are obviously frequent. Nor do they provide any kind of data feed of the document’s tokens. So I created my own tool to generate the data.

Basically, I just have a list of possible tokens, and I run a regular expression on the code to determine how many times each occurs. Then I multiply each count by the token’s length to get the total size of that token in the script.
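To make the idea concrete, here is a rough sketch of the same count-then-multiply approach in JavaScript. It is an illustration only, not the tool that produced the results: the operator list is a shortened, hypothetical one, and the names `operators`, `escapeForRegex` and `tokenSizes` are made up for this example.

```javascript
// Hypothetical, shortened operator list. Longer operators come first so that
// "===" is not counted as "==" followed by "=".
const operators = ["===", "!==", "==", "!=", "&&", "||", "=", "(", ")", "{", "}", ";", ".", ","];

// Escape regex metacharacters so each operator is matched literally.
function escapeForRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns one entry per distinct token with its count and total size in bytes,
// largest total size first.
function tokenSizes(source) {
  // Identifiers, keywords and numbers match as whole words; anything else
  // must be one of the listed operators. Whitespace is skipped.
  const pattern = new RegExp(
    "[A-Za-z_0-9$]+|" + operators.map(escapeForRegex).join("|"),
    "g"
  );
  const counts = new Map();
  for (const token of source.match(pattern) || []) {
    counts.set(token, (counts.get(token) || 0) + 1);
  }
  return [...counts]
    .map(([token, count]) => ({ token, count, bytes: count * token.length }))
    .sort((a, b) => b.bytes - a.bytes);
}

const sample = "function f(a){return a===1}function g(b){return b}";
console.log(tokenSizes(sample));
```

Sorting by `bytes` rather than by `count` is what lets a long keyword that appears rarely outrank a short token that appears often.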

Results

It’s amazing what the results show. The top 15 tokens represent 35% of the entire script, mostly single-character tokens:

jQueryTopTokens

It makes sense that function is the top token in size, since it is 8 characters long, even though it only occurs 404 times. But look at “.”. Yes, dot. It’s only one character long, but it represents 2,565 bytes. “(” and “)” together make up over 5k. return, despite being 6 characters long, is in 5th place.

What does it mean?

Well, actually, the fact that these syntax-related tokens are so high on the list is partially a testament to the effectiveness of minifiers. A minifier can only remove so much of the syntax, and it can’t shorten the tokens themselves. So if a minifier is doing its job, they should tend to bubble up to the top of the list.

Thankfully, Wordle supports an advanced mode that lets you enter the tokens and their weights manually. Armed with the output of this tool, here is the entire result set in tag cloud form. The relative sizes of the tags aren’t really correct, though, simply because a “.” is smaller than any letter in whatever font is used. Also, I don’t really know why the first use of Wordle produced a cloud that shows return bigger than function. I guess it’s just a bug in how it counts. All the more reason to use the advanced mode.

jQueryTopTokensCloud

One thing the data does show is ways that minifiers could do an even better job, or ways that we can write code to reduce these tokens. For example, in theory a minifier could convert functions that use ‘return’ to assign to a parent-scope variable instead. That’s a fairly complex thing to do, so it’s probably not worth it (performance seems equivalent, though). You can also try to structure your code so a function has only one ‘return’ instead of several.
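As a small illustration of both ideas, here is the same function written three ways. This is a hand-written sketch; the function names and the `result` variable are hypothetical, and no existing minifier is claimed to perform the last rewrite.

```javascript
// Two `return` keywords.
function signMultiReturn(n) {
  if (n < 0) return -1;
  return n > 0 ? 1 : 0;
}

// Restructured so the function has a single `return`.
function signSingleReturn(n) {
  return n < 0 ? -1 : n > 0 ? 1 : 0;
}

// The theoretical rewrite: the result is assigned to a parent-scope variable,
// so the function body contains no `return` at all.
let result;
function signViaOuterVariable(n) {
  result = n < 0 ? -1 : n > 0 ? 1 : 0;
}

console.log(signMultiReturn(-3), signSingleReturn(-3));
signViaOuterVariable(-3); // callers must read `result` after the call
console.log(result);
```

The last form hints at why this is complex for a tool to do: every call site has to be rewritten to read the shared variable, and nested or recursive calls can overwrite it before the caller reads it.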

This tool can also help you find other tokens in your code that use a lot of space. For example, I applied it to the MicrosoftAjax.js script from .NET 3.5, and found to my horror that ‘JavaScriptSerializer’ was near the top of the list. And that is why in the AjaxControlToolkit you will find this script has been greatly reduced in size. Despite having many new features, it’s 10k smaller (in minified form) than it was in .NET 3.5, partially thanks to this tool helping me identify the areas that needed improvement.

Notice that ‘getElementsByTagName’ appears in jQuery a noticeable amount. It occurs 17 times, or 340 bytes. Less obvious is how often the characters ‘a’, ‘b’, etc. occur. These are the short names the minifier has given to local variables. ‘a’ is high on the list, since it is the first one used, but there are many in the top 100, all the way up to the letter ‘o’, totaling 5,228 bytes. So a minifier could do well to understand how local variables are used and reuse existing ones when they are no longer needed.

The code is fairly simple. Again, this was something I wrote one evening off the cuff, so it’s not perfect (although, thanks to Brad and Damian for the nice LINQ way of converting the char array to a string array).


This is the code in C# for .NET 3.5.

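using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class Tokenizer {
    // Returns each distinct token in the script with its number of occurrences,
    // most frequent first. Multiply a count by the token's length to get its size.
    public static IEnumerable<KeyValuePair<string, int>> Tokenize(string content) {
        // Operators and punctuation to count. Regex alternation takes the first
        // alternative that matches, so longer operators are listed ahead of their prefixes.
        // Note: a bare "=", "<" or ">" is not in this list, so those are not counted.
        string[] tokens =
            { "===", "!==", "==", "<=", ">=", "!=", "-=", "+=", "*=", "/=", "|=", "%=", "^=", ">>=", ">>>=", "<<=",
                "++", "--", "+", "-", "*", "\\", "/", "&&", "||", "&", "|", "%", "^", "~", "<<", ">>>", ">>",
                "[", "]", "(", ")", ";", ".", "!", "?", ":", ",", "'", "\"", "{", "}" };
        // Prefix every character of every token with a backslash so the regex
        // treats it literally.
        var escapedTokens = from token in tokens
                            select ("\\" + string.Join("\\",
                            (from c in token.ToCharArray() select c.ToString()).ToArray()));
        // Identifiers, keywords and numbers match as whole words; anything else
        // must be one of the tokens above. Whitespace is skipped.
        string pattern = "[a-zA-Z_0-9\\$]+|" + string.Join("|", escapedTokens.ToArray());
        var r = new Regex(pattern, RegexOptions.Compiled | RegexOptions.ExplicitCapture);
        // Group identical matches and order the groups by how often they occur.
        return from m in r.Matches(content).Cast<Match>()
               group m by m.Value into g
               orderby g.Count() descending
               select new KeyValuePair<string, int>(g.Key, g.Count());
    }
}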
 

You can also download the raw CSV file with the results for jQuery here.

Update: Thanks to David Fowler for linq-ifying the code and making it half as long!

Happy coding!

Hard to believe it’s been so long, but it was almost 4 years ago that I published Join the Dark Side of Visual Studio. That was when a lot of people were still using VS2003, and importing and exporting environment settings required a custom add-in, VSStyler, which has since fallen off the planet and is hard to find (link, anyone? Let me know). Three versions of VS later, I’m still using and loving the dark side. Pleased, I am (haha). In fact, for one reason or another, that article is still one of my most popular blog entries, thanks in part to a link from Scott Hanselman and a commenter on Coding Horror. I will point out selfishly that my article predates both of these :) But, yes, it’s sad when one of your top referrers is a link in a comment on another blog. Not even the first comment, either.

That article even inspired someone out there to register a new domain: http://darksideofvisualstudio.net/

Now that Visual Studio 2010 is out, I decided to repost with my latest settings, exported from Visual Studio 2010. So here you go. It’s mostly the same as the original, but with some improvements. I lightened some of the blues, which were hard to read on some not-so-great monitors, and set the font to Consolas (though you may prefer Inconsolata). I also fixed a few neglected settings, like the refactoring view and XML.

I should also point out that unfortunately, even in the new WPF-based VS2010, you are very limited in what colors you can change in the environment itself, such as the Solution Explorer. However, there is a very nice add-in from the Visual Studio Platform team that lets you theme many aspects of the IDE that you can’t normally change. Not all, unfortunately, since not everything in VS is WPF yet. But if you use this in combination with the theme settings in Windows itself, you can get pretty close to an all-dark theme in all windows. If anyone out there has success with this, please do share your exported theme settings from this add-in as well as your Windows theme settings, and I will be sure to update this article with the details.

Here’s a quick preview of some code from the AjaxControlToolkit:

DarkSide2010

Download The Dark Side of Visual Studio 2010

For older versions, see the previous article.

UPDATE: Rate the scheme on StudioStyles.info! The Dark Side of Visual Studio
