Frans Bouma's blog

Generator.CreateCoolTool();

"Cloud Cloud Cloud, if you're not in it, you're out!"... or something

Since graduating from the HIO Enschede (B.Sc. level) in '94, I have worked with a lot of different platforms and environments: from 4GLs like System Builder, uniVerse and Magic to C++ on AIX to Java to Perl on Linux to C# on .NET. All these platforms and environments had one thing in common: their creators were convinced their platform was the best, the greatest and the easiest to write software with. To some extent, each and every one of them was a decent platform and it was perfectly possible to write software with it, though I'll leave the classification of whether they were / are the greatest and easiest to the reader. I'll try to make clear below why this dull intro is important.

Yesterday I watched the live stream of the PDC '09 keynote and in general it made me feel uncomfortable but I couldn't really figure out why. This morning I realized what it was and I'll try to explain it in this blog.

Cloudy skies

If one word was used more often than anything else in the keynote it was likely the word 'cloud'. Cloud, cloud, cloud, azure, cloud, cloud, azure, cloud, azure... and so on. Perhaps it's the weather in Seattle which made Microsoft fall so in love with clouds, I don't know, but all this cloud-love made me a little uneasy. This morning I woke up and realized why: it's too foggy. You see, the whole time I was watching the keynote, I had the idea I was watching the keynote of some conference about some science I have no knowledge about whatsoever.

"Cool, another guy talking about azure clouds with yet another set of fancy UIs I've never seen, giving me the feeling that not using those is equal to 'doing it wrong', but what the heck azure clouds are and what problem they're solving is beyond me". That kind of thing.

A long line of people were summoned on stage to tell something about some great tool / framework / idea / wizardry related to clouds, and with every person I lost more of my grip on what problem they all wanted to solve. All I saw was a long line of examples of Yet Another Platform with its own set of maintenance characteristics, maintenance UIs, maintenance overhead and thus maintenance nightmares.

More UIs, more aspects of things which were apparently new to software engineering yet utterly essential to writing good software... more UIs I've never seen before, more cloudy weather, more azure flavors, more UIs I've never seen, more...

"Aaaaarrrgg!"

As I've tried to explain in the first paragraph, I've been around the block a couple of times. I have lived through internet bubbles, read McNealy's 'The Network is the computer' articles / propaganda, shook my head when I heard about Ellison's Java client desktop idea, and waded through the seas of SOA and SOA-related hype material, so I have a bit of an idea of what "Big computer with software somewhere + you" means. In this 'modern age' it's dubbed 'Cloud computing', though to me it looks like the same old idea that has been presented by various people in the past, but with new labels. With all these platforms presented in the past, there was really one issue: what was the problem they all tried to solve? Why would one want to use it? With Cloud computing, that same old issue hasn't been solved.

"I built it, you run it"

One aspect all these 'big computer with software + you' systems tried to sell was that they could run the software you wrote for you and you didn't have to worry about a thing. Well, you didn't have to worry about a lot, but you still had to worry about some things, as the system was still Yet Another Platform with its own set of characteristics, flaws and weaknesses and, most importantly, differences from the development and test environments the software was written with.

The problem with software once it is written, tested and ready for deployment is that last stage: will it run in the environment on-site the way it runs locally in the test environment? And is that on-site environment easy to maintain?

In other words: the problem is that the environment the software has to run in isn't necessarily the same as the environment the software was written with / tested in, which could cause a lot of problems during deployment and after deployment. Other aspects like updating the environment due to security flaws, bugs in software etc. are also factors which add to the overall unpleasant experience of deploying and keeping software running.

So the answer to that problem should be a system which provides the following things:

  • The environment equal to the one the software was written and tested with
  • The resources to keep the software running when the software requires them.
  • The security that the software keeps running, no matter what.

In other words: the software engineers built the software, tested it and defined the environment (as they've done that for development and testing anyway) and shipped that in one package, and at the place where the software has to run, that exact same environment is provided, together with the resources required (like memory, cpu, a database connection). So "I built it, you run it". How the environment is re-created isn't important, the important thing is that the exact same environment is provided to the software, 24/7.

Are EC2, Azure and other cloudware solving the problem?

No. They provide Yet Another Platform, but not the same environment. As they're yet another platform, you have to develop for that platform. The most typical example of that is that 'AppFabric', the newly announced application server from Microsoft, has two flavors: one for Windows and one for Azure. Why would anyone care? Isn't it totally irrelevant for a system in the 'cloud' what software (or what hardware) it is running? All that matters is that it can provide the environment the developer asked for, so the developer knows the software will run the way it was intended.

Let's look at a typical example: a website of some company with a small database to serve the pages, a small forum and some other data-driven elements, not really complex. Today, this company has to hire some webspace somewhere, database space, bandwidth and most importantly: uptime. To make the web application run online, it has to match the rules set by the hosting environment. If that's a dedicated system, someone has to make sure the system contains all software the web application depends on, that the system is secure and stays that way. If it's a shared hosting environment, the web application has to obey the ISP's rules of hosted web applications, e.g. can use 100MB memory max., can't recycle more than 2 times in an hour etc.

When Patch Tuesday arrives and the web application runs on a dedicated server (be it a VM or dedicated hardware, doesn't matter), someone has to make sure that the necessary patches are installed, and that those patches don't break the application. Backups have to be made so if disaster happens, things can be restored. These all count as 'uptime' costs.

With a VM somewhere on a big machine this doesn't change, you still have to make sure the VM offers the environment the application asks for. You still have to patch the OS if a patch for it is released, you still have to babysit the environment the application runs in or hire someone to do that for you, but it always involves manual labor to make sure the environment online is equal to the environment during development and testing.

In the whole keynote I didn't hear a single argument for how Microsoft Azure is doing this differently. Sure, I can upload some application to some server and it is run. However, not with the environment I ask for, but inside the environment Azure offers. That's a different thing, because it requires the developer to write software with Azure in mind. If I have a .NET web application running on a dedicated server which uses Oracle 10g R2 as its database and I want to 'cloudify' that web application with Azure, I can't, because I have to make all kinds of modifications: for example, I have to drop the Oracle database for something else and also make other changes, as the environment provided by Azure isn't the same as the one locally.

EC2 and other cloudware do the same thing: they all provide 'an' environment with a set of characteristics, but not your environment. So in other words, they're not solving the problem, they only add another platform to choose from when writing software. As if we didn't have enough of those already. Sure, they offer some room for scaling when it comes to resources, but what happens when the image has to reboot because a security fix has been installed? Is the application automatically moved to another OS instance? Without loss of any data in-memory, so it looks like the application just ran along fine without any hiccup?

So what's the solution? What should Cloud computing be all about instead?

It should be about environment virtualization. I give you a myapp.zip and an environment.config and you run it. And keep running it. All dependencies on software of my application, like 3rd party libraries, are enclosed in the application's image. That's not an image of an OS with the app installed, it's just the application. The environment.config file is a file which contains the description of the environment that the software wants, e.g. .NET 3.5 sp1, Oracle 10g R2 database, 2GB ram minimum, IIS7, domain name example.com registered to app, folder structure etc. etc. So I outsource any babysitting of the environment of my application.

That is incredibly complex. It might not even be doable. But it's the only way to make cloud computing something other than a new name for an old idea, despite the long list of well-known names who showed an even longer list of UIs and tools during a keynote.

Can Azure do what I described above? I honestly don't have the faintest idea, even after watching the keynote yesterday and reading up on some marketing material. That doesn't give me confidence, as it's generally not a good sign if a vendor has a hard time explaining what problem a product solves.

Posted Wednesday, November 18, 2009 11:16 AM by FransBouma | 12 comment(s)

LLBLGen Pro v3.0 sneak-peek video

I created a small video (flash movie) of a neat feature of the upcoming LLBLGen Pro v3.0 designer: creating a typed list definition from search results obtained in the designer by running a custom piece of code (C#, with Linq to objects. VB.NET is also supported)! So any query you want to run on the model meta-data is allowed.

Please click on the screenshot below to open the page with the video. You need flash to play the video. No sound included.

LLBLGen Pro v3.0 is scheduled to go beta at the end of 2009 and will support the LLBLGen Pro runtime framework, Entity Framework, Linq to Sql and NHibernate.


Update: uploaded a better html file, so the video isn't resized improperly.

Posted Monday, October 19, 2009 1:27 PM by FransBouma | 4 comment(s)

Happy 6th anniversary, LLBLGen Pro!

Today it's exactly 6 years since we released the first version of LLBLGen Pro, v1.0.2003.1, after a development period of roughly 9 months (Sunday, September 7th, 2003, late in the evening). It was a big gamble: would it succeed or fail? We got our first customer within 9 minutes after release and we then knew it would be a success. And it still is, with thousands of companies using it world-wide, from small mom & pop shops to the biggest banks on the planet. Honestly, we hoped for success, but that it took off this big was beyond our expectations. A big thank you! to all of our loyal customers who trusted our work in the past 6 years and who keep trusting it.

Needless to say, we're still going strong and are looking forward to v3.0 which is scheduled to go beta at the end of the year. It will actually be our 10th major version (1.0.2003.1, 1.0.2003.2, 1.0.2003.3, 1.0.2004.1, 1.0.2004.2, 1.0.2005.1, 2.0, 2.5, 2.6) since the initial release, and will be the first release which will support other frameworks besides our own runtime framework and will also add another major new approach: model first.

Looking back at those 6 years, I think the biggest asset we deliver is quality you can count on. From the get-go we strove for that, with top-notch support which is free and bug-fixes which are usually delivered within 24 hours. A data-access technology isn't something you just pick out of a pool of tools: it has to fit the way you want to write software and work with data and what you want to do in your application, and above all it has to be rock-solid, so you don't run into surprises, unexpected lack of support for common features or a wall of disbelief when you ask for help, support or a bugfix. So in other words, a data-access technology is one of the pillars your software has to count on. From the start we realized this, and with every feature we added we made sure that our customers could indeed count on our work and the quality we deliver.

During these 6 years, we worked full time on implementing more features, like a new paradigm (Adapter), support for more databases, multiple ways to do inheritance, more powerful code generator engines, template editor, linq provider etc. and it was and still is simply great working on this every day. On to the next 6 years!

Posted Monday, September 07, 2009 10:46 AM by FransBouma | 21 comment(s)

LLBLGen Pro and SQL Azure

LLBLGen Pro works with SQL Azure, that is, the generated code and the runtime library do. There are a couple of things you should be aware of, and I'll list them briefly below. The thing which doesn't work is creating a project from a SQL Azure database, as SQL Azure has no meta-data tables publicly available to the connected user (also a reason why, for example, SQL Server Management Studio doesn't work with SQL Azure at the moment).

When you want to work with SQL Azure and LLBLGen Pro, the things to be aware of are the following:

  • SQL Azure doesn't support catalog names in the queries. Although LLBLGen Pro supports multiple catalogs per project, and thus cross-catalog queries, with SQL Azure you can only use one catalog in your project.
  • To avoid catalog names in the queries, you should use the feature called 'Catalog Name Overwriting', which simply means that you configure the runtime to use a different string than the catalog name. You should configure the runtime to overwrite the catalog name of your project to "", so the catalog name is not emitted into the SQL query.
  • Our tests and those performed by some of our customers showed that if you use a schema which isn't the default schema, it also seems to make SQL Azure throw errors. So to be safe, either use 'dbo' as the schema, or if you must: define the used schema as the default schema of your user using:
    ALTER USER username WITH DEFAULT_SCHEMA = schemaname

That's it. If you make sure of the above, which is just a couple of simple things to check, you can use LLBLGen Pro generated code on SQL Azure. Happy azuring!

Posted Saturday, September 05, 2009 3:48 PM by FransBouma | 1 comment(s)

I'm now also on Twitter

Direct profile url: http://twitter.com/FransBouma

I don't promise to follow everybody, but for the few people who want to follow what I have to say, I'll try to use it for shorter blurbs than this blog, as my blogposts here seem to be pretty big (and time consuming to write) overall.

Posted Wednesday, August 19, 2009 6:16 PM by FransBouma | 3 comment(s)

Think first, 'doing' is for later

In the comments section of Ayende's blog, I recently debated the usage of principles like the ones in SOLID and argued that these principles aren't really the important thing to focus on. Instead, people should focus on thinking. In the Netherlands we have an old saying: "Bezint eer ge begint", which translated to English is something like "Think everything through before you start". Now, before I wake up the anti-Waterfall people, I'd like to add that this post isn't about Waterfall at all. Instead, I'd like to outline how I write my software and how thinking is an essential part of every step I take in the whole process, and illustrate it with an example which hopefully shows that some extra time spent on the thought process before writing any code is very valuable.

The main thing to understand about this post is that it's not so much a guide to designing a whole system as it is about how to approach and successfully implement features. Features are pieces of functionality the software needs to have at various abstraction levels. TDD people might describe them as user stories; I stick with 'feature' because I don't use TDD, so I use a different name to avoid confusion. The example feature I'd like to use in this case is the following: say you are working on an O/R mapper designer and you have the ability to define value types. Value types are types which have one or more attributes (fields) but don't have an identity of their own like Entities do.

A good example is 'Address', with the fields 'StreetName', 'City', 'Zipcode' and 'HouseNumber', but you could also define value types with just one field, like 'EmailAddress' with the field 'Value'. A value type with just one field can be useful if you want to place logic related to that single field (like in this case the check if the value for the email address is valid) with that field and do that just once: every time you now need to define an EmailAddress field in an entity, you can set its type to 'EmailAddress' instead of 'string' and validation is built in, as you already did that once for the value type 'EmailAddress'.
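To make the single-field value type a bit more tangible, here is a minimal sketch in C#. The class name follows the example above, but the members and the deliberately naive validation rule (exactly one '@' with text on both sides) are assumptions made for illustration, not how any particular designer or framework implements it.

```csharp
using System;

// Hypothetical single-field value type: it has no identity of its own,
// and the validation logic for the field lives here, written just once.
public sealed class EmailAddress : IEquatable<EmailAddress>
{
    public string Value { get; }

    public EmailAddress(string value)
    {
        if (!IsValid(value))
        {
            throw new ArgumentException("Not a valid email address.", nameof(value));
        }
        Value = value;
    }

    // Naive check, for illustration only: exactly one '@' with text on both sides.
    public static bool IsValid(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            return false;
        }
        int at = value.IndexOf('@');
        return at > 0 && at == value.LastIndexOf('@') && at < value.Length - 1;
    }

    // Value types compare by content, not by identity.
    public bool Equals(EmailAddress other)
    {
        return other != null && Value == other.Value;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as EmailAddress);
    }

    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }
}
```

An entity field typed as EmailAddress instead of string then gets the check for free, because an invalid instance can't be constructed in the first place.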

This feature will give a couple of problems, one of which is pretty complex: if you add the ability to add new fields to a value type definition, you run the risk of creating an infinite loop: a field in Address which is also of type Address. Or worse, you could have a field Foo inside Address whose type in turn has a field of type Address. How do you prevent that from happening? I'll show you how to solve this in a general way and also how to get there.

Let's go back to our feature: adding a field with a valid type to a value type. Which steps do we take to add this feature successfully? They are the same steps I take with every feature; I've summed them up briefly in the list below:

  1. Think first. 'Doing' is for later. Think everything through, use reasoning to learn more about the feature, try to think of all possible problems related to the feature and possible solutions.
  2. Analyze what you need based on step 1. Try to find generic, proven algorithms and data-structures which might be what you need, and build on top of that.
  3. Document the results of step 1 and your analysis result from step 2: write which alternatives you've investigated and which one you've picked and most importantly, why.
  4. Prove your algorithms first. This is not that hard and doesn't require any code.
  5. Implement algorithms / data-structures as designed. Then check whether you implemented the algorithm correctly. This is doable by simply looking at what you wrote. How you implement it, is dictated by the algorithm: which steps are there to take.
  6. Test the implementation of what you wrote. As the algorithm is already proven, the tests prove the implementation. If you want, you can also start with this before 'Implement algorithms' by implementing it using mocking and tests. That's not the point, the point is that the code is the end-result, not the start.

I gave them numbers in the list above to easily refer to them. Let's address them one by one and along the way see how our example plays a part in this.

Step 1: Think first

This might sound like plain common sense, but it's the most important step. You should spend considerable time in this step, thinking things through: what is the feature all about? Which side effects can be recognized? Should you split it up in various sub-features/parts? If you don't feel comfortable with what you have discovered during this step, for example because you don't really have a good feeling about how to approach it further, you shouldn't proceed yet. This step is the foundation of what you will do in future steps. What's also important to notice: no coding. This is all thinking, not doing. The "'doing' is for later" part isn't there for nothing. The biggest mistake people can make is to act before thinking things through. You'll see why in a bit.

In the feature at hand, adding a field to a value type, we can discover several things: we have to validate on the name within the value type, the field has to be correct by itself (e.g. has to have a valid type, name etc.). We also recognize the necessity of avoiding an infinite loop within a value type as described above: we can't add a field F to value type VT if by adding that field, an infinite loop is created inside VT, or indirectly in a containing type of VT. A field can get a different type, which also should avoid this infinite loop. We can also recognize the requirement that when we add a field to a value type VT, this field is then also required to be mapped onto a target, but for the sake of simplicity, we don't go further in the mapping scenarios here.

I also leave a notification system for other subsystems and undo/redo outside the scope of this post, but the requirements for these aspects are the same: they too have to be thought through. How do we solve them, and have we solved them before already? If so, how did we do that? An experienced software engineer often solves things in the same way, as s/he already knows what to do when different kinds of small problems occur. Software engineers who aren't that experienced are often faced with these decisions and don't really know what to do. If you find yourself in such a position, don't be arrogant and deny it, but simply take the steps of thinking it through, and try to see if (based on theory!) you have made previous attempts to solve the same problem elsewhere.

After thinking the feature through, we look at our findings, and see that for the most part there's some straight-forward validation needed for the name and the correctness of the field and also a complex validation for the infinite loop protection. As we have thought about our feature and the implications, we can proceed with the next step. Though, don't have the illusion that the thinking is now done, the next steps still require a lot of thought.

Step 2: Analyze what you need

When analyzing the results of step 1, we see there are three kinds of validation we have to solve:

  a) Validation of the field itself (does it have mandatory properties set?)
  b) Validation of the field within the value type it is added to (does it have a unique name within the value type?)
  c) Validation across all value types (does it create directly or indirectly an infinite loop?).

The validation for a) is straight forward, and we have several standard approaches for that: add a validation method to the class or a property itself which signals that the field is valid, and internally the field maintains the value for this by checking itself whenever it changes. There might be others as well, all have their strong points and weak points. As this is a common problem, it's likely that in an earlier feature it already has been solved, so we can leverage that analysis and see if it applies here too.
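Once the thinking is done, one of the standard approaches mentioned above (a property which signals validity, re-checked whenever the field changes) could end up looking like the sketch below. The class and member names are hypothetical, and 'mandatory properties' is assumed here to mean just a name and a type name.

```csharp
using System;

// Hypothetical field definition; all names are illustrative only.
public class FieldDefinition
{
    private string _name;
    private string _typeName;

    // Signals whether the field is valid by itself.
    public bool IsValid { get; private set; }

    public string Name
    {
        get { return _name; }
        set { _name = value; Revalidate(); }
    }

    public string TypeName
    {
        get { return _typeName; }
        set { _typeName = value; Revalidate(); }
    }

    // The field checks itself whenever it changes: both mandatory properties must be set.
    private void Revalidate()
    {
        IsValid = !string.IsNullOrWhiteSpace(_name) && !string.IsNullOrWhiteSpace(_typeName);
    }
}
```

The same pattern could be applied one level up, with the value type checking that the names of its fields are unique.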

The validation for b) is similar to a), we simply apply it at the value type level instead of at the field level. We can re-use our analysis results and verify whether they apply here too. Validation for c) is something else though: how to approach this? It depends on what you like most: visual approaches on a white board, or reasoning in theory without images. The key is that you avoid writing code. This is perhaps going to be considered a bad thing in some people's eyes, but it's key here: writing code makes you fall into the trap of thinking that the code is what you are trying to express in executable form. It's not, because you haven't yet thought through what you want to express (you haven't decided what algorithm / data-structure to use!), so you can't possibly write code which represents that.

The problem of c) looks pretty straight-forward at first, until you discover that it is actually less simple because you have to investigate a lot of different paths and different situations: the value type VT1 you add the field to might be inside another value type VT2, which is inside another value type VT3, so the field to add isn't allowed to be of type VT1, VT2 or VT3. Finding all those paths requires an algorithm which keeps track of all kinds of things, which makes it less easy to implement, maintain and test. So what's there to do?

We first step away from the idea that we're the first to solve this kind of problem. To be able to find a general theoretic solution which is already proven to be correct, we have to make our own problem more generic and realize that others before us have already solved this same problem with theory (not code! This isn't about copying pieces of code from Google, it's about re-using theory). In this case, we have to analyze when an infinite loop occurs in our situation, in short: what are the criteria for that? It comes down to: which types should a field F not have if we add it to value type VT1? Obviously VT1, but which value types should we also avoid? If you draw a picture of this on the white board, you'll see that every value type known in the system (so every value type the field F can have) which is directly or indirectly referring to VT1 is not allowed, as setting the type of F to one of these value types will automatically create an infinite loop.

If you don't see it directly, draw a picture on a piece of paper or whiteboard of the following: VT3 points to VT2. VT2 points to VT1. VT4 also points to VT2. VT5 points to VT6. How to find the value types referring to VT1 directly or indirectly? Look at your drawing on the whiteboard. It will likely look like a graph, a directed graph. Graphs are one of the most important data-structures you'll need in software engineering. One of the key aspects of graphs which makes them so great is that for graphs (directed, non-directed) a lot of algorithms have been discovered and described (and proven to be correct) which we can re-use without any effort. The nice thing about these algorithms is that they often solve problems we face every day (like ordering elements which are related to each other, or this one: which paths are there?).

The algorithm we need here is Transitive Closure. Transitive closure gives all pairs of vertices ('nodes') in the graph which are directly or indirectly connected to each other. So in our graph example above, it will give VT3 - VT2, VT3 - VT1, VT2 - VT1, VT4 - VT2, VT4 - VT1, VT5 - VT6. This means that I can travel from VT3 to VT2 and VT1, from VT2 to VT1, from VT4 to VT2 and VT1 and from VT5 to VT6. This also means that all value types from which I can travel to VT1 are not allowed, as these all refer directly or indirectly to VT1! This leaves us with VT5 and VT6, which are the only value types allowed for a field F added to VT1.

We're not done yet. If you dare to look at the wikipedia page I linked to above, you'll likely stall at the complex math formulas you're faced with. Don't worry, these are part of the mathematical background of the concept. As we're software engineers, we need to look for theory about an algorithm which describes transitive closure of a graph G so we can implement it. As this is a problem which has been solved before us, some very clever people have already come up with what we're looking for: there's a very efficient algorithm designed and proven for transitive closure: the Floyd-Warshall algorithm.

You might now wonder how you would have gotten to the same conclusion. One of the most important steps is to avoid home-brew algorithms unless you really really have to. A typical bad habit of software engineers (and don't worry, I also still make that same mistake once in a while) is that they think they're the first to face a problem and that nothing generic has been discovered yet which could possibly help them out, so they have to cook something up themselves: the home-brew algorithm. The downside of home-brew algorithms is that they're not yet proven to be correct. A general algorithm like the Floyd-Warshall algorithm is proven to be correct, so we can skip that important step: we don't have to think about whether it will work with whatever graph we throw at it, it will. We just have to type in the three nested loops and we're done.
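The whiteboard example can be checked with a small, self-contained sketch. It is not the designer's actual code: the class and method names are hypothetical, value types are plain strings, and the closure is computed with the three nested loops of the Warshall scheme over a set of (from, to) pairs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ValueTypeGraphExample
{
    // Edges from the drawing: "A points to B" means A has a field of type B.
    public static readonly (string From, string To)[] Edges =
    {
        ("VT3", "VT2"), ("VT2", "VT1"), ("VT4", "VT2"), ("VT5", "VT6")
    };

    // Returns all pairs (a, b) for which b can be reached from a, directly or indirectly.
    public static HashSet<(string From, string To)> TransitiveClosure(
        IEnumerable<(string From, string To)> edges)
    {
        var closure = new HashSet<(string From, string To)>(edges);
        var vertices = closure.SelectMany(e => new[] { e.From, e.To }).Distinct().ToList();

        foreach (var i in vertices)          // i is the intermediate vertex
        {
            foreach (var j in vertices)
            {
                foreach (var k in vertices)
                {
                    if (closure.Contains((j, i)) && closure.Contains((i, k)))
                    {
                        closure.Add((j, k));
                    }
                }
            }
        }
        return closure;
    }

    // A field added to 'target' may have any known type except 'target' itself
    // and every type from which 'target' can be reached.
    public static List<string> AllowedFieldTypes(
        string target, IEnumerable<(string From, string To)> edges)
    {
        var edgeList = edges.ToList();
        var closure = TransitiveClosure(edgeList);
        return edgeList.SelectMany(e => new[] { e.From, e.To })
                       .Distinct()
                       .Where(v => v != target && !closure.Contains((v, target)))
                       .OrderBy(v => v, StringComparer.Ordinal)
                       .ToList();
    }
}
```

Calling AllowedFieldTypes("VT1", ValueTypeGraphExample.Edges) on the example graph yields VT5 and VT6, matching the reasoning above.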

Now that we've done our analysis and have come up with good candidates of how to solve the problems we faced with this feature, we can proceed with the next step.

Step 3: Document the results

Software engineers tend to dislike documenting their work, but it's very important that we document our thought process, and especially why we took decision A and not B. The point of documenting the design decisions, and which alternatives were not chosen, shows when we face a similar choice again. Say that in two years' time we have to alter this piece of code and look up why the feature was designed with a graph and a Transitive Closure algorithm: we will learn from the design decisions that the graph approach was a good one because it was a proven path. We don't have to worry about whether the algorithm gives us all the paths we are interested in, it will. So we can keep that implementation and don't have to worry about alternatives being better: we can learn from the documentation we made that the alternatives are not better. The documentation for this feature doesn't have to take ages to write, it might even be half a page, as long as it contains the information that explains why which alternative was chosen.

After documenting our findings from the first two steps, we proceed with the next step.

Step 4: Prove your algorithms first

Here we'll see that what we have invested in pays off. We use a proven algorithm so we don't have to do any work: it is already proven to be correct. Had we chosen an algorithm we designed ourselves, we would have to prove that it works. This can be a bit time consuming and, I'll admit, boring, but it is worth it. One of the key aspects of proving an algorithm is that you think each step through and define the pre-/post conditions and what will make it go wrong, and perhaps you'll even see flaws in the algorithm and have to start over. It's easier to do this without code, because code can contain bugs due to bad implementation. If you write tests for your code (or start with tests when implementing code), your tests not only test the implementation but also the algorithm. If the test fails, is that due to an algorithm error or due to an implementation error or both? If you prove your algorithms first, you know it's an implementation error.

Some say that this step is not doable or it's even worthless. However it's easier than you might think: write pre/post conditions down for each step, and reason about the algorithm you designed, when will each step break and why not? It doesn't have to result in hard-core math, it's often already enough to think through each step of the algorithm what the pre/post conditions are and when a step will fail to spot problems.

As we've already proven our algorithm by pointing to the work of others (Floyd and Warshall) we can move on to the next step.

Step 5: Implement algorithms / data-structures as designed

This is the step where we actually write code. People using TDD will likely combine this step with the next one by writing tests first and using mocking to work towards a working implementation, but that's really semantics. In step 2, we decided to use a directed graph algorithm, so we need a graph data-structure which can handle directed edges. Writing a graph class is straightforward; the only decision you have to make is how to store which vertices are connected: in adjacency lists or in an adjacency matrix. Both have strong and weak points, which you can learn more about by reading the linked Wikipedia articles. You can also decide to use a prefab graph class, as there are several graph classes already written for .NET; it's up to you.
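To make the storage decision a bit more tangible, here is a minimal sketch of both options. The class names AdjacencyListGraphSketch and AdjacencyMatrixGraphSketch and their members are hypothetical and only meant to show the trade-off; a real graph class needs more (vertex removal, edge objects and so on).

```csharp
using System;
using System.Collections.Generic;

// Adjacency lists: one set of successors per vertex. Memory grows with the
// number of edges, which suits sparse graphs.
public sealed class AdjacencyListGraphSketch<TVertex>
{
    private readonly Dictionary<TVertex, HashSet<TVertex>> _successors =
        new Dictionary<TVertex, HashSet<TVertex>>();

    public IEnumerable<TVertex> Vertices { get { return _successors.Keys; } }

    public void AddEdge(TVertex source, TVertex target)
    {
        if (!_successors.TryGetValue(source, out var targets))
        {
            targets = new HashSet<TVertex>();
            _successors.Add(source, targets);
        }
        targets.Add(target);
        // Make sure the target is known as a vertex too, even without outgoing edges.
        if (!_successors.ContainsKey(target))
            _successors.Add(target, new HashSet<TVertex>());
    }

    public bool ContainsEdge(TVertex source, TVertex target)
    {
        return _successors.TryGetValue(source, out var targets) && targets.Contains(target);
    }
}

// Adjacency matrix: n x n booleans for vertices numbered 0..n-1. An edge lookup
// is a single array access, but memory is always n * n.
public sealed class AdjacencyMatrixGraphSketch
{
    private readonly bool[,] _edges;

    public AdjacencyMatrixGraphSketch(int vertexCount)
    {
        _edges = new bool[vertexCount, vertexCount];
    }

    public void AddEdge(int source, int target) { _edges[source, target] = true; }

    public bool ContainsEdge(int source, int target) { return _edges[source, target]; }
}
```

The list variant works with any vertex type which can be used as a dictionary key, while the matrix variant requires you to number the vertices first.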

Once we have the graph class, we can implement the Transitive Closure algorithm. As this algorithm is really about three nested loops, it's straightforward. Below is the Transitive Closure implementation of Algorithmia's DirectedGraph class:

/// <summary>
/// Returns the transitive closure of this graph using the Floyd-Warshall algorithm.
/// See http://en.wikipedia.org/wiki/Transitive_closure and http://en.wikipedia.org/wiki/Floyd-Warshall_algorithm.
/// </summary>
/// <returns>The transitive closure of this graph.</returns>
public DirectedGraph<TVertex, TEdge> TransitiveClosure()
{
    DirectedGraph<TVertex, TEdge> result = new DirectedGraph<TVertex, TEdge>(this, this.EdgeProducerFunc);
    if(this.EdgeProducerFunc == null)
    {
        throw new InvalidOperationException("The EdgeProducerFunc isn't set to a value.");
    }

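    // i is the intermediate vertex: for every pair (j, k), check whether a path j -> i -> k exists.
    foreach(TVertex i in this.Vertices)
    {
        foreach(TVertex j in this.Vertices)
        {
            foreach(TVertex k in this.Vertices)
            {
                // Pairs which start or end at the intermediate vertex itself can't yield a new edge.
                if(!j.Equals(i) && !k.Equals(i))
                {
                    // j reaches i and i reaches k, so j reaches k: add that edge if it's not there yet.
                    if(result.ContainsEdge(j, i) && result.ContainsEdge(i, k) && !result.ContainsEdge(j, k))
                    {
                        result.Add(this.EdgeProducerFunc(j, k));
                    }
                }
            }
        }
    }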
    return result;
}

Is this implementation correct? We can of course test this with some unit tests. In addition to these tests, we can check whether we have indeed implemented everything correctly by taking each step of our algorithm and looking at the code to see whether we made a proper projection of that step to its executable form: the code. If not, we have to change the code, not the algorithm. Don't think lightly of this and be careful not to cut corners by 'assuming' it is OK. A human isn't a very good source code interpreter, but the developer of the code should at least try hard to check whether the implementation is indeed what it should be. That alone will catch obvious bugs and mistakes: as the algorithm is correct, we only have to worry about the implementation.

There are several possible implementations of a given algorithm A. Which one you end up with depends on how you interpret each part of the algorithm and how you think it's best to implement these parts. This might cause problems, but these are caught by this step as well, because you have to verify that what you wrote is indeed what you should have written. This is also the place where code reviews come into the picture and the reason why they work: other people look into how exactly an algorithm is implemented, and as each reviewer makes their own projection from the algorithm to code, any difference between what they would have done themselves and what they are reviewing will trigger discussion and will in the end result in better code.

One might wonder if this is really worth the effort. Yes it is. The exact same algorithm implementation above for example is also used to find inheritance loops when inheritance is defined between entities (as it's in fact the same problem) and you can solve other problems with the same algorithm as well. Furthermore, the more time you spend on making sure the code is actually of high quality and correct, the less time you'll spend on maintenance later on or bug-fixing, as the code and more importantly, the reasoning behind it, is well understood and debated, analyzed, documented and considered the best alternative of the list of possible options.
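To illustrate why inheritance loops are the same problem: a type is part of a loop exactly when the transitive closure says the type can reach itself. The sketch below shows that idea on a boolean matrix; the class InheritanceLoopSketch, the (Sub, Super) tuples and the type names in the usage example are hypothetical and this is not the LLBLGen Pro implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InheritanceLoopSketch
{
    // Each edge means "Sub inherits from Super". Returns all types which are part of a loop.
    public static ISet<string> FindTypesInLoops(IEnumerable<(string Sub, string Super)> edges)
    {
        var edgeList = edges.ToList();

        // Number the type names so they can be used as matrix indexes.
        var index = new Dictionary<string, int>();
        foreach (var edge in edgeList)
        {
            if (!index.ContainsKey(edge.Sub)) index.Add(edge.Sub, index.Count);
            if (!index.ContainsKey(edge.Super)) index.Add(edge.Super, index.Count);
        }

        int n = index.Count;
        var reach = new bool[n, n];
        foreach (var edge in edgeList)
            reach[index[edge.Sub], index[edge.Super]] = true;

        // Transitive closure (Warshall): 'via' is the intermediate vertex.
        for (int via = 0; via < n; via++)
            for (int source = 0; source < n; source++)
                for (int target = 0; target < n; target++)
                    if (reach[source, via] && reach[via, target])
                        reach[source, target] = true;

        // A type which can reach itself is part of an inheritance loop.
        return new HashSet<string>(index.Keys.Where(name => reach[index[name], index[name]]));
    }
}
```

With the edges Manager : Employee, Employee : Person, Person : Manager and Customer : Contact, the method reports Manager, Employee and Person and leaves Customer and Contact alone.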

For this particular feature we picked a data-structure and algorithm which were already well-known. As we've implemented them in our code-base, we can re-use these general building blocks whenever we need to solve similar problems. The advantage of using well-known data-structures and algorithms is that they're not subject to change because they don't evolve: their definition is well-formed and the problems they solve are well defined. It's therefore safer to base your own code on precisely these kinds of building blocks than on home-brew data-structures and algorithms. Always try to use general, well-known data-structures and algorithms. The more the better, and once implemented, you can re-use them without worry: a transitive closure is always a transitive closure. A topological sort is always a topological sort and so on.

Let's move on to the final step.

Step 6: Test your implementation

If you're using TDD, you likely already have the tests written as part of the previous step. If you're not using TDD, you should now test your implementations of the parts in the previous step. As the algorithms and data-structures are already proven to work, we can focus on whether we have written the right code to implement them. Because of the work in the previous steps, we have in-depth knowledge of how things should work, what each step in the algorithm should do, and thus what happens in each step of the feature. This leads to tests which focus on the cases where we can expect problems, for example around the pre/post conditions we found. Keep in mind that in our example there is an unlimited number of graphs to test, and you can't write tests for every single one of them. Using proven algorithms helps tremendously in this case, as the proof of the algorithm frees you from the quest to test every possible input. Let me violate DRY and repeat myself here: Always try to use general, well-known data-structures and algorithms. The more the better.
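You can't test every graph, but for very small graphs you can test all of them. The sketch below compares a transitive closure implementation against an independent reachability check for every directed graph on a handful of vertices. The class ClosureTestSketch and its members are made up for this illustration and a real project would wrap the comparison in its unit test framework of choice; it complements the proof, it doesn't replace it.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureTestSketch
{
    // Implementation under test: Warshall's algorithm on a boolean adjacency matrix.
    public static bool[,] TransitiveClosure(bool[,] adjacency)
    {
        int n = adjacency.GetLength(0);
        var reach = (bool[,])adjacency.Clone();
        for (int via = 0; via < n; via++)
            for (int source = 0; source < n; source++)
                for (int target = 0; target < n; target++)
                    if (reach[source, via] && reach[via, target])
                        reach[source, target] = true;
        return reach;
    }

    // Independent oracle: is target reachable from source over a path of at least one edge?
    public static bool IsReachable(bool[,] adjacency, int source, int target)
    {
        int n = adjacency.GetLength(0);
        var visited = new bool[n];
        var pending = new Stack<int>();
        pending.Push(source);
        while (pending.Count > 0)
        {
            int current = pending.Pop();
            for (int next = 0; next < n; next++)
            {
                if (!adjacency[current, next]) continue;
                if (next == target) return true;
                if (!visited[next])
                {
                    visited[next] = true;
                    pending.Push(next);
                }
            }
        }
        return false;
    }

    // Counts the graphs on vertexCount vertices for which closure and oracle disagree.
    // Keep vertexCount small: there are 2^(vertexCount * vertexCount) graphs to enumerate.
    public static int CountMismatches(int vertexCount)
    {
        int cells = vertexCount * vertexCount;
        int mismatches = 0;
        for (int mask = 0; mask < (1 << cells); mask++)
        {
            // Every bit of mask decides whether one particular edge is present.
            var adjacency = new bool[vertexCount, vertexCount];
            for (int cell = 0; cell < cells; cell++)
                adjacency[cell / vertexCount, cell % vertexCount] = (mask & (1 << cell)) != 0;

            var closure = TransitiveClosure(adjacency);
            bool matches = true;
            for (int source = 0; source < vertexCount; source++)
                for (int target = 0; target < vertexCount; target++)
                    if (closure[source, target] != IsReachable(adjacency, source, target))
                        matches = false;
            if (!matches) mismatches++;
        }
        return mismatches;
    }
}
```

For three vertices this enumerates 512 graphs, which includes the edge cases you'd otherwise have to think of yourself: the empty graph, self-loops and cycles.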

When our tests succeed, we can be pretty confident that our feature works. The way we implemented it from start to end has given us a lot of valuable assets: we have documented design decisions for later use, for ourselves but also for others who have to maintain our work; we have written generic data-structure and algorithm classes of proven algorithms which we can re-use in a lot of other situations; and we have a working feature which is based on theory, where the reason why the code is the way it is can be traced back to the theory and information we formulated and collected in the early steps.

Conclusion

In this post I tried to shed some light on a different form of software engineering, one which uses a thought process to base code on a solid theoretical foundation. I deliberately avoided any pattern, fancy methodology or principle like the ones from SOLID, because I wanted to focus on the thought process behind writing code: why we write the code we write, and that this process doesn't start within a code editor but ends there and starts somewhere else. I have tried to illustrate how I write my software on a daily basis and hope it is valuable to others.

Further reading

For the people interested in reading more about algorithms and data-structures, I've compiled the small list of links below.

Posted Sunday, July 26, 2009 5:44 PM by FransBouma | 17 comment(s)

Follow-up on the 'Firefox v3.5 fiasco'

(Follow up to: The Firefox 3.5 fiasco)

I'd like to inform the audience that the people over at NSS, the sub-system which is responsible for the disk-thrashing behavior of Firefox 3.5 (and the accompanying delays on startup) on some systems, have worked on a fix for this which appears to be scheduled for FF 3.5.1. You can read the discussion by starting here (which lands in the middle of the bug comments, but the comments above the one linked are basically bickering over what to do about the symptoms instead of really fixing it at the root).

It's good to know that the NSS folks finally listen in and will use CryptGenRandom when available (it's a Windows subsystem method) and will only revert to disk-based entropy collecting when CryptGenRandom isn't believed to be as solid as it is on 'modern' OS-es like Windows XP and up. I still think that MS has patched Win2k's kernel code enough to make CryptGenRandom (which is essential for the TCP stack as well) solid enough, but it's their call (IMHO, people should make choices based on evidence-based arguments, but as Win2K is rather old and no longer supported by MS anyway, it's not such a big deal).

Let's see whether this patch will turn out to be as good as it looks today. I'm glad Mozilla is keen on fixing this pronto, as FF 3.0 is scheduled to be non-supported software starting in January 2010.

So what can we learn from all of this as developers? In my opinion, the true lesson to learn here is that no-one is perfect and that it's key to keep listening to what our users experience when using the software we wrote, so problems can be solved better and choke points can be dealt with. It's all too easy to simply close our eyes and ignore problems reported by perhaps a minority of the users, cooking up excuses for not dealing with them, but that's not the solution: the problems won't go away by ignoring them, and the vocal minority might actually represent a big non-vocal group or worse: a big non-vocal group of ex-users. That's not to say that every problem ever reported by a user should immediately be fixed: unless you have unlimited time and resources, that's practically not doable. But we should at least try to investigate whether these reports might cover bigger problems which affect bigger groups than a sole individual.

Posted Saturday, July 11, 2009 11:02 AM by FransBouma | 10 comment(s)

The Firefox 3.5 fiasco

(updated: replaced 'trashing' with 'thrashing' as indeed, I meant 'disk thrashing').

As a Firefox user, I was delighted when Mozilla released Firefox v3.5. It was advertised as a new milestone in browsing, with more standards being supported, new engines for javascript and web content rendering and the intarweb would appear to me faster than ever before. As a person who has a bit of a blind spot for marketing and everything related I was a tad skeptical, but I thought "What the heck!".

So I went ahead and downloaded the installer at release day and after fighting with the usual plug-in upgrade mess, I was able to run the browser for the first time and lo and behold, the web felt like I was back in 1994, when no-one but the Real Geeks had web sites and everything was lightning fast. Life was good.

The next day, with a fresh cup of coffee in my hand I started my beloved Firefox 3.5 browser on my freshly booted system. I was expecting to see the browser dialog within seconds to re-experience the web at light speed, but nothing happened. Well, something did happen, my PC's hard-disk was busy like I was running three virus-scan sessions at the same time. After 35 seconds or so, it finally managed to find all the bits and pieces it apparently needed and showed me the familiar face of the Firefox browser dialog and I was on my way to the outside world!

Suddenly, a small, screechy voice in the back of my head tried to make a point. That voice, which sometimes cries through a developer's head when s/he writes a piece of code which isn't in the format the voice owner likes, at which point it desperately tries to convince you to do something else instead by giving unwanted advice like "Wouldn't it be better if... " and other lovely comments no-one wants to hear, that voice made a remark in its usual dull way about those 35-something seconds before the browser really started. As with similar occasions, I didn't pay much attention to it. Every Firefox instance I started after that was lightning fast and showed up in 2 seconds or even less, so it must have been something else which had caused the delay at startup. I know, it didn't sound very convincing the first time either.

That afternoon, I had no browser window open and started a new one. Again I was rewarded with a long pause, disk crunching and a blank screen, until 30-35 seconds had passed and Firefox 3.5 was awake and ready for duty. "Hmmm..." I thought. Voicy in the back of my head was awake again too, with random babbling about "Told you so", "I'm not gonna repeat myself" and similar wisdom, the usual. Could it be that Windows or some service caused all this disk thrashing and the delay? More and more I got the feeling Voicy was right (I hate that feeling) and there was something fishy about all this.

I didn't want to wait 35 seconds every time I started a browser, so I wondered what to do. Then I realized I was an end-user of this application, this browser. And what do end-users do? That's right, they go to the support offering of the vendor. Mozilla has a nice forum system so I searched it a bit to see if fellow Firefox users had similar delays. Well... you could say... yes indeed. And not only delays of 30-40 seconds but some had to wait minutes or even worse: after starting Firefox, it went into a coma and never truly woke up. Mozilla also found out that more and more people had the same problem and added a sticky thread to their forum. You can reach it here.

That forum thread revealed what the true cause was of this disk thrashing and delay at startup. I have to warn you though. If you're a developer, your software engineering fire will die a little when you read the true cause and from then on you will have to fight off thoughts of giving up development altogether and applying for a job in marketing or HR. So what was it, what's the cause of this slowness? It's NSS. What? Network Security Services. It turns out that NSS needs to do all kinds of encryption and other security related tasks (which seems kind of logical), and for that it needs random numbers. Sounds reasonable, right? Well, it kind of does.

True random numbers are hard to produce, because in a computer system nothing is really random: it all is a result of some action which was a result of some action etc. etc. The clever boys and girls of the NSS team had to crack this problem: how to get 'true' random numbers which are as random as possible? Instead of using the randomization functionality of the underlying operating system (which has this feature built-in, as every TCP stack for example needs it), they did what Mozilla in general always does: they re-invented the wheel. Nothing against re-inventing stuff, don't get me wrong, not every wheel is equal to the next one, and you can never have enough good, re-invented, shiny wheels. Though, the downside of re-inventing wheels is that along the way you can't make mistakes: your wheel has to be better than the previously invented wheels. No-one wants to use your square new wheel for example.

To solve the problem of the randomization, the NSS team came up with something clever, something so great, that no-one else had ever thought of that before: they decided to read the files in all possible temp folders on disk with multiple threads so these files can be used as seeds for the randomization. Brilliant. Temp folders! Why hasn't anyone else thought of using a disk-based resource for random number generation! I mean, these folders change every couple of milliseconds, have immediate access, no latency to read their contents and are never filled to the brim with useless cruft!

That is, if you're on the NSS team. In the outside world, things are a tad different. You see, Firefox v3.5 reads the Internet Explorer Cache and the central Windows temp folder in your user profile, through its NSS subsystem. Not only is it, in my humble opinion, not done to read another application's caches or temp folders, it's also amazingly ignorant towards the real bottlenecks of our modern computers: hard-drives. If you're using a virus-scanner which is set to paranoia mode, this whole temp folder traversal by NSS will be even slower because every file accessed will be scanned by the virus scanner. Over and over and over again. And what happens if the user doesn't do anything else but browse with Firefox, so these temp folders will not change (or are empty)? Isn't using file reading the worst way to obtain a seed for randomization?
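For contrast, this is roughly what asking the operating system for random bytes looks like from managed code. It's a sketch in C# using .NET's RandomNumberGenerator class, which is backed by the cryptographic generator of the operating system; NSS itself is written in C and the class name OsEntropySketch and the method GetSeed are made up for this illustration.

```csharp
using System;
using System.Security.Cryptography;

public static class OsEntropySketch
{
    // Returns byteCount random bytes obtained from the platform's cryptographic generator.
    // No files are read and no temp folders are traversed.
    public static byte[] GetSeed(int byteCount)
    {
        var seed = new byte[byteCount];
        using (var generator = RandomNumberGenerator.Create())
        {
            generator.GetBytes(seed);
        }
        return seed;
    }
}
```

A call like OsEntropySketch.GetSeed(32) returns immediately, no matter how full the temp folders are.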

I used Sysinternals' Filemon tool to check which folders and files Firefox was reading and along the way I also saw that it reads all fonts up front. All of them. That too seems rather odd for a browser that claims to be the fastest browser. How many fonts do you need on a random (pun intended) webpage, besides the default ones and a few common ones? 2, 3? Would it hurt anyone if these were read 'on the fly'? Not compared to the delay in startup time for a browser dialog when you have many fonts installed.

NSS is open source, but it's not something you can fix yourself, unless you compile the browser as well. The problem is that NSS is a security component and therefore needs to be signed by Mozilla to be used in Firefox. This means that recompiling the NSS dlls won't work, Firefox won't accept them (which is logical, it's a vital part of the security system, heck it is the security system!). Though, why should I even bother? It's 2009, for crying out loud. After 15 years of web-browser development, the human race should have produced a web-browser by now which is worth using, without silly startup delays which last minutes or even longer. After all, in this case, I'm an end-user.

Mozilla says on their forum that 'a' developer is working on a fix and they 'hope' that this developer is able to fix it. That doesn't sound very promising to me. This is a top priority issue, Mozilla, unless you want droves of people to drop your browser for the competition. There's already a fix available: Firefox v3.0 didn't do this disk thrashing and is able to communicate securely over the internet, at least that's what you always told your users. In other words: the NSS version in Firefox 3.0 was capable of creating random numbers and doing encryption without the necessity of reading a competing browser's disk cache or the OS' temp folders. In case you wonder, Mozilla: no, I'm not going to advise friends and family members to use Firefox 3.5 anymore till this is fixed. Not that you or your droves of developers will lose any sleep over that, at least I hope not, but I'm pretty sure more people will do the same as me: move away from Firefox or revert to an older version, and hold off on recommending Firefox 3.5 to friends and family.

I'll revert to Firefox 3.0 till this is fixed, or move to another browser (although I find Chrome a bit too much Google in one package). If you're planning to upgrade to Firefox 3.5, be aware of the issue I described above and do realize that it's not something you can learn to live with, as the delay will occur randomly (pun intended) during the day: sometimes starting a browser is fast, yet an hour later it can again take 30-40 seconds or longer.

Yes Voicy, I'll listen to you more. At least more often.

Posted Thursday, July 09, 2009 10:59 AM by FransBouma | 169 comment(s)

Linq: Beware of the 'Access to modified closure' demon

If you're using Linq and Resharper, you've probably seen the warning Resharper shows when you use a foreach loop in which you use the loop variable in a Linq extension method (be it on IQueryable<T> or IEnumerable<T>). In case you don't know what it is or what damage it can do if you ignore the issue, I'll give you a database oriented query (so on IQueryable<T>, using LLBLGen Pro's Linq provider) which creates a dynamic Where clause based on input, the typical scenario you should be careful with when it comes to this particular problem.

var customers = from o in metaData.Order
		join c in metaData.Customer on o.CustomerId equals c.CustomerId into oc
		from x in oc.DefaultIfEmpty()
		select new { CustomerId = x.CustomerId, CompanyName = x.CompanyName, Country = x.Country };

string searchTerms = "U A";
var searchCriteria = searchTerms.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries);
foreach(var search in searchCriteria)
{
    customers = customers.Where(p => p.Country.Contains(search));
}
var ids = (from c in customers select c.CustomerId).ToArray();

The above code snippet has the demon embedded into itself, likely without you noticing it. Can you spot it? (Ok, I already gave it away a bit with the foreach loop hint).

The problem with the above query is that it will produce a WHERE clause in the SQL query with two LIKE predicates which both filter on %A%. How's that possible? The cause is the 'Access to modified closure' problem: search is a local variable. The first time the foreach body runs, search has the value "U". The .Where() extension method adds a MethodCallExpression with a call to Where(lambda), and inside the lambda there is, among other things, a ConstantExpression referring to the local variable search for the value (strictly speaking, a member access on the compiler-generated closure object which holds search). And that's precisely the problem: when the foreach loop loops again, search gets another value, namely "A". As there are no more values, the loop ends and the query is executed.

Well, 'executed' is more complex than it sounds: first, the expression tree has to be converted into SQL. When the Linq provider runs into the two .Where() extension method calls, it evaluates the argument, which is a LambdaExpression which contains a ConstantExpression which refers to... a local variable called search. It can't do anything but read that variable, which has the value ... "A" for both, as it reads the same variable. So the expression tree doesn't store the value search has when the call to Where and Contains is made, it stores a reference to the local variable.

How to fix this? It's pretty straightforward: create a new local variable:

foreach(var search in searchCriteria)
{
	var searchTerm = search;
	customers = customers.Where(p => p.Country.Contains(searchTerm));
}

With each iteration, it creates a new local variable, and thus each Contains call will refer to a different variable and thus the SQL query will contain the two LIKE predicates the way it should, one with %U% and one with %A%.

This subtle issue pops up with Linq to Objects as well, so beware when you pass the foreach loop variable to a Linq extension method: if the query doesn't run at that same spot, you likely will run into this problem and will have an obscure bug to track down.
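The effect is easy to reproduce with Linq to Objects, as the sketch below shows. From C# 5 onwards the compiler gives foreach a fresh loop variable per iteration, so to show the behaviour described in this post on a current compiler, the sketch declares search outside a for loop, which shares one variable across all iterations. The class ModifiedClosureSketch and the sample data are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ModifiedClosureSketch
{
    private static readonly string[] Countries = { "USA", "UK", "Austria", "Belgium" };
    private static readonly string[] Terms = { "U", "A" };

    // One variable shared by all lambdas: when the query finally runs,
    // every filter sees the last value assigned to search, which is "A".
    public static List<string> SharedVariable()
    {
        IEnumerable<string> query = Countries;
        string search = null;
        for (int i = 0; i < Terms.Length; i++)
        {
            search = Terms[i];
            query = query.Where(c => c.Contains(search));
        }
        return query.ToList(); // deferred execution: the filters run here
    }

    // A new variable per iteration: each lambda keeps its own term.
    public static List<string> CopyPerIteration()
    {
        IEnumerable<string> query = Countries;
        for (int i = 0; i < Terms.Length; i++)
        {
            string searchTerm = Terms[i];
            query = query.Where(c => c.Contains(searchTerm));
        }
        return query.ToList();
    }
}
```

SharedVariable() filters twice on "A" and returns USA and Austria, while CopyPerIteration() filters on "U" and on "A" and returns only USA.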

Happy hunting

Posted Thursday, June 25, 2009 10:32 AM by FransBouma | 15 comment(s)

Multi-value Dictionary C# source code (.NET 3.5)

By popular demand, I've published the C# source code of my Multi-value Dictionary class, which can also merge dictionaries into itself and which implements ILookup<T, V> as well. It's part of Algorithmia, our upcoming data-structure and algorithm library which will ship with LLBLGen Pro v3.0 later this year. The code is released under the BSD2 license, see the enclosed readme.txt. The class comes with its own general purpose Grouping<T, V> class as well and of course its own ToMultiValueDictionary() extension method.
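For readers who haven't used a multi-value dictionary before, below is a minimal sketch of the idea: every key maps to a set of values, and adding a value for an existing key extends that set. This is not the published Algorithmia source; the names MultiValueDictionarySketch, Merge and ToMultiValueDictionarySketch are made up for this illustration and the ILookup<T, V> implementation is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MultiValueDictionarySketch<TKey, TValue> : Dictionary<TKey, HashSet<TValue>>
{
    // Adds value to the set stored under key and creates that set when the key is new.
    public void Add(TKey key, TValue value)
    {
        if (!TryGetValue(key, out var values))
        {
            values = new HashSet<TValue>();
            base.Add(key, values);
        }
        values.Add(value);
    }

    // Merges all key/value pairs of other into this instance.
    public void Merge(MultiValueDictionarySketch<TKey, TValue> other)
    {
        foreach (var pair in other)
            foreach (var value in pair.Value)
                Add(pair.Key, value);
    }
}

public static class MultiValueDictionarySketchExtensions
{
    // Groups the elements of source by key into a multi-value dictionary.
    public static MultiValueDictionarySketch<TKey, TValue> ToMultiValueDictionarySketch<TSource, TKey, TValue>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        Func<TSource, TValue> valueSelector)
    {
        var result = new MultiValueDictionarySketch<TKey, TValue>();
        foreach (var item in source)
            result.Add(keySelector(item), valueSelector(item));
        return result;
    }
}
```

As the values are kept in a HashSet<TValue>, adding the same value twice for a key stores it once.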

I hope this is useful to others.

Update: it seems that if you run a Linq query (Linq to Objects) over the MultiValueDictionary, the compiler and IntelliSense get confused, as there are now two enumerators and both work with the Linq operators. This means you either want to remove the ILookup code from the class (which is not that hard) or explicitly state the generic arguments. It's not a big problem, though in case you run into it, you know the reason.

Update 2: A user pointed out that I forgot to include ArgumentVerifier, a simple class to make life easier wrt verifying arguments, so I've included that one as well.

Posted Monday, May 18, 2009 10:52 AM by FransBouma | 9 comment(s)
