Frans Bouma's blog

Generator.CreateCoolTool();

June 2007 - Posts

Maintainable software: why you can't live without proper solid documentation

This post is a reply to Jeremy D. Miller's post 'A Train of Thought, June 17, 2007'. It's part of an ongoing discussion about the maintainability of software and what's necessary to achieve it. I'm not going to link to every post in the discussion; you can find most of them from Jeremy's post.

Before I continue, I'd like to say that I'm not participating in this discussion to disqualify TDD/Agile as a set of useful methodologies, because I do think they have some solid points everyone can benefit from. I'm also not participating because I'm a waterfall follower, because I'm not. Waterfall is a methodology which can be very beneficial, but it has to suit the project. For example, you really want to use waterfall in software for medical equipment, as you don't want to run the risk of missing a spot because you didn't anticipate that a particular use case would be possible. I don't use waterfall myself, as I'm not in the medical equipment business and I'm also not a consultant paid by the hour. But more on that later on in the article. The post is built up as replies to things Jeremy said in his post, so the blockquotes are quoted from his post.

The summary comes down to this: Documentation describes the what and the why, code describes the how. You need both documentation and code to have the complete overview, not just the code.

Granted, I've got an almost knee jerk reaction to disagree with Frans on almost anything related to software development, but I'd still prefer a much stronger emphasis on the "what" and "how" a system is put together than I would the design documentation.

And why is it that you want to disagree with me about software development so often, Jeremy? Is it because you think I am a true waterfall adept and anti-Agile/TDD? Well, I'm not.

Focussing on the what and how is OK, and I'm not saying that solid, clear, easy to understand code isn't more maintainable than a big steaming pile of spaghetti-crap. But just focussing on getting solid, clear, easy to understand code doesn't necessarily bring you a great maintenance experience: if essential information is missing, you're still doomed. Furthermore, that doesn't mean proper code is therefore the result of TDD/Agile principles. It's just that your experience shows that Agile/TDD gives good, easy to understand code. Well, good for you. The thing is though: a team of good software engineers which works like a well-oiled machine will very likely create proper code which is easy to understand, regardless of the methodology used. If all of those software engineers move on to other projects, it doesn't matter how much knowledge was shared inside the team: it's not going to be available to their successors unless you provide solid, easy to understand documentation about the code.

I don't really care so much "why" it was written that way, only what it is. And by solidly written code I mean code that I can understand by looking at that readily accepts change.

This is the essential part where you make a trivial but costly mistake: the why is of utmost importance. The reason is that when you have to make a change to a piece of code, you might be tempted to refactor the code into a form which was rejected earlier, for example because of bad side effects on other parts. If you don't know the why of a given routine, class or structure, you will sooner or later refactor the code into something that wasn't the best option, and you'll find that out the hard way, losing precious time you could have saved.

That's why the 'why' documentation is so important. It records the design decisions: "what were the alternatives? why were these rejected?" This is essential information for maintainability, as a maintainer needs that info to properly refactor the code without ending up in a form which was rejected earlier.

The other element of your remark, about understanding code, shows some lack of understanding of why humans are so bad at writing code: you assume you will understand code when you read it. Well, I have news for you, Jeremy: you will not. Not now, not ever. And not only you, but everyone out there who writes code, and that includes me as well, will not be able to read code and understand it immediately. That's not because you lack experience or knowledge, but because you and I are human. Sure, there will be code snippets we will understand in a heartbeat. However, there's an essential part of understanding code which is missing in a human body: a code interpreter which can understand why at time T element A has the state S and why at time T+t it has state S'. Only with such a code interpreter will you understand what code does in full. As a human lacks such an interpreter, we can only try to understand the code, and we will very easily make mistakes doing so. That's also why there are tools like ReSharper and everybody's friend, the debugger.

Besides, I've never seen a long technical document that was entirely useful. Time and manpower is finite. I'd rather sink more energy and resources into better, cleaner, well-structured code than comprehensive documentation because I think the payoff is higher. To me, one of the biggest advantages of moving from a waterfall shop that produced a lot of intermediate documents to XP shops was that I now get to spend much more time on a project focusing on the design, architecture, and code than I did when I was on the hook for much more documentation. I write fewer documents, but I get to create better code with far better configuration management practices. I call that a net win.

Let me be blunt here: do you hate documentation that much? Do you think having a lot of documentation hurts your project and will make it, oh behold! look like it is written using waterfall? Code isn't documentation, it's code. Code is the purest form of the executable functionality you have to provide as it is the form of the functionality that actually gets executed, however it's not the best form to illustrate why the functionality is constructed in the way it is constructed. I'll get more into that in a second.

Oh, and besides that: just because you haven't seen a technical document which makes sense doesn't mean having them is effectively useless. I've seen technical documents which did make a lot of sense and were essential to understanding what was going on at such a level that making changes was easy.

You see, Jeremy, the thing is that if a set of features has to be added to a project that has been in production for a while, you really need an overview of where to make the changes and in what form. If your project consists of, say, 400,000 lines of code, it's not a walk in the park to get even the slightest overview of what is located where without reading all of those lines, if there's no documentation of any value. Code is for formulating functionality in an executable form; it's not documentation of any kind. If you think that it is, I really pity the one who has to maintain your code, say, 2 years from now.

As a quick aside, Frans also more or less makes the claim on Sam's blog that TDD doesn't do anything for maintainability, or just wonders what in the world it does do for maintainability.

No, I didn't make that claim; you think I made that claim. What I was trying to say was that TDD/Agile is advocated as a set of methodologies which will make your project the best that can be written. However, the elements for properly maintainable software don't require TDD at all! Furthermore, you seem to suggest that TDD/Agile will give better results no matter what, which isn't guaranteed: whether the results of your software project will be up to par depends on the people in your team and a lot of other factors. TDD/Agile can help, but they aren't a guarantee. They're also not a ticket to maintainable software.

Orthogonality. Codebase's developed with Test Driven Development will almost always exhibit better qualities in regards to cohesion and coupling, the very same qualities that make code easier and safer to change. I know Frans is going to come back and argue that he gets it done with lots and lots of documentation and very careful upfront design. I'm going to respond to that by saying the instant feedback loop from doing detailed design with TDD pushes me in the direction of orthogonality more efficiently and effectively than any form of upfront design. Why is this true? Because you can't use TDD on code that isn't loosely coupled and not easily on code that isn't highly cohesive.

Ah, so now you also know how I design my software, Jeremy? Don't you agree that what you said above is actually pretty stupid? Especially when I tell you that I've used TDD/Agile style development for the last 5 years now? The code base of LLBLGen Pro alone is massive: the designer gathers meta-data which is fed to a code generator stack, executed by tasks in a queue which consume templates, which produce code which is a specialization of compiled code in the runtime libraries. Meta-data affects generated code, which affects the total class stack in the project, and vice versa. It's pretty complex, if I may say so. Do you think I designed all that up-front in a waterfall-esque way, spent months and months writing document after document and then started writing a lot of code? No way! It's vertically developed, use case after use case. Every feature is seen as a use case or set of use cases, depending on the feature. First I analyse what the feature embodies, what impact it might have, etc., and if necessary tests are written up front. Then I'll do a weird thing: I'll open the design document and write a piece of documentation on how the feature is designed, why particular parts are the way they are and why alternatives won't work. After that, I'll go into the code base and write the code for that feature, run the tests I've written before, and update the documentation if what I wrote there turned out to be wrong.

You see, documentation isn't a separate entity from the code written: it describes in a certain DSL (i.e. human readable and understandable language) what the functionality is all about; the code does so in another DSL (e.g. C#). That's the essential part: you have to provide functionality in an executable form. Code is such a form, but it's arcane to read and understand for a human (or is your code always 100% bug-free when you've written a routine? I seriously doubt it, no-one is that good); proper documentation which describes what the code realizes is another form. These two aren't separate entities, and you can't write the documentation after you're done writing code, as you will then document how the code works. Which is nice, but not enough. You need the why part too.

I'm very, very glad that I've written these documentation parts since the beginning of the project back in 2002. You see, I've written the majority of the system, and if someone should know how all the code works, it would be me, right? Well, perhaps I'm not as talented as you, Jeremy, but I'm not able to remember every design decision I made in detail for this massive code base. Also, if something happens to me, I really want to be able to hand it to someone else so s/he can continue my work. With documentation that's written on the spot, you can. When I need to make a change and need to know why a routine is the way it is, I look up the design document element for that part and check why it is the way it is, which alternatives were rejected, and why. After 5 years, your own code also becomes legacy code. Do you still maintain code you wrote 2-3 years ago? If so, do you still know why you designed it the way it is designed, and will you always avoid reconsidering alternatives you rejected back then because they wouldn't lead to the right solution? Without proper documentation you can't possibly avoid missteps you probably already made before.

The unit tests are a form of documentation. Reading the unit tests for a class should be a great way to learn how to use any given class. I can think of several cases where someone else's unit tests have made it easier to use a class or API.

Unit tests are tests. They test a given piece of functionality written in code to see if that code indeed represents that functionality. That's a very valuable feature and an essential part of quality assurance. What's missing is that a unit test isn't documenting anything: code isn't documentation, it's code. It describes the same functionality, but in such a different DSL that a human isn't helped by wading through thousands and thousands of unit tests to understand what the API does and why. The why will never be represented by unit tests; the unit tests will only show how you can use a given routine or class in a particular situation. Use tests to see whether what you think is OK is actually OK. Use documentation for documenting what you've written in code. Unit tests also don't reveal why the inner workings of methods / classes are the way they are. They just confirm that they work in the particular case the unit test tests for.

Using unit tests for learning purposes or documentation is similar to learning how databases work, what relational theory is, what set theory is etc. by looking at a lot of SQL queries. You will only see a lot of SQL: there's no context, and there's no explanation of why the statement is written that way and not in another way. Wouldn't you agree that learning how databases work is better done by reading a book about the theory behind databases, relational theory, set theory and why SQL is a set-oriented language? Then why would it be OK, in the case of the theory behind a piece of software you've written, to fall back on the code which uses it in a limited set of situations?

High levels of unit test coverage gives you so much more ability to change existing code without introducing regression errors. No matter how much upfront analysis and design you try to do, the users will always come with something completely new that you couldn't reasonable anticipate in your initial construction. It's awfully nice to have that immediate safety net of focused unit tests as you make changes to existing code. Documents are passive. Unit tests will shout out when they're broken -- assuming anybody runs them of course. Good unit tests will even tell you exactly where the regression breaks happen.

Unit tests are valuable, there's no disagreement there. However their name already implies that they're not documentation, they're tests. Documentation also isn't passive. It's active, as it describes in another DSL what functionality is implemented and why it is implemented the way it is implemented. If I may, I'd like to describe documentation as a projection result of the functionality to deliver onto human readable and understandable text and code as the projection result of that same functionality to deliver onto machine executable elements.

This implies that if the functionality changes, documentation and code will change, not just the code, simply because the documentation is the projection result of the same source as the code is.

Or are you suggesting that the code you're writing is actually a result of whatever came up in your mind at that time and some test will tell you if that thought was actually acceptable or not? I doubt it. You're a professional, passionate about computer science, however never forget, Jeremy: so am I.

Posted Monday, June 18, 2007 11:59 AM by FransBouma | 21 comment(s)

Don't use foreach over MatchCollection, use for. UPDATED

UPDATE. Apparently they both call GetMatch(). So my advice isn't correct. Thanks 'Reflector' for the comment. What surprises me though is that first my routine (checked with Ants profiler) was slow because of the foreach, and now it's not.

Simple performance tip. Consider this code:

MatchCollection matches = myRegExp.Matches(someString);
foreach(Match m in matches)
{
    // your code which uses the match
}

This will perform awfully. The reason is that the enumerator in MatchCollection executes the regexp again. This is, I guess, so that you can enumerate over the MatchCollection without calling Matches. Instead do:

MatchCollection matches = myRegExp.Matches(someString);
for(int i=0;i<matches.Count;i++)
{
    Match m = matches[i];
    // your code which uses the match
}
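
Given the UPDATE at the top of this post, the safest thing is to measure both loops yourself rather than trust either version of the advice. Below is a minimal sketch of how that could be done; the class and method names (MatchLoopTiming, CountWithForeach, CountWithFor), the pattern and the test input are made up for illustration. Each loop gets a fresh MatchCollection, on the assumption that a collection which has already been walked has its matches cached and would skew the second measurement.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original post.
public static class MatchLoopTiming
{
    // Walks the collection with foreach and counts the successful matches.
    public static int CountWithForeach(MatchCollection matches)
    {
        int count = 0;
        foreach (Match m in matches)
        {
            if (m.Success) count++;
        }
        return count;
    }

    // Walks the collection with an indexed for loop and counts the successful matches.
    public static int CountWithFor(MatchCollection matches)
    {
        int count = 0;
        for (int i = 0; i < matches.Count; i++)
        {
            if (matches[i].Success) count++;
        }
        return count;
    }

    public static void Main()
    {
        var regex = new Regex(@"\d+");
        // Arbitrary test input: the numbers 0..99999 separated by spaces.
        string input = string.Join(" ", Enumerable.Range(0, 100000));

        // A fresh MatchCollection per loop, so neither run benefits from the other's work.
        Stopwatch sw = Stopwatch.StartNew();
        int viaForeach = CountWithForeach(regex.Matches(input));
        sw.Stop();
        Console.WriteLine("foreach: {0} matches in {1} ms", viaForeach, sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        int viaFor = CountWithFor(regex.Matches(input));
        sw.Stop();
        Console.WriteLine("for:     {0} matches in {1} ms", viaFor, sw.ElapsedMilliseconds);
    }
}
```

The numbers will differ per machine and per runtime version, so treat a single run as an indication at best and repeat the measurement a few times before drawing conclusions.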

Posted Thursday, June 07, 2007 4:45 PM by FransBouma | 5 comment(s)


SqlServer 2008: Does it or doesn't it have the Entity Framework?

I just read an interesting post on the Oakleafblog of Roger Jennings. There, Roger lists the feedback he would have given to the ADO.NET team. It's an interesting list of items, most of which I can agree with. The better gems, though, are in the comment posted by Mike Pizzo, ADO.NET's architect.

For example, Roger said:

3. Recommit to Mike Pizzo's fervent promise of RTM with a graphical EDM Designer in the first half of 2008, not "in the Katmai timeframe."

To which Mike replied (emphasis mine):

Orcas, Katmai, and the Entity Framework are all being released as part of a single marketing wave. Apparently we get more when we announce things together, even if they don’t release at the same time. Whatever. Anyway, the Entity Framework has no dependencies on Katmai, and ships as an update to the .NET Framework, not as part of Katmai, so we’re not dependent on Katmai to ship.

Ok, this made me re-read it a couple of times. Did I read it correctly or did I misinterpret it? (Probably.)

Another good one is this one:
Roger:

9. Provide read-only access to foreign key values.

Mike:

This is actually a feature I’m fighting to get into our final milestone for V1. Can you describe the scenarios where this is used? Do you need the ability to query on the foreign key value, or simply expose it on the domain object?

If I have an Order entity object in memory and want to obtain its CustomerID, and the only way to get that ID is by fetching the related Customer entity into memory instead of reading Order.CustomerID, I lose performance unnecessarily. It's simple data-access stuff, and it puzzles me a bit why Mike has to put up a fight to get this into their framework in the first place. Especially since the underlying object context knows Order.CustomerID anyway, as it has to save it to the DB if you change the Customer related to that particular Order instance.
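
To make the scenario concrete, here is a minimal sketch of the difference. The Order and Customer classes and the loader delegate are hypothetical stand-ins written for this example; they are not the Entity Framework's API, nor any other O/R mapper's. The loader call plays the role of the extra round trip to the database.

```csharp
using System;

// Hypothetical entity types for illustration only.
public class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
}

public class Order
{
    private readonly Func<int, Customer> _customerLoader;
    private Customer _customer;

    public Order(int orderId, int customerId, Func<int, Customer> customerLoader)
    {
        OrderId = orderId;
        CustomerId = customerId;
        _customerLoader = customerLoader;
    }

    public int OrderId { get; }

    // The foreign key value: it is part of the Order row, so reading it needs no fetch.
    public int CustomerId { get; }

    // The related entity: loaded on first access, which stands in for a database round trip.
    public Customer Customer
    {
        get { return _customer ?? (_customer = _customerLoader(CustomerId)); }
    }
}

public static class ForeignKeyDemo
{
    public static void Main()
    {
        int fetches = 0;
        var order = new Order(1, 42, id =>
        {
            fetches++; // count how often the "database" is hit
            return new Customer { CustomerId = id, Name = "stub" };
        });

        int idFromFk = order.CustomerId;              // no fetch needed
        Console.WriteLine("via FK field: id={0}, fetches so far={1}", idFromFk, fetches);

        int idFromEntity = order.Customer.CustomerId; // forces the Customer to be loaded
        Console.WriteLine("via entity:   id={0}, fetches so far={1}", idFromEntity, fetches);
    }
}
```

Both routes yield the same ID; only the second one costs a fetch, which is the unnecessary overhead described above.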

Roger also mentioned this:

2. Provide a similar white paper comparing the projected feature set of EF v1.0 and, perhaps, EF vNext with that of NHibernate, NPersist, and commercial OR/M tools such as LLBLGen Pro, Wilson ORMapper for .NET, et al.

Mike asks for volunteers. Well, as the O/R mapper I've worked on for the past 5 years is on that list, I'm more than willing to help out with that comparison. A good starting point for which features people might be looking for is this Wiki page over at c2.com.

Posted Thursday, June 07, 2007 12:48 PM by FransBouma | 4 comment(s)

SqlServer 2005 paging: there IS a generic wrapper query possible

(The Name field in the queries below is without [ and ] brackets, because CS currently goes berserk over these. I don't know why, but it's apparently a glitch somewhere.)

Recently, I wrote a blog post about SqlServer 2005 paging, called 'API's and production code shouldn't be designed by scientists', about how horrible the paging syntax is in SqlServer, while it's easy in competing RDBMSs. One thing that stood out was that it was apparently impossible to produce a wrapper query in SqlServer 2005 which would be able to page any other query you would write, while it was possible to write such a query in Oracle or DB2.

Today I was updating our own SqlServer 2005 paging code as the Sql generator had to revert to a temp-table approach if it ran into one or more 1:n relations. The reason was that the SELECT statement which does the actual query can't use DISTINCT, as the ROW_NUMBER() value is also in the select list, so DISTINCT has no real value: all rows are unique due to their value for the row number. To solve this, you could use a dual SELECT, first select the set you want to page on, then select that again but apply the ROW_NUMBER() on that set and then filter on the row number. This all sounds vague, so let's go to an example.

Query with single SELECT and combined ROW_NUMBER usage. This query uses a TOP clause to limit the resultset to fetch. It has the ORDER BY placed inside the OVER() clause. This query fails, as it simply returns a set of duplicate rows. It's usable on SqlServer 2005's AdventureWorks catalog.

WITH __actualSet AS
(
    SELECT DISTINCT TOP 9 [Product].[ProductID], 
    Name, [ProductNumber], [MakeFlag], 
    [FinishedGoodsFlag], [Color], [SafetyStockLevel], 
    [ReorderPoint], [StandardCost], [ListPrice], 
    [Size], [SizeUnitMeasureCode], 
    [WeightUnitMeasureCode], [Weight],
    [DaysToManufacture], [ProductLine], [Class], 
    [Style], [ProductSubcategoryID], [ProductModelID], 
    [SellStartDate], [SellEndDate], [DiscontinuedDate], 
    [rowguid], [Product].[ModifiedDate], 
    ROW_NUMBER() OVER(ORDER BY [Product].[ProductID] ASC) AS __rowcnt
    FROM
    [Production].[Product] INNER JOIN [Purchasing].[PurchaseOrderDetail]
    ON [Product].[ProductID]=[Purchasing].[PurchaseOrderDetail].[ProductID]
    WHERE
    [Purchasing].[PurchaseOrderDetail].[OrderQty] = 60
)
SELECT * FROM __actualSet WHERE [__rowcnt] > 4 AND [__rowcnt] <= 8
ORDER BY [__rowcnt] ASC

This query is a paging version of this query:
SELECT DISTINCT [Product].[ProductID], Name, 
    [ProductNumber], [MakeFlag], [FinishedGoodsFlag], 
    [Color], [SafetyStockLevel], [ReorderPoint], 
    [StandardCost], [ListPrice], [Size], 
    [SizeUnitMeasureCode], [WeightUnitMeasureCode], [Weight],
    [DaysToManufacture], [ProductLine], [Class], 
    [Style], [ProductSubcategoryID], [ProductModelID], 
    [SellStartDate], [SellEndDate], [DiscontinuedDate], 
    [rowguid], [Product].[ModifiedDate]
FROM
    [Production].[Product] INNER JOIN [Purchasing].[PurchaseOrderDetail]
    ON [Product].[ProductID]=[Purchasing].[PurchaseOrderDetail].[ProductID]
WHERE
    [Purchasing].[PurchaseOrderDetail].[OrderQty] = 60
ORDER BY [Product].[ProductID] ASC

The paging query is what most o/r mappers, including ours, would generate (or thereabout), except of course the hard-coded values which would be parameters, but you get the idea. Some minor details differ here and there between O/R mappers, but the idea is the same overall.

However, a paging query which doesn't work is of course not what we want. Furthermore, we want a generic wrapper, so we can simply feed it a query, any query, and page over it. Well, it seems that a generic wrapper is possible. The key is that the ORDER BY clause of the original query shouldn't be used in the ROW_NUMBER's OVER clause, but should be left in the query itself, and a wrapper SELECT statement should be used. (We use a trick to keep OVER() from throwing an error: simply passing a timestamp. This is a common way to work around the requirement to specify a sort clause.) We'll retrieve the same page, the second page with a size of 4. The query above with the wrapper then looks like:

WITH __actualSet AS 
( 
    SELECT *, 
        ROW_NUMBER() OVER (ORDER BY CURRENT_TIMESTAMP) AS __rowcnt 
    FROM 
    (
        SELECT DISTINCT TOP 9 [Product].[ProductID], Name, 
            [ProductNumber], [MakeFlag], [FinishedGoodsFlag], 
            [Color], [SafetyStockLevel], [ReorderPoint], 
            [StandardCost], [ListPrice], [Size], 
            [SizeUnitMeasureCode], [WeightUnitMeasureCode], 
            [Weight], [DaysToManufacture], [ProductLine], 
            [Class], [Style], [ProductSubcategoryID], [ProductModelID], 
            [SellStartDate], [SellEndDate], [DiscontinuedDate], 
            [rowguid], [Product].[ModifiedDate]
        FROM
            [Production].[Product] INNER JOIN [Purchasing].[PurchaseOrderDetail]
            ON [Product].[ProductID]=[Purchasing].[PurchaseOrderDetail].[ProductID]
        WHERE
            [Purchasing].[PurchaseOrderDetail].[OrderQty] = 60
        ORDER BY [Product].[ProductID] ASC
    ) AS _tmpSet
) 
SELECT * FROM __actualSet 
WHERE [__rowcnt] > 4 AND [__rowcnt] <= 8 
ORDER BY [__rowcnt] ASC

With this query, we can create a (more or less) generic wrapper query to page any other select statement, with one minor restriction: the SELECT statement to page has to have TOP specified, if it has an ORDER BY (which is logical, as paging over unsorted data is meaningless). This however brings the wrapper very close to a generic wrapper, the only thing that should be inserted is a TOP clause. The final wrapper looks like:

WITH __actualSet AS 
( 
    SELECT *, 
        ROW_NUMBER() OVER (ORDER BY CURRENT_TIMESTAMP) AS __rowcnt 
    FROM 
    (
        -- the query to page here
    ) AS _tmpSet
) 
SELECT * FROM __actualSet 
WHERE [__rowcnt] > @rownumberLastRowOfPreviousPage 
    AND [__rowcnt] <= @rownumberLastRowOfPage
ORDER BY [__rowcnt] ASC

Of course, the WHERE clause on __rowcnt is up to you. What's key is that the numbering of the rows starts at 1. I used this particular predicate expression, but others are possible, as long as you filter on the right rows.

To limit the number of rows in the CTE, you can use a TOP clause with a given number. If you are in charge of generating the SQL query to page, place a TOP clause in the SELECT which limits the total number of rows to (pageSize * pageNumber), where pageNumber starts at 1. The advantage of this wrapper is that you don't have to mess with transferring the ORDER BY clause to the OVER() clause and whatever expression rewriting that might need.
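
For those who generate the SQL from code, the arithmetic and the wrapping can be sketched in a few lines of C#. This is only an illustration under assumptions, not our actual generator code: the PagingWrapper class and its method names are made up, the query to page is passed in as a plain string, and it is assumed to already carry its own TOP and ORDER BY as described above. The two parameter names are the ones used in the wrapper query.

```csharp
using System;

// Hypothetical helper for illustration; not the actual LLBLGen Pro SQL generator.
public static class PagingWrapper
{
    // Row-number bounds for a 1-based page number:
    // rows with lower < __rowcnt <= upper belong to the page.
    public static (int Lower, int Upper) GetRowBounds(int pageNumber, int pageSize)
    {
        if (pageNumber < 1) throw new ArgumentOutOfRangeException(nameof(pageNumber));
        if (pageSize < 1) throw new ArgumentOutOfRangeException(nameof(pageSize));
        int upper = pageSize * pageNumber; // also the value to use for the inner TOP clause
        return (upper - pageSize, upper);
    }

    // Places the query to page inside the generic wrapper query.
    public static string Wrap(string queryToPage)
    {
        return string.Join(Environment.NewLine, new[]
        {
            "WITH __actualSet AS",
            "(",
            "    SELECT *,",
            "        ROW_NUMBER() OVER (ORDER BY CURRENT_TIMESTAMP) AS __rowcnt",
            "    FROM",
            "    (",
            queryToPage,
            "    ) AS _tmpSet",
            ")",
            "SELECT * FROM __actualSet",
            "WHERE [__rowcnt] > @rownumberLastRowOfPreviousPage",
            "    AND [__rowcnt] <= @rownumberLastRowOfPage",
            "ORDER BY [__rowcnt] ASC"
        });
    }

    public static void Main()
    {
        // Second page with a page size of 4, as in the example queries.
        var bounds = GetRowBounds(2, 4);
        Console.WriteLine("lower={0}, upper={1}", bounds.Lower, bounds.Upper);
        Console.WriteLine(Wrap("        SELECT TOP 8 [ProductID] FROM [Production].[Product] ORDER BY [ProductID] ASC"));
    }
}
```

The bounds returned for the second page of size 4 are the same 4 and 8 used in the hard-coded predicates of the example queries; in real code they would be passed as the values of the two parameters.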

Happy paging!

Posted Tuesday, June 05, 2007 3:09 PM by FransBouma | 6 comment(s)

Posting something with [Name] in the text fails
I'm trying to post a new article about SqlServer paging but the query SQL I was posting contained the reference to a Name field, WITH [ and ]. This gives an error in CS 3.0 here. Just for reference :)

Posted Tuesday, June 05, 2007 2:45 PM by FransBouma | 3 comment(s)

My last post on THE soap

Ok, this will be my last post on the soap of TD.NET and MS (has anyone already called Hollywood?). In the community there's some controversy starting to pop up here and there, and I just want to make clear what my position is and will be. This is to avoid getting pulled into any camp in this soap.

My sole motivation to step up and say something about the matter is that I, as a software engineer, don't want to have my hands tied behind my back just because a competitor doesn't like what I do and sends his lawyers. A software engineer should, within the boundaries of the law, be able to write the software that fits the problem to solve. If that solution isn't what a competitor would have liked to see, that's life, but that shouldn't matter. To me, what Microsoft is doing at the moment falls into that category: sending along the lawyers because they don't like what a competitor is doing. The core point is: should this be allowed as a normal practice or not? It doesn't matter whether you like MS, or find Jamie a great guy or hate his guts: it's about that core point. Imagine yourself behind the keyboard on Monday, writing code. Do you want to be afraid that what you write at that moment could cause a lawyer attack? No, of course not. That's my core point and motivation.

Some people try to put words in my mouth, as if I would advocate the authoring of software which violates license terms, or the authoring of cracks of copy protection. No, of course I don't advocate that; on the contrary. That's also why I clearly put in my posts: within the law. That's important: as long as a software engineer's code doesn't violate a law, there's nothing wrong with the code. Please realize that violating a license and its terms is violating a law. So staying within the boundaries of the law covers that too: if you agree to a license, you have to obey its terms. If you don't, you violate the license and therefore violate copyright laws.

Sure, it's fuzzy sometimes: a crack of a given copy protection mechanism isn't illegal on its own, it's just a program. However, as soon as you run it, it is, as it then alters other people's code. So where to draw the line? It looks complicated but it's actually quite simple: as long as you don't violate a law when writing your code, and your code doesn't violate a law either, you're OK, whatever that code might be. Why wouldn't you be OK? You and I write that kind of code every day. If your code uses public APIs and publicly documented code and simply utilizes what's there, even if you have to spend 2 weeks straight to write it, it's not illegal, as it doesn't violate any law. If it were: what is the difference with other code which also uses public APIs and publicly documented code and simply utilizes what's there? IMHO nothing.

This whole mess wouldn't have happened if MS had closed the hole Jamie used in their public APIs. It also wouldn't have happened if the usage terms of their toolkit clearly stated that you can't extend VS.NET express, neither manually nor via external means: he then wouldn't have been able to test his work. Sure, some people claim that MS intended VS.NET express not to be extended and that we all knew that that was the case. I fully agree with that. I even assumed that VS.NET express wasn't extensible at all, because MS said it wasn't extensible through add-ins. Apparently they missed a big spot and left a big chunk of their API ready to be used in the tool, so it was extensible, and with legally normal code. The code might go against what some company thinks is OK, but that's of course irrelevant: every ISV thinks the competition does things they'd rather not see, as these competitors create a competing product.

Is creating an add-in for VS.NET express then 'unethical', because it goes against the spirit and intentions of VS.NET express? I can only say 'yes' to that. However, dear reader, let me make a small remark: ethics have nothing to do with this. Because it is also unethical to bundle a free competitor to commercial offerings in a bigger, also free product. It is also unethical to add functionality to your free IDE for your own database and block the competition's add-ins, while your free IDE is the de-facto standard for software engineers on your platform.

This to bring things a bit in perspective. I'm all for a world where business is fair, where ethics are a very valuable thing and everyone tries to do the very best for one another. The reality is however that things aren't that way in today's hard-core business world where one company's death is another one's breakfast. To the defenders of both sides: please keep that in mind as well.

This whole story about TD.NET and MS will likely end in tears for one or both sides. I have to admit that if I look at the case from a professional software engineer's perspective, I simply can't agree with the way MS handled it. I can also understand how Jamie went the 'I have nothing to lose' route, as his lawyer stated he has done nothing wrong (what else can you do in that case?). As a human being, I also have to admit that what Jamie did didn't match the intentions MS had with VS.NET express: a crippled version of the professional tools. Though I then also have to admit that in the real world of hard cash and business, no-one gives a rat's a** about that, and seeing MS bring up the point of ethics makes it rather, sorry to say it, ironic and funny.

As a software engineer, I would have handled it very differently in MS' position: close the hole, make it impossible to achieve what Jamie did, and re-word the EULA so it's clear what the intentions are with VS.NET express. Right now it's too vague, and 'You shouldn't work around technical limitations' also tells me as a user not to do things I simply have to do, as a bug or a design flaw is also a technical limitation. So don't write an XmlSchema import class library to make wsdl.exe recognize your own types with IXmlSerializable implementations, you software typer, you! (No VS.NET express user is allowed to do so, according to the EULA. Sorry, couldn't resist.)

Apparently, MS will re-word the EULA soon. That's great news, so hopefully the vagueness will be removed and we all will know, license-wise, what we're dealing with. I also hope that MS will alter VS.NET express in such a way that you can't run add-ins, period. Not XNA studio, not popfly, not TD.NET, nothing. No add-ins means no add-ins. This will leave no room for a software engineer to turn the wrong corner, to use the wrong method: there's no method to use, there's no service to call, so the engineer has to conclude: 'it's not possible, the building blocks I need aren't provided'. This will free the software engineer from looking into EULAs for every method s/he wants to use, and it will free the software engineer from calling law firms about every publicly published API they might utilize in their code.
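To make the 'no method to use, no service to call' point concrete, here is a minimal sketch. It is written in Python purely for brevity and is not how VS.NET is actually built; the classes `PluginHost`, `ProfessionalIde`, `ExpressIde` and the function `try_load_addin` are all hypothetical names made up for this illustration.

```python
class PluginHost:
    """Hypothetical extension mechanism, shipped only in the paid edition."""

    def __init__(self):
        self.plugins = []

    def load(self, plugin):
        # Register the add-in with the host.
        self.plugins.append(plugin)


class ProfessionalIde:
    """Hypothetical paid edition: it exposes an extension host."""

    def __init__(self):
        self.extensions = PluginHost()


class ExpressIde:
    """Hypothetical free edition: no extension host is created or exposed."""

    def build(self, project):
        return f"built {project}"


def try_load_addin(ide, addin):
    """What an add-in author runs into when looking for a way in."""
    host = getattr(ide, "extensions", None)
    if host is None:
        # Nothing to call, so there is no wrong corner to turn.
        return "not possible: no extension mechanism provided"
    host.load(addin)
    return "add-in loaded"


express_result = try_load_addin(ExpressIde(), object())
pro_result = try_load_addin(ProfessionalIde(), object())
```

In this sketch the engineer never has to consult a license text to find out whether an add-in is allowed: the free edition simply offers nothing to hook into.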

I hope both players in the soap will take a step back, get off their high horses (didn't know they'd make them this tall nowadays) and talk about software again instead of business. Because Microsoft also has to realize: the more they pack into VS.NET express Orcas, the more they'll lose the very argument they're using now, because the more tools they pack into a free, de-facto standard IDE to kill off competition (e.g. a linq-to-sql designer to kill off any other O/R mapper out there to be sure everyone uses MS' offering so SqlServer is the RDBMS of choice, oh, just thinking out loud here), the more unethical it will become. As I said in my previous post, I don't mind their competition, as I'm sure a lot of developers will make the right decision; what I'm against is unfair competition and uncertainty about the code I use and write. I hope MS will realize that you can't use the argument of unfairness and unethical behavior against company A while using the very technique you accuse company A of yourself.

EOD (for me at least)

Posted Sunday, June 03, 2007 12:52 PM by FransBouma | 24 comment(s)

Thou Shall Not Work Around Technical Limitations! (whatever they are)


Dan Fernandez responded to my recent blogpost with a follow-up on the Jamie vs. Microsoft soap.

He used an analogy to try to make his point:

To paraphrase an analogy from that post, this would be comparable to a 3rd party company working around the technical limitations in the LLBGEN demo to unlock features in LLBGen Pro for free.

I'm glad you brought up this analogy, Dan, because it shows perfectly well how flawed the reasoning of MS is. First of all, what follows is nothing personal: I know you are a friendly guy and you just do what you have to do in the position you're in at Microsoft. What I'll write below is how I see it, and it will hopefully put the effects on us, software engineers in the .NET world, in perspective, so you'll realize that what Microsoft is doing at the moment isn't as simple as you try to make it look.

Let's say our demo is crippled (it isn't, it's just time limited, but for the sake of the argument, let's say it is) and lacks a certain feature, say 'save your project'. The feature is compiled into the code, but we simply didn't add a menu option for it. So in effect, we created a 'technical limitation'.

If we had then put in our EULA "you can't work around technical limitations" (exactly how MS has worded it), and a user had created a plug-in for our designer (we have a plug-in system), and that plug-in opened a dialog with a single button, 'Save project', and clicking it called the publicly available save routine, that would be valid and legit.

You want to know why? Because the developer wrote a legitimate plug-in, nowhere is it stated that a plug-in is a workaround for a technical limitation (the limitations themselves aren't stated either!), and the plug-in is legitimately calling a public method in the API of the designer. This is possible because, similar to VS.NET, our designer is built with multiple assemblies used in a single host process.

Semantically, you can then say: "Hey, that's not what we intended! We deliberately crippled our product and now someone works around that limitation we created!", but that's irrelevant: the road the developer took is legit: he didn't alter the assemblies of the designer, he didn't crack the designer's copy protection, he simply was smarter than us. Smarter in a couple of ways:

  • We didn't explicitly specify that using a plug-in wasn't allowed (MS also didn't)
  • We didn't explicitly specify the limitations that were forbidden to work around (MS also didn't)
  • The way the 'disabled' functionality was enabled is legit: no laws were broken to do so, no copyright infringement took place

So, the developer did extra work to get what he wanted, but not by violating any law: he simply used what's available to him.
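The hypothetical scenario above can be sketched in a few lines. This is only an illustration in Python, not the actual designer or its plug-in API; `Designer`, `Plugin`, `SaveButtonPlugin`, `register_plugin` and `save_project` are all assumed names invented for the example.

```python
from abc import ABC, abstractmethod


class Designer:
    """Hypothetical host application with a documented plug-in system."""

    def __init__(self):
        self.saved_projects = []
        self.plugins = []
        # The 'technical limitation': the demo menu has no save option.
        self.menu_items = ["New project", "Open project"]

    def save_project(self, name):
        """Public save routine: compiled in, just not wired to the menu."""
        self.saved_projects.append(name)
        return f"saved {name}"

    def register_plugin(self, plugin):
        """The official extension point offered to every user."""
        self.plugins.append(plugin)
        plugin.attach(self)


class Plugin(ABC):
    """Contract every plug-in has to implement."""

    @abstractmethod
    def attach(self, designer):
        ...


class SaveButtonPlugin(Plugin):
    """A dialog with a single button, reduced here to one method."""

    def attach(self, designer):
        self.designer = designer

    def on_click(self, project_name):
        # No patched assemblies, no cracked copy protection:
        # just a call to a public method through the public plug-in system.
        return self.designer.save_project(project_name)


designer = Designer()
plugin = SaveButtonPlugin()
designer.register_plugin(plugin)
result = plugin.on_click("demo-project")
```

Nothing in the sketch touches the host's internals: the plug-in enters through the extension point the host itself offers and calls a method the host itself made public.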

So, what's MS' problem here? Nowhere did they state in their EULA that you're not allowed to create add-ins for VS.NET. They also didn't state in their EULA that you're not allowed to extend the provided IDE in any way or form. If a smart developer then finds a way to get some functionality working with the VS.NET IDE, using public APIs and public code, without altering assemblies, without patching code at runtime with cracked dlls, is that smart developer then in violation of anything?

The only thing that matters is the law. Is Jamie in violation of copyright laws? No, he didn't crack/alter anything. What he did and does and hopefully will do in the future is provide an alternative to Microsoft's expensive test-enabled versions of VS.NET for much less or even for free. He used what was available to him in the tools MS provided. Big deal. Even if he had used private, undocumented APIs it wouldn't matter: the functionality is installed by MS' own installer on his computer and ready to be used by the user. You know, Dan, developers use APIs to get things done. If you provide an API to get things done, what should the developer do? Not use the API? Nice eco-system!

Oh, and for the people who still think he violated an EULA, please provide me with answers to this:

  • Where is it stated that you can't extend VS.NET Express?
  • Where is it stated what the technical limitations are?
  • If he writes the code in VS.NET Pro and compiles on the command line, is he still violating the EULA of VS.NET express? If you answer YES: how is that possible? Theoretically, he never saw that EULA!

I first decided to stay out of this, but every day I get more and more angry about it. If MS wants to stop add-ins in VS.NET express, they should make it impossible to run them and also state clearly in the EULA that running (!) add-ins isn't allowed. Still a bogus claim in court, but it would scare the cr*p out of a lot of potentially happy developers, something Microsoft is apparently trying to achieve. To me, it seems this is all about money: Microsoft provides an expensive tool chain for what Testdriven.NET also provides. If someone can use VS.NET express for free and use a cheap add-in to get the same functionality, use Subversion for source control etc., it will hurt their sales. Well, Dan, that's business. You as a Microsoft employee should know that word and its meaning through and through.

There's another reason why I'm getting more and more angry about this and am really getting fed up with the stupidity of it: I own an ISV which produces software which next year will get a competitor from MS. I don't mind the competition, I think we have a strong product against their work, but what angers me are two things:

  • They dump a free competitor to commercial offerings into the market, which is in theory an unfair business practice as they bundle (sounds familiar?) their product with another bigger product. Still MS has the nerve to talk about 'ethos'.
  • A very vague, unclarified sentence in an EULA can start a legal threat against your company just because MS doesn't like your product in their market.

As an owner of an ISV, I simply can't ignore what's going on at the moment, as it will affect my company as well, and also every other ISV out there: do we want to work in an environment where using a legitimate public API could trigger a legal threat which will cost you a lot of money? No, I don't want to work in such an environment and I'm sure my fellow software engineers don't want to either.

Let's get back to your other remarks, Dan, because they're worth replying to as well: (emphasis is mine)

As you know, I have great personal respect and admiration for you, for MVPs, for RDs, and the entirety of the .NET developer ecosystem, and to be clear, this has nothing to do with community. As I mentioned in my post, what complicates this even further is that this isn’t a community developer doing this for his or her personal use or experimenting with our product, this is a *business* trying to sell a product that works around our technical limitations.

This has nothing to do with the community? Not directly, but it has everything to do with software engineering and what you can do with the tools provided as a software engineer and that fact alone affects everyone in this community.

Furthermore, it doesn't matter if this is a community member or IBM selling a big product in a market you're in too: it's about the way MS threatens its own followers with lawyers and false accusations. Perhaps I'm not getting it, but I would be very happy as an ISV if a user created add-ons for a tool I provide. You know why? Because it would spawn a bigger eco-system of tools, more creativity, more users and in the end, more money in the bank. This guy wrote an add-in so express users can use unit-testing. That fact alone should be a signal that what you started, a free express edition of VS.NET, is working: it creates a bigger eco-system, it attracts more users etc. But what do you do instead? You burn a lot of goodwill and money to get rid of that add-in. That shows that you don't care about a bigger eco-system, more users etc. You want the young developer to work with MS stuff? Make it so. However, if that young developer has to be afraid of violating an EULA and ending up at the receiving end of letters from lawyers, what do you think that young developer will do, if that young developer also has the choice to pick Netbeans or Eclipse and start with Java and the thousands and thousands of add-ons and tools and libraries which are available to him?

Not about the community? Not about Business? You're funny, Dan.

So, Dan, I hope you're very happy with what you and your co-workers have achieved. There's a term for this: a Pyrrhic victory. That is, IF you win at all in this. You shouldn't be that afraid that someone uses the work you provided. We're developers, we use the APIs and other things to get things done. Perhaps not in the way you intended or wanted. However, that's totally irrelevant. If you don't want a developer to use a given API method, don't expose that method.

Posted Saturday, June 02, 2007 12:19 PM by FransBouma | 43 comment(s)

Look! Microsoft is working hard on building a community!


See it in action here.

Frankly, it doesn't really matter who's right: Jamie or the tie-with-suit (a real software engineer, you can obviously tell) at Microsoft who started this crusade. What matters is this: (the quote is from Ian Ringrose)

"Is it safe for me as a developer without a large legal department to work with Microsoft technology?"

I am more and more getting convinced: no, it's not safe.

Thanks Microsoft, for this big support of our .NET community and .NET tool eco-system. We really needed this positive PR....

Update: Especially read the long list of comments. Ignoring the obvious FOSS advocates, there are two things which keep coming back: (also on other blogs, threads)

       
  1. Why isn't the extension mechanism in VS.NET express simply non-existing?
  2. What's the exact EULA clause that's being violated?

Seems like two easy-to-answer questions, if I understand Mr. Fernandez correctly. And 'ethos' ... ? You gotta be kidding me. This has nothing to do with 'ethos' or 'spirit', but with cold hard cash.

Posted Friday, June 01, 2007 9:38 PM by FransBouma | 22 comment(s)
