January 2008 - Posts
(Updated Wednesday 30-jan-2008). It was mentioned that we would implement 'Skip' as well, although we had already added a paging method, TakePage(). After careful analysis, we decided not to implement Skip for now. The reason is that it can lead to confusing queries, while paging is what the developer wants. We believe TakePage() serves the developer better than a combination of Skip / Take (Take is still supported separately), which won't work in a lot of cases if Skip is specified alone.
(This is part of an on-going series of articles, started here)
Cast again
The last episode in this series contained a remark about Queryable.Cast, and that it can be ignored. I have to correct myself there: that isn't entirely right. Let's look at the example query at hand:
// LLBLGen Pro Linq example
var q = from e in metaData.Employee.Where(e => e is BoardMemberEntity).Cast<BoardMemberEntity>()
select e.CompanyCarId;
Here, the Cast is actually irrelevant because the Where already filters on the BoardMemberEntity type. However in the following small query, it's not:
var q = from e in metaData.Employee.Cast<BoardMemberEntity>()
select e.CompanyCarId;
Here, the Cast is actually a type filter. Well... sort of. The thing is: there's no real 'good' way to handle this, similar to 'as', which is discussed below. Imagine the situation where the employee instance is of type ManagerEntity (supertype of BoardMemberEntity), and not of type BoardMemberEntity. What should the query above do in that case? The instance 'e' in that case doesn't have a CompanyCarId field, as that's a field only available in the BoardMemberEntity type. The specification of Queryable.Cast offers no help here: it doesn't say what should be done when the Cast can't be performed. So, I decided that it emits a type filter instead. This means that the query above always succeeds if there are BoardMemberEntity instances in the database: it simply filters out any instances which aren't of the specified type, in this case BoardMemberEntity.
OfType
Queryable.OfType is a bit of a weird method, in that it does more or less what Cast does: it filters out any elements which aren't of the type specified. Of course, 'Cast' by definition isn't a filter, but in database queries you have little choice here: SQL isn't imperative, at least not in the way SELECT queries work. OfType is therefore implemented similarly to Cast: it emits a type filter into the query.
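To make the remark that 'Cast' by definition isn't a filter concrete, here's a minimal Linq to Objects sketch of the difference. The Employee, Manager and BoardMember classes are hypothetical stand-ins for the entity classes, not the generated LLBLGen Pro types, and this says nothing about what the provider does internally: it only shows the in-memory behavior the provider has to deviate from.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes used in the article.
public class Employee { public int Id; }
public class Manager : Employee { }
public class BoardMember : Manager { public int CompanyCarId; }

public static class CastVersusOfType
{
    public static readonly List<Employee> Employees = new List<Employee>
    {
        new Manager { Id = 1 },
        new BoardMember { Id = 2, CompanyCarId = 10 },
        new BoardMember { Id = 3, CompanyCarId = 11 }
    };

    // OfType silently skips every element which isn't a BoardMember.
    public static List<int> CarIdsViaOfType()
    {
        return Employees.OfType<BoardMember>().Select(b => b.CompanyCarId).ToList();
    }

    // Cast throws InvalidCastException as soon as it meets an element
    // it can't convert (here: the Manager instance).
    public static bool CastThrows()
    {
        try
        {
            Employees.Cast<BoardMember>().Select(b => b.CompanyCarId).ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

In memory, only OfType behaves as a filter; a database query has no sensible way to fail halfway through a set, which is why both end up as a type filter there.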
The 'as' keyword
I had already implemented support for 'is', which also resulted in a type filter (if you now have the feeling there are a couple of redundant ways to specify the same thing in Linq, you're correct), but 'as' is a little different. It is again more or less a keyword which isn't really usable in database queries. Let's look at a query which illustrates this:
var q = from e in metaData.Employee
where (e as BoardMemberEntity).CompanyCarId==1
select e;
This in itself looks pretty straightforward, but look closely at what it represents: what if 'e' is a ClerkEntity (not a supertype of BoardMemberEntity)? The 'as' keyword should then result in null/Nothing, so accessing the property CompanyCarId on it should result in an exception, at least in .NET code it does. But how are you going to convert this into SQL? One could argue, and I agree with that, that the query above is pretty poor code and constructions like that should be discouraged. Though the thing is that just because it's possible, it will cause some people to write code just like that.
Fortunately, the code above is translatable to SQL in general form, if the 'as' keyword is seen as a type conversion, so 'e' is simply seen as a BoardMemberEntity and the fact that a 'null' can occur is ignored, as the filter on CompanyCarId will weed out any types without that field in the query anyway (no matter what inheritance mapping strategy is used). The only difference is that if you expect an error, you won't get one. Let's look at a nastier version of the one above, which actually is more correct:
var q = from e in metaData.Employee
let x = e as BoardMemberEntity
where (x!=null) && (x.CompanyCarId == 1)
select x;
Here, across two statements, a type filter is written: first the 'as' conversion in the let statement, and then the test for null on the result. You can generally write this as a handling of type specification equal/not equal null and by adding a special handler for that in the general binary expression handler, you can convert that situation to a type filter!
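As an illustration of why 'as' followed by a null test boils down to a type filter, the sketch below runs the 'let' query in memory and compares it with an explicit OfType filter. The classes are hypothetical stand-ins for the entities, and this is plain Linq to Objects, not the provider's handler code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity classes used in the article.
public class Employee { public int Id; }
public class Clerk : Employee { }
public class Manager : Employee { }
public class BoardMember : Manager { public int CompanyCarId; }

public static class AsNullCheck
{
    // The article's 'let' query, evaluated in memory.
    public static List<int> ViaAsAndNullCheck(IEnumerable<Employee> employees)
    {
        var q = from e in employees
                let x = e as BoardMember
                where (x != null) && (x.CompanyCarId == 1)
                select x;
        return q.Select(b => b.Id).ToList();
    }

    // The same result, written as an explicit type filter plus a field filter.
    public static List<int> ViaTypeFilter(IEnumerable<Employee> employees)
    {
        return employees.OfType<BoardMember>()
                        .Where(b => b.CompanyCarId == 1)
                        .Select(b => b.Id)
                        .ToList();
    }
}
```

Both methods return the same ids for any input, which is the equivalence the special handler in the binary expression handler relies on.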
There's something not really great about the query above though. Linq to Sql has no problems, because it only supports single-table inheritance. If the O/R mapper supports multi-table/view inheritance, like LLBLGen Pro does, you have a bit of a problem with field names. Consider a hierarchy where the root has three branches of subtypes, and in two branches a field Foo is present, though not in the third branch. If you now fetch all entities of the root type, you will end up with multiple times the same field in the projection. The fix for that is field aliasing, namely you give every field an artificial alias, e.g. Fx_n, where x is some index to specify the entity and n is the index of the field in the entity.
The problem begins when 'let' enters the room: a 'let' is effectively a wrapping SELECT statement. (You can rewrite some queries by moving the query to other parts of the query, but that's not always possible.) The query wraps the original query in its FROM clause as a Derived Table and simply selects from that source, making the projection required for let. Now, imagine you have your fields aliased with artificial aliases inside that source. Your Derived Table fields now have names like F0_1, F3_4 etc., but not 'CompanyCarId'. This means that the filter on CompanyCarId, which is outside the Derived Table (as let wrapped what's before 'let' in the query), will fail if it targets 'CompanyCarId'. Normally it would work, as the JOIN statements which joined every subtype's Table/View would be accessible to the WHERE clause, but in this case that's not possible because the joins are inside the Derived Table.
As elements are processed in different areas of the Linq provider (you're not going to write a big routine which handles every situation), the handler for Where has no clue if the field should be re-aliased, at least not at that point. So I wrote a traverser, a class which crawls over all LLBLGen Pro query api objects and finds the elements sought. Basically it's a base class which simply visits all object elements, and if you override a given method in a derived class you can tap into this process and do what you need to do, e.g. collect Derived Table definitions, fields targeting derived tables etc. I wrote a couple of derived classes which collected information for me this way and with that I could easily fix every field alias and target in the entire query, so CompanyCarId references in filters outside the Derived Table would then be fixed to target the Fx_n field which represented CompanyCarId. The crawlers are also handy to find places where inheritance relations have to be injected into the query, for example because the query is folded into a subquery inside a Derived Table. The classes will be public in the upcoming runtime library of LLBLGen Pro v2.6 so our customers can use them as well for their own fine-grained query voodoo magic.
Except and Intersect
Except and Intersect are two methods which are actually each other's complement: you can implement both with a single construct and just a flag to add 'NOT' to the query fragment. Except and Intersect can be implemented as an EXISTS query construct, similar to the work done for Contains: a lot of code can be re-used for these two methods.
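The 'single construct plus a flag' idea can be sketched in Linq to Objects terms as follows. This is only an in-memory analogy of an EXISTS / NOT EXISTS filter, with hypothetical method names; it is not the provider's code, and the example inputs contain no duplicates, so the set semantics of the built-in operators don't come into play.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExistsStyleSetOperators
{
    // One routine for both operators: 'negate' plays the role of the flag
    // which puts NOT in front of the EXISTS fragment.
    public static IEnumerable<T> Filter<T>(IEnumerable<T> source, IEnumerable<T> other, bool negate)
    {
        var otherList = other.ToList();
        var comparer = EqualityComparer<T>.Default;
        // exists == true  -> the element has a match in 'other' (EXISTS)
        // negate == true  -> keep the elements without a match (NOT EXISTS)
        return source.Where(s => otherList.Any(o => comparer.Equals(s, o)) != negate);
    }

    public static IEnumerable<T> IntersectLike<T>(IEnumerable<T> source, IEnumerable<T> other)
    {
        return Filter(source, other, false);
    }

    public static IEnumerable<T> ExceptLike<T>(IEnumerable<T> source, IEnumerable<T> other)
    {
        return Filter(source, other, true);
    }
}
```

For example, with source { 1, 2, 3, 4 } and other { 3, 4, 5 }, IntersectLike yields 3 and 4, and ExceptLike yields 1 and 2.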
Except and Intersect can have an IEnumerable as argument. Although Linq to Sql doesn't support it (I don't know why), it's perfectly doable, and in Linq to LLBLGen Pro you can pass an in-memory set of elements to Except or Intersect. What's particularly weird with Linq to Sql not supporting it is that the code to support in-memory sets as argument for these two methods is also usable for Contains. Oh well...
There's a pitfall to be aware of with these two methods, however. Consider this query:
// won't work
var q = from c in metaData.Customers
where someCollection.Except(c.Orders).First().EmployeeId==3
select c;
This query might look a bit far-fetched, but the general idea is this: it's not possible to use an in-memory collection in a database-targeting query where Except is called on that in-memory collection with a database set as argument. Do you see why? It took me 3 days to realize this, so don't worry if you don't see it right away. The thing is: Except filters the set it is called on. But that set isn't in the database, so you have to pass it to the database in the query. With PK fields, that might be possible, but with complete entities it's undoable. With Intersect it could be done, though I haven't implemented that for now, as it's easy to work around (swap the operands of Intersect).
Single
Now here's some method I have no idea why it is in the API: Queryable.Single. Let me first quote the specification of the method:
Returns the only element of a sequence, and throws an exception if there is not exactly one element in the sequence.
There's an overload which accepts a predicate and which simply means that Single(predicate) will return the only element in the sequence which matches this predicate. What I find odd is the remark about throwing an exception: why would anyone call such a method? There is a rule about exceptions: "Don't ever use exceptions for control flow in your application". Exceptions aren't expressions for if/else constructs, they're serious business: they mean something was definitely wrong and needs handling to avoid a total crash. In database scenarios it's even weirder: which exception should you expect? And more importantly: what does it mean? If you don't know what the true meaning of an exception is, you'll never be able to handle it.
The example in the MSDN to illustrate Queryable.Single() is pretty bad actually, because it uses exceptions as control flow. Not only that, the example fails to illustrate a valid case where the exception would be something you would want. This is important, because... I can't think of a use case for Queryable.Single() where you would want the exception. After all, exceptions aren't meant for control flow, so I don't expect an exception to drive my code as if an if-statement resolved to false: it's not meant to be a test if a set contains more than one element.
The thing with Single is that it's redundant, at least for database queries. You can also use First(). First() also returns a single element, but it doesn't throw an exception when there is more than one element in the sequence. With database queries, using Single() has the same behavior as First(). Sure, I could add code which flags the resulting QueryExpression object so that the result is checked for having more than one element and, if so, an exception is thrown, as Linq to Sql does with this query:
// Linq to Sql code
var q = (from c in nw.Customers select c).Single();
However, do I get the same exception, saying that there is more than one element in the sequence, with this one:
// Linq to Sql code
var q = from c in nw.Customers
where c.Orders.Single().EmployeeId==3
select c;
No. With this query I get a hardcore severity level 16 error from SqlServer that the query is wrong because the subquery returns more than one element, which can't be used with the operator (=) specified. This isn't the fault of Linq to Sql, what else can it do? Call RAISERROR (someone at Sybase still feels ashamed about that typo, I bet) with a Count check? Why? Would that help the caller of the query? I'm not convinced it necessarily will help. The 'Single' method is simply not useful in database queries: for the Single() overload, use First(), for the Single(predicate) overload use Where(predicate).First(). Though, as the requirement of the method is that it should throw an exception (which one isn't specified), these usable synonym statements don't completely cover what Single() represents. In my opinion, though, the exception requirement is a big mistake: if you need behavior to be called when a set has more than one element, you should test on that specification and call the behavior if the test succeeds, not issue a query which in the end fails with whatever error takes place, and count on that to handle things further.
I'll add support for it, though under protest. The reason I'll add support for it is to be compatible with queries which target other O/R mapper frameworks.
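For reference, this is how the two methods differ in Linq to Objects, together with a sketch of the 'test on the specification' alternative. HasExactlyOne is a hypothetical helper name, not part of any API; it fetches at most two elements and lets normal code decide, instead of relying on an exception.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleVersusFirst
{
    // Single() throws InvalidOperationException when the sequence doesn't
    // contain exactly one element; First() only requires at least one.
    public static bool SingleThrows(IEnumerable<int> source)
    {
        try
        {
            source.Single();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // The explicit test: take at most two elements and count them,
    // so the decision is made with an ordinary if, not a catch block.
    public static bool HasExactlyOne(IEnumerable<int> source)
    {
        return source.Take(2).Count() == 1;
    }
}
```

A caller can then write if(SingleVersusFirst.HasExactlyOne(set)) { ... } else { ... } and use First() to obtain the element.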
Linq to Sql issue(s)
During development of the Linq to LLBLGen Pro provider I often check what Linq to Sql does with a given query I use for testing, to see if for example my SQL is more efficient or falls flat on its face. Sometimes you run into unexpected things. When I tried an Except or Intersect query, I saw that 'DISTINCT' was emitted into the query. Everyone who knows SQL knows that DISTINCT is a keyword you have to be careful with: not all databases support every type of field with DISTINCT. In SqlServer for example, (n)text and image fields aren't supported in a DISTINCT projection. Not sure whether this was a small glitch or whether there was logic which would prevent DISTINCT in queries where it's not allowed, I tried:
// Linq to Sql, gives crash with NotSupportedException.
var q = (from e in nw.Employees select e).Distinct();
It too emitted DISTINCT into the query, which was caught by its validation checker. However what's worse: Except and Intersect therefore also aren't usable with Linq to Sql and any image/(n)text containing entity type: DISTINCT is always emitted into the query. This query for example gives the same exception, though no DISTINCT was specified in the Linq query:
var q = from e in nw.Employees.Except(
from e in nw.Employees where e.Country == "USA" select e)
select e;
One could argue: "But if DISTINCT isn't possible, how to weed out the duplicates?". Well, you do that on the client in the routine which consumes the datareader and constructs the objects to return. You keep hashtables with hashes calculated from identifying attributes like PK fields and with that you filter out duplicates. At least with entity fetches like this one. Not supporting situations where DISTINCT can't be emitted into a SQL query is a typical error one could make in a v1 O/R mapper, it's only a bit sad for Linq to Sql users that Microsoft is so generous with releasing fixes for their framework on a regular basis.
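A minimal sketch of that client-side duplicate filter, assuming a single integer PK. EmployeeRow and Materialize are hypothetical names, and the input sequence stands in for the rows read from the datareader; a real fetch routine would hash all identifying fields of the entity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; EmployeeId plays the role of the PK field.
public class EmployeeRow
{
    public int EmployeeId;
    public string Name;
}

public static class ClientSideDistinct
{
    // Consumes the rows in order and keeps only the first row seen per PK value.
    public static List<EmployeeRow> Materialize(IEnumerable<EmployeeRow> rows)
    {
        var seenKeys = new HashSet<int>();
        var result = new List<EmployeeRow>();
        foreach (var row in rows)
        {
            // HashSet<T>.Add returns false when the key was already present.
            if (seenKeys.Add(row.EmployeeId))
            {
                result.Add(row);
            }
        }
        return result;
    }
}
```

This way no DISTINCT is needed in the SQL, so (n)text and image fields in the projection aren't a problem.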
Writing a Linq provider is a lot about true software engineering, I've written about that several times before: some things are known but the biggest part is unknown territory: what is constructed in which order, when to expect what, is it safe to ignore this or that? Unclear things one can only find out when it is used, i.e. by trial/error approaches. This is actually a bit sad, because it's now easy to overlook mistakes or miss corner cases, as the bigger picture isn't always clear. For example: what does VB.NET emit into the expression tree for string concatenations if C# emits 'Add' operations ?
With the DISTINCT keyword popping up in the Linq to Sql query for Except I was immediately alarmed: why is it there? Linq to Sql never emits DISTINCT when it's not told to do so. You then start thinking about it: is it something which is a left-over from their tree manipulations? Or is it hardcoded set to be emitted? It turns out it is. In QueryConverter.VisitExcept and QueryConverter.VisitIntersect it sets 'IsDistinct' to true. I couldn't think of a reason to have it default to DISTINCT, so I didn't add a requirement for that to our tree for Except and Intersect, also because it uses an EXISTS query, which doesn't care if DISTINCT is there or not.
Both routines are also a clone of each other. Clones are easily created and often overlooked; however, this one is particularly obvious, especially because the behavior of the two routines is closely related, so the implementation of the handlers is then closely related too.
What's next?
Implementing 'Single', probably tomorrow, then on to implementing the database function framework I have in mind: it has to be a framework where developers can add their own function mappings to the provider so they can map their own extension methods to database functions easily. When that's done, Queryable.Convert gets another look, as some scenarios require it not to be stripped off but handled instead, though that relies on the function-mapping framework, so it has to wait till then. After that, hierarchical projections and prefetch path support are on the menu (prefetch paths are more of a small addition, as the core functionality has been in the runtime for quite some time; hierarchical projections require the prefetch path merge code already in the framework to be opened up to the Linq provider) and then I'm done with the Linq part. Finally. Stay tuned.
First of all: welcome. Now, as you all might know, this blog site, http://weblogs.asp.net, has a grouped RSS feed (a couple actually), which is called the 'main feed'. If you place your post in a category which is in the default list of this site, your post will automatically end up on the main feed. This is a nice feature, but as it is used now it kills the site.
At the moment, the feed is flooded with completely useless posts: posts which link to articles from 2 years ago, copied texts from manuals, very poor code copied from other websites etc. etc. More and more people are giving up on this main feed because of this. While what you post on your blog is your business, you have to realize that by placing the post on the main feed, you also affect others on this site. For example, I don't want people to unsubscribe from the main feed and I bet a lot of others here think the same.
So, dear new blogger at weblogs.asp.net, do yourself and us a big favor: write new content. We all know about a couple of search engines, and we all know how to look things up in MSDN. We don't need random posts without any common thread to point us to these articles. Instead, what we don't know (yet) is what your opinion is about topics related to .NET programming, what your experiences are with .NET programming, what your ideas are about how things could get better. Those articles are interesting to read and will attract new readers and make existing subscribers come back.
History teaches that this flood of useless posts is temporary, but what's sad is that often people who unsubscribe from a feed don't come back.
(This is part of an on-going series of articles, started here)
In the previous post in this series, I mentioned that I had completed the work on all the major parts of a SELECT query. SELECT is what a Linq provider is all about (as Linq queries are focussed on fetching data, not manipulating it). It was sort of a milestone for me, it gave the feeling that most of the work was done. What could possibly be the work in the rest of the Queryable extension methods, compared to nasties like GroupJoin and GroupBy / Aggregates?
All / Any
So I started with the top of the list of extension methods of the Queryable class and started implementing the ones which weren't implemented / supported yet in the code. The first one, Aggregate, could be skipped as it's unclear to me till today how on earth to even use it. I couldn't produce a single compilable piece of code which could mimic for example an aggregate function like Sum, and neither could the people I asked, so I skipped it.
Next was All. All is the opposite of Any, implementation-wise: All is equal to NOT EXISTS(query on source with negated predicate operand). Any is equal to EXISTS(query on source with predicate operand). So you can write one routine to handle them both. As with all EXISTS queries, you have to make sure you fold in the correlation relation which ties the two queries (outer query and EXISTS query) together as a predicate in the subquery. LLBLGen Pro doesn't support boolean values in projections, as databases in general don't support booleans in projections, so All() and Any() without parameters as the final methods in the query aren't supported. Using All() and Any() inside a Where clause however is of course supported.
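The All / Any relation is easy to check in memory. The sketch below is only the Linq to Objects analogue of the two EXISTS forms, with hypothetical method names; it doesn't show the correlation predicate a real subquery needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllViaAny
{
    // Any(p)  ~  EXISTS(source WHERE p)
    public static bool AnyLike<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        return source.Any(predicate);
    }

    // All(p)  ~  NOT EXISTS(source WHERE NOT p)
    public static bool AllLike<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        return !source.Any(x => !predicate(x));
    }
}
```

Note that for an empty source AllLike returns true, just like Enumerable.All, because nothing exists which violates the predicate.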
Cast
We already covered Average when aggregates were implemented so let's move on to Cast. Cast is a silly method more or less: it's there to make the query compilable. It's not really necessary in the query. Let's look at an example.
// LLBLGen Pro Linq example
var q = from e in metaData.Employee.Where(e => e is BoardMemberEntity).Cast<BoardMemberEntity>()
select e.CompanyCarId;
Employee, Manager and BoardMember are in an inheritance hierarchy, where each type is mapped onto its own table or view (the inheritance type Linq to Sql doesn't support). I know this particular set of types isn't the world's best example for explaining inheritance, but still it offers possibilities to test things out. BoardMember for example is the only entity which has a relation with CompanyCar. The query above fetches all CompanyCarId values from all BoardMember employees. The Where clause adds a type filter for the BoardMemberEntity type, so 'e' will only be a BoardMemberEntity. LLBLGen Pro has supported inheritance since v1.0.2005.1 and has an easy facility to add a type filter, so adding support for this wasn't hard. It comes down to implementing support for TypeBinaryExpression nodes in the tree; to make sure Cast worked, these TypeBinaryExpressions were required, so I added support for them first.
In the query above, the Cast call is actually not required, as the type filter on BoardMemberEntity already makes sure that the fetch will only return BoardMemberEntity instances (or instances of subtypes of BoardMemberEntity, if any). However, if you remove it, the code won't compile as Employee doesn't have a field 'CompanyCarId', as that's a field only present in its subtype, BoardMember. The Cast is therefore solely for the compiler and can be skipped completely. That is, if you've solved the polymorphic entity fetch problem elsewhere of course.
Contains
LLBLGen Pro doesn't support UNION queries, so we can skip Concat. The main problem is that LLBLGen Pro doesn't work with Query objects, but with methods which perform work for you, based on the parameters passed in. Not having Union isn't a big problem, one can always work around this without much effort. The method which is way more interesting is Contains. For everyone writing a Linq provider: Contains will tie your hands to the keyboard for at least a week or so, so be sure to factor this in.
At first it looks pretty easy, Queryable.Contains(operand) can't be that bad. However, in the expression tree, you'll see a call to 'Contains', and then have to check which 'Contains' it is: Queryable.Contains, IList.Contains, String.Contains, EntityCollection.Contains etc. Every one of them results in a different query or query fragment. As if that's not enough, 'operand' can be a query too, but also an entity already in scope, a constant, a field etc. Yes, Contains isn't something you'll implement on a Sunday afternoon.
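The 'which Contains is it' check can be illustrated with the public expression tree API: the MethodCallExpression carries a MethodInfo, and its DeclaringType tells the overloads apart. ContainsClassifier is a hypothetical helper written for this illustration, not the provider's actual dispatch code, and it only distinguishes a few of the cases mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class ContainsClassifier
{
    // Inspects the body of a lambda and reports which Contains method it calls.
    public static string Classify(LambdaExpression lambda)
    {
        var call = lambda.Body as MethodCallExpression;
        if (call == null || call.Method.Name != "Contains")
        {
            return "not a Contains call";
        }
        Type declaringType = call.Method.DeclaringType;
        if (declaringType == typeof(Queryable)) return "Queryable.Contains";
        if (declaringType == typeof(Enumerable)) return "Enumerable.Contains";
        if (declaringType == typeof(string)) return "String.Contains";
        if (declaringType.IsGenericType &&
            declaringType.GetGenericTypeDefinition() == typeof(List<>))
        {
            return "List<T>.Contains";
        }
        return "other Contains: " + declaringType.Name;
    }

    // Builds four lambdas which all read as 'x.Contains(y)' in source code.
    public static string[] Demo()
    {
        var list = new List<string> { "USA", "UK" };
        IEnumerable<string> sequence = list;
        IQueryable<string> query = list.AsQueryable();

        Expression<Func<string, bool>> onList = c => list.Contains(c);
        Expression<Func<string, bool>> onSequence = c => sequence.Contains(c);
        Expression<Func<string, bool>> onQuery = c => query.Contains(c);
        Expression<Func<string, bool>> onString = c => c.Contains("S");

        return new[]
        {
            Classify(onList), Classify(onSequence), Classify(onQuery), Classify(onString)
        };
    }
}
```

Each of the four cases needs its own translation, and that's before looking at what kind of expression the operand is.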
Let me first show you two queries which illustrate another problem I ran into with Contains:
// Query 1
var q = from c in metaData.Customer
where new List<string>() { "USA", "UK" }.Contains(c.Country)
select c;
// Query 2
var q = from c in metaData.Customer
where someInMemoryList.Where(v=>v.StartsWith("A")).Contains(c.Country)
select c;
What's the problem? Well, to process the expression tree into a SQL query, we have to execute parts of it as if it was C# code! The first query initializes a new List instance with two strings. This statement will appear as a ListInit in your expression tree, together with some friends to make things complicated. The second query shows a combination of Linq to Objects with a Linq query which will be executed on the database. The Linq to Objects snippet, namely someInMemoryList.Where(v=>v.StartsWith("A")), is a query on its own; it's not meant to be run on the database, as it operates on an in-memory IEnumerable object. My personal opinion is that this kind of Linq voodoo should be avoided, because it's complicated to understand which parts of a Linq query will be run on the database and which parts won't. But the world isn't perfect, so there will be people who have to or want to create these kinds of queries and thus support for this has to be added.
Funky Tizers
As this can become rather complex to handle, how to cut out the expression elements which have to be run locally and evaluate them locally as well? Well, it's actually quite simple: you have to traverse the tree, and every expression which doesn't refer to another element of the query (for example a parameter bound to a database set) is a candidate. After these expressions are identified, you compile them using Expression.Compile with a trick: you make ()=>expression lambdas out of them, compile these and run them by invoking them. The result of the invoke is the result of the expression. Microsoft has dubbed this Funcletization. In .NET 3.5 you can find an internal class which does the job for you more or less, however it's internal (Duh!, you thought MS would make things easy for you?), and also it's best to check what it does and rewrite it yourself so it matches your needs: the last thing you need is that it finds false positives. You therefore have to tailor the process to the provider you're writing. Doing a Google search on 'Funcletize' will give you some info about it. It's a key component of any Linq provider targeting a database, as otherwise the provider runs the risk of crashing on simple in-memory evaluations inside the Linq query, or of trying to convert Linq to Objects elements to SQL.
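The compile-and-invoke trick can be sketched with the public expression tree API. Funcletizer, DependsOn and EvaluateLocally are hypothetical names for this illustration; the sketch assumes the public ExpressionVisitor class of later .NET versions (in .NET 3.5 you'd have to write the visitor yourself), and a real funcletizer needs far more careful candidate detection than this single parameter check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class Funcletizer
{
    // Visitor which records whether a given parameter is referenced.
    private sealed class ParameterFinder : ExpressionVisitor
    {
        private readonly ParameterExpression _parameter;
        public bool Found { get; private set; }

        public ParameterFinder(ParameterExpression parameter)
        {
            _parameter = parameter;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            if (node == _parameter) Found = true;
            return base.VisitParameter(node);
        }
    }

    // True if 'expression' refers to 'parameter' anywhere in its subtree.
    public static bool DependsOn(Expression expression, ParameterExpression parameter)
    {
        var finder = new ParameterFinder(parameter);
        finder.Visit(expression);
        return finder.Found;
    }

    // The trick: wrap the expression in a () => expression lambda,
    // compile it and invoke it. The invoke result is the expression's value.
    public static object EvaluateLocally(Expression expression)
    {
        Expression boxed = Expression.Convert(expression, typeof(object));
        Expression<Func<object>> lambda = Expression.Lambda<Func<object>>(boxed);
        return lambda.Compile()();
    }

    // Cuts the in-memory part out of a predicate like the one in Query 2.
    public static List<string> Demo()
    {
        var someInMemoryList = new List<string> { "Aruba", "UK", "Austria" };
        Expression<Func<string, bool>> predicate =
            country => someInMemoryList.Where(v => v.StartsWith("A")).Contains(country);

        // Enumerable.Contains is a static call: Arguments[0] is its source.
        var containsCall = (MethodCallExpression)predicate.Body;
        Expression source = containsCall.Arguments[0];
        if (DependsOn(source, predicate.Parameters[0]))
        {
            throw new InvalidOperationException("Source depends on the query parameter.");
        }
        return ((IEnumerable<string>)EvaluateLocally(source)).ToList();
    }
}
```

In this sketch the Where(...) part is evaluated in memory to the list { "Aruba", "Austria" }, which a provider could then treat as a constant set when translating the surrounding Contains call.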
So after I had figured out how to convert all expression nodes which could be evaluated locally into expression objects which were processed internally outside the query, I could go back to Contains. Implementing Contains is rather complex because you have to deal with the situation where the source (the element Contains works on) can be a query, but also the operand (the element which is checked to see if it's in the source). Contains is also a vital part of the Linq query system, so it's key that you implement all possible uses. I'll list a couple below so you can understand the complexity of Contains a bit.
// Query 1, simple entity check in entity list
var q = from c in metaData.Customer
where c.Orders.Where(o=>o.EmployeeId==3).Contains(order)
select c;
// Query 2, operand is entity which is result of query
var q = from c in metaData.Customer
where c.Orders.Contains(
(from o in metaData.Order where o.EmployeeId == 2 select o).First())
select c;
// Query 3, operand and source are both queries.
var q = from c in metaData.Customer
where c.Orders.Where(o => o.EmployeeId == 2).Contains(
(from o in metaData.Order where o.EmployeeId == 2 select o).First())
select c;
// Query 4, constant compare with value from query. Yes this is different.
var q = from c in metaData.Customer
where c.Orders.Where(o => o.EmployeeId > 3).Select(o => o.ShipVia).Contains(2)
select c;
// Query 5, check if a constant tuple is in the result of a query
var q = from c in metaData.Customer
where c.Orders.Select(oc => new { EID = oc.EmployeeId, CID = oc.CustomerId }).Contains(
new { EID = (int?)1, CID = "CHOPS" })
select c;
// Query 6, idem as 5 but now compare with a tuple created with a query
var q = from c in metaData.Customer
where c.Orders.Select(oc => new { EID = oc.EmployeeId, CID = oc.CustomerId }).Contains(
(from o in metaData.Order where o.CustomerId == "CHOPS"
select new { EID = o.EmployeeId, CID = o.CustomerId }).First())
select c;
// Query 7, checking if the value of a field in an entity is in a list of constants
List<string> countries = new List<string>() { "USA", "UK" };
var q = from c in metaData.Customer
where countries.Contains(c.Country)
select c;
// Query 8, as 7 but now with an IEnumerable
LinkedList<string> countries = new LinkedList<string>(new string[] { "USA", "UK"});
var q = from c in metaData.Customer
where countries.Contains(c.Country)
select c;
// Query 9, combination of 2 queries where the first is merged with the second and
// only the second is executed (this is one of the reasons why you have to write
// your own Funcletizer code).
var q1 = (from c in metaData.Customer
select c.Country).Distinct();
var q2 = from c in metaData.Customer
where q1.Contains(c.Country)
select c;
// Query 10, as 7 but now with an array obtained from another array.
string[][] countries = new string[1][] { new string[] { "USA", "UK" } };
var q = from c in metaData.Customer
where countries[0].Contains(c.Country)
select c;
// Query 11, the grand finale ;)
List<Pair<string, string>> countryCities = new List<Pair<string, string>>();
countryCities.Add(new Pair<string, string>("USA", "Portland"));
countryCities.Add(new Pair<string, string>("Brazil", "Sao Paulo"));
// now fetch all customers which have a tuple of country/city in the list of countryCities.
var q = from c in metaData.Customer
where countryCities.Contains((from c2 in metaData.Customer
where c2.CustomerId == c.CustomerId
select new Pair<string, string>()
{ Value1 = c2.Country, Value2 = c2.City }).First())
select c;
And then I skipped a couple as well, like string-based Contains calls. As String.Contains() required handling a Like predicate, I also implemented String.StartsWith and String.EndsWith in one go. The rest of the string methods will be covered by the DB Function mapping system we'll introduce, and which will allow developers to create their own extension methods and map them onto a DB function. String.Contains/String.StartsWith and String.EndsWith can for now only handle constant strings, similar to Linq to Sql, but perhaps if customers require it, support for operands which are queries will be added as well.
Now that this is done, we can look at the next method to implement, which will be ElementAt(n). Interestingly, Linq to Sql doesn't support ElementAt(n), while it just comes down to Skip(n).Take(1) (the index is zero-based), or, to use our own code, a TakePage call with a page size of 1. It's interesting because Microsoft did go to great lengths to make Linq to Sql able to handle whatever query you throw at it; that they don't support ElementAt(n) suggests that it actually is a problematic method, but at this point it's unclear to me why. But there's only one way to find out.
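As a quick in-memory check of that equivalence: Enumerable.ElementAt takes a zero-based index, so in Linq to Objects ElementAt(n) returns the same element as Skip(n).Take(1). ElementAtLike is a hypothetical helper name; how a page-based method maps onto this depends on how that method numbers its pages, which isn't shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementAtViaSkipTake
{
    // Skips 'index' elements and takes the next one.
    // Unlike ElementAt, an out-of-range index surfaces here as the
    // InvalidOperationException thrown by First() on an empty sequence.
    public static T ElementAtLike<T>(IEnumerable<T> source, int index)
    {
        return source.Skip(index).Take(1).First();
    }
}
```

For the sequence { "a", "b", "c" }, ElementAtLike(source, 1) and source.ElementAt(1) both return "b".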
DISCLAIMER: this is a bitter post. If you get offended by this post, I'm sorry, though I had to write this. If you want to leave a comment, please do so, but as it's my blog, I'll remove comments which I think are inappropriate.
The last month or so I've been on the 'alt.net' mailing list, which later became 'lists' as the original one, altnetconf, was renamed to cli_dev (for whatever reason) and some group started a new one, altdotnet. I can't say I had a good time. In fact it was more or less a pain on a lot of occasions. I don't mind a sharp debate, and we can't all agree on everything, but from the start I've never felt accepted as a person who has a valuable opinion. At least not by some of the more hardcore Agile/XP/TDD people.
As you probably have understood, I left both of the lists and won't come back. Earlier this week I decided to leave the cli_dev list, after a painful thread about the usefulness of comments in code. It's an old debate, so from the start everyone participating would know that it would likely last forever and the basic dumb arguments would be placed on the table. What made it so painful was that it never became a discussion between professionals. There wasn't any serious debate which could bring us all forward. Instead it ended with bickering about why some of the posted examples (one of them was mine) were bad, and more importantly: they were used to bash the code and with that the programmer. I've spent almost two decades in newsgroups and mailing lists now, so these kinds of debates aren't new to me, but as I'm not a teenager anymore (far from that), I started to wonder why on earth this was even happening.
The point of such a mailing list falls apart if debates aren't used to move things forward, to express ideas and learn from other people's arguments why they've chosen their point of view. A lot of the discussions weren't about debating arguments, but about hammering down opponents, as if there were achievements to gain from doing so. The cli_dev list particularly was a big pain because of the endless debates about what 'Alt.net' meant, why or why not there should be a manifesto, who had the right to call the shots on what would be included in such a manifesto etc. etc. I think the comment debate was the last push I needed to call it a day on that list.
The other list, altdotnet, was OK for a while, however it turned out that the group of people participating in the discussions started to look like the same group who did the debates on cli_dev. The end result was that the discussions became more and more painful. The reason was that you had to be very careful with your wording to avoid waking the TDD/Agile people who simply want to b*tch about anything that's not TDD/Agile.
This morning I made a mistake: I used the W word in a post. What's the W word? The W word is Waterfall. The W word makes Agile pundits go blind, at least for a couple of minutes. It will make them blind to any words you might have written after using the W word. What did I do? I used the W word in a short, silly, exaggerated-to-get-it-across example:
Every movement/pressure to go into a given direction in general causes a reaction among people who disagree: they will try to go into a different direction. Waterfall, while being great for systems which must not fail like your MRI scanner or wafersteppermachine or your satellite robot, it sucks when you have to deal with clients who change their minds 10 times a day and who sold their tiny brain at e-bay last year.
A silly example, but I made an error: I stated that Waterfall was great for a situation. I meant it as an example of how one movement can cause the creation of a counter-movement. This is sociology 101. All of a sudden, I was back in the seat meant for the Waterfall lovers and I found myself defending Waterfall, something I don't want to do, because I don't like the methodology that much (read: in general I don't see a lot of cases where it can be successful).
I don't know why some people keep on pushing the 'Bouma likes Waterfall' message, while it's not true. Perhaps it serves their agenda. This latest 'debate' this morning made me realize: "What on earth am I doing here? Why am I writing posts on a mailing list which has people who like to offend me?" and I decided to leave the altdotnet mailing list as well. It was a sad ending for me, as I had great expectations when I joined the lists a month ago.
I did learn something though. What surprised me to no end was the total lack of any reference to or debate about computer science research, papers etc., except perhaps pre/post conditions, but only in the form of spec#, not in the form of CS research. Almost all the debates focused on tools and their direct techniques, not the computer science behind them. In general, asking 'Why' wasn't answered with "research has shown that..." but with replies which pointed at techniques, tools and patterns, not the reasoning behind these tools, techniques and patterns. Answering Why by pointing to techniques, tools and patterns creates a cyclic debate: the Why question is asked to understand the reasoning why some tools/techniques/patterns/practices are used. Pointing back at tools/techniques/patterns/practices isn't going to make you any wiser, as you then only learn tricks: you can't put any argument on the table for why you use pattern X, technique Y and practice Z.
There's nothing wrong with applying tool T or pattern P to solve a problem you're facing. However if someone asks you why T or P are used, you should be able to answer that question with solid arguments, and not with "T was shiny, open source and l33t and P is used by everyone else in the group".
For example: if you say "You should use the principle of Separation of Concerns (SoC)", what exactly do you mean? It's not as obvious as it sounds. Please read the paper N Degrees of Separation: Multi-Dimensional Separation of Concerns by Tarr, Ossher, Harrison. It's an example of how some 'technique' isn't as generally usable as you might think: one has to think it through.
Those kinds of debates weren't on the alt.net mailing lists; they stalled on the level of 'thou shalt use SoC', but what it meant exactly wasn't really worked out, as it was apparently assumed that everyone would know what it means. Referring to the paper didn't help, no-one picked it up. This is just an example of what I noticed.
I do think that it's important. I haven't totally worked out what I would call it, but it looks like some years ago a separate software engineering world was created (not only on .net, it's much wider) which isn't connected with the computer science world, but instead is focused inwards, looking for answers in its own world instead of in the computer science world: ideas aren't driven by science and research, ideas are driven by what you can do with a tool, with a technique, a pattern. The consequence is that the result of working on that idea has its root in the tool, technique or pattern, not in fundamental research in computer science. This result causes other ideas, which cause other results etc. etc.
Is this bad? I don't know, but if these two worlds drift apart (and the more I think about it, the more I believe this is going on already), the consequences could be severe: the world in which research is taking place isn't feeding fundamentals to the world which applies techniques/tools/patterns. Instead, that world of techniques/tools/patterns is feeding itself with 'fundamentals', which is actually cyclic reasoning: the fundamentals aren't fundamentals, they were the results of applying fundamentals, which doesn't make them fundamental per se.
With this insight, I also understood all of a sudden why there is even an 'alt.net': it's simply a movement which wants to use a different set of techniques/patterns/tools/practices than Microsoft prescribes to its customers via its product catalog. Leaving the lists behind with that in the back of my head wasn't such a loss after all: it's the science which counts, not the technique/tool/pattern/practice. Always keep asking Why, and search for fundamental answers. Only then will you gain wisdom, instead of just knowledge.
I wish everyone a wonderful 2008!