August 2006 - Posts
I seem to have stumbled upon something here with my look under the hood of the .NET Framework and other tools. Many readers were surprised and fascinated by what you can actually do with quality assessment tools like Lattix LDM or with a simple concept like the DSM (Dependency Structure Matrix) and its easy-to-understand and scalable depiction of the dependencies within a software system.
However, my posting also raised inevitable and justified concerns which I quickly want to address:
-Did I set out to do Microsoft bashing? Definitely not. I like the .NET Framework and many things Microsoft does quite a lot. But that does not keep me from trying to look behind their marketing veil or from uttering concern or even criticism. It´s just that I use the .NET Framework a lot, so I wanted to take a peek inside. Whoever is proficient in Java should do the same for that platform. Also, since I advocate the publication of software quality measurements and will try to follow my own advice, I think it´s fair to help bring more clarity about tools in wide use like the .NET Framework. I´m very open to any explanation Microsoft might give as to why mscorlib.dll seems to have so many cyclic dependencies and why this might be inevitable or even a good thing. If such an explanation exists, let me hear it. Otherwise I have to assume my questioning was very relevant.
-Why did I choose VistaDB as a second example? Because it happened to be on my laptop. And because it seems to be a fairly complicated piece of software, being a full blown database server engine. That´s the only reason. I don´t own any shares of Vista Software :-) Please note, I also showed a DSM of Spring.Net - which also happens to be on my laptop. And to round off my little gallery here´s another tool that I happen to use: Vanatec´s O/R Mapper Open Access. What do you think of its architectural clarity?
Fig. 1: DSM of O/R Mapper Vanatec Open Access. (Click on picture to see a larger version.)
-Some readers assumed the high coupling of mscorlib.dll was a general feature of the .NET Framework. But as I already tried to say in my original posting, that´s not really the case. Some parts of the .NET Framework seem to be much better structured. Here´s the WinForms subsystem:
Fig. 2: The WinForms subsystem is an example of a pretty well structured part of the .NET Framework
It looks quite neat and the main culprit for cyclic dependencies - again - is the "general bucket" root namespace Windows.Forms.
-Some readers also speculated whether a general purpose library/framework like the .NET Framework somewhat necessarily has to be structured as it is. Well, there sure is some truth to this thought. A broad and basic framework like the BCL inevitably has a different structure than the next CRM system. But then... If cyclic dependencies are as fundamentally bad as software architecture has it, why should we exculpate Microsoft from them? Given they cannot be avoided completely in such a complicated piece of software as the .NET Framework, why are so many of them necessary? I´m concerned about the number of cycles, not the mere existence of any. And the more fundamental a piece of software is and the longer it needs to live, the more worried I am about such signs of bad architecture.
Anyway, I´m happy about each and every comment. And I´m looking forward to seeing more and more public quality reports on software of all sorts coming online. Maybe I should set up a dedicated site...?
Have you ever thought about the quality of your code? Well, I bet. Have you ever strived for a sound architecture of your software solutions? Well, I hope. Do you have a process in place to constantly monitor the quality of your software´s architecture and code? Well, you should. But not only you. Every single software team should. Planning for quality and constant quality monitoring should be fundamental activities in any software project.
Sounds self-evident? Well, it does, but still it seems to be very, very hard for many software teams to put such self-evident practices in place. Why? There seem to be several reasons, like a tradition of focusing mainly on functionality/correctness and performance, a lack of formal education, or a comparatively small number of tools.
So the quality situation in the software industry is still pretty dire despite all the discussions around pair programming, unit testing, patterns, refactoring, modelling approaches etc. But how can this be improved? Of course, more quality consciousness and better education help. Easy to use tools, though, would help even more, because they would cater to the constant lack of time of many programmers. They need to be so focused on churning out code that their time to look left and right and learn a new tool that does not help them code faster is very limited.
The xUnit family of tools is an example of such a quality improvement tool. The concept is easy to grasp, the tools integrate well into the IDE or at least into the development process, and quality checks are easy to set up. Great! But this only helps with code correctness.
What, for example, about the structural qualities of a software system? Refactoring tools like ReSharper target this, but they work only on the architectural micro level of classes; plus they are not concerned with analysis. You have to smell necessary refactorings yourself and then use the tool.
NDepend is another tool assessing software structure quality. But its output is hard to read, and it´s read-only.
Then there is Sotograph, which might be the most comprehensive single tool to analyse even large software systems. I like it quite a bit, as my report on a recent workshop shows. However, I think it´s not really easy to use and still too expensive for many programming shops.
Dependency Structure Matrix
The other day, though, I stumbled over yet another software quality assessment tool: Lattix LDM. And what impressed me immediately was its easy-to-understand output based on the intuitive concept of the Dependency Structure Matrix (DSM).
Instead of trying to visualize complex systems (of classes or components) as a usual dependency graph like this:
Fig. 1: Some Python classes
Lattix LDM shows them using a matrix, e.g.
Fig. 2: The Spring.Net Framework
Isn´t that a much clearer picture? More structured? Who wants to follow all those lines in a dependency graph once the number of elements surpasses maybe 20 or 30? It becomes a maze!
If, yes, if a software system has a decent structure, a dependency graph makes sense, because artifacts can be arranged in groups and layers and dependencies are mostly unidirectional. But what if the macro and micro structure of a software system is not decent - which is very likely for many software projects? Then a dependency graph becomes just sophisticated wallpaper. Dependency graphs simply don´t scale.
DSMs on the other hand scale well. Take for example this view into the architecture of the .NET Framework:
Fig. 3: A large scale overview of the .NET Framework Base Class Library
Within one minute of selecting the .NET Framework assemblies I was able to browse through the architecture of the base class library. Arrangement of assemblies into "layers" (or basic top-down dependencies) was done by Lattix LDM automatically - as far as possible.
That´s what impressed me about the Lattix tool: ease of use, quick results, a simple concept (DSM). Although it´s still Java based, has a comparatively slow GUI (due to Swing?) and does not have (at least on my Windows machine) the polished look I´d like for my tools, it was of immediate value to me.
A DSM view on software is merciless in its clarity. You don´t get mired in tedious layout work; rather the tool strips naked the software you throw at it. You see the bare bones of it, the dependency skeleton, on any level you like. The physical level of assemblies and types comes for free within a couple of seconds. Even a rough layering can be established automatically. Beyond that you can group artifacts manually to arrive at a more conceptual view.
Although I call a DSM view intuitive and clear, it needs a little explanation. So how does it work? There is some good introductory literature online, e.g. Browning´s overview paper (see the references at the end), but much more has been written about the various applications of DSM. Otherwise it seems the basics are explained only in expensive books like Ulrich/Eppinger´s (also listed below). Luckily, though, the makers of Lattix LDM provide quite a good introduction to (their use of) DSMs themselves. Start here and then check out their white papers.
Right now let me just sketch the basic patterns to look out for:
The matrix shows all analyzed artifacts as rows and columns. At the intersection of a column with a row the strength of the dependency is noted.
In the DSM of Fig. 4 you can see how TopLayer depends on MediumLayer, since column 1 (corresponding to row 1, TopLayer) has a 1 in the cell where it intersects with row 2 (MediumLayer).
So reading a DSM in column-row order (col, row) means "depends on" or "uses", e.g. (1,2)=1 depends on 2, TopLayer uses MediumLayer.
Using (row, col) instead means "is used by", e.g. (2,1)=2 is used by 1, MediumLayer is used by TopLayer.
If you give up strict layering and allow higher layers to depend on any lower layer, you still get a DSM where all dependencies are located below the main matrix diagonal. With this pattern in your mind look back at Fig. 2 above. I guess you immediately get an impression of the overall architectural quality of Spring.Net: It´s pretty neatly layered; most dependencies are located in the lower triangle of the DSM.
Fig. 4, so to speak, shows the architectural ideal: a strictly layered system. Each artifact (be it a class or an assembly or a conceptual group) depends only on the one artifact layered directly below it. This leads to a very sparsely populated DSM where all dependencies sit parallel to the main diagonal of the matrix. To get the same "feeling" by looking at a dependency graph would take you much longer, I´d say. Remember: I´m talking about a system you did not design yourself (or which has deviated a lot from your initial design). Also, in a larger dependency graph it´s hard to see in which direction the dependencies run. With a DSM, though, you just check whether a dependency is in the lower or the upper triangle.
Once you start to group artifacts into conceptual subsystems you probably want to know how far they are "visible", how far-reaching their dependencies are, and whether they cross the subsystem boundaries. Fig. 6 shows a simple system with two subsystems, and in the DSM you can see that MediumLayerValidation and BottomLayerPersistence are used only within their respective subsystems: any dependencies on them are located within their subsystem´s square. These artifacts are thus only of local relevance/visibility.
The bane of all architectures is cyclic dependencies among artifacts. They increase the complexity of a software system and make the build process difficult. That´s why one of the basic principles of software architecture has long been: avoid cyclic dependencies; instead strive for a directed acyclic graph (DAG) of dependencies.
Since a DAG is the goal, it´s important to be able to quickly spot any violations of this principle. In dependency graphs of even modest complexity this is difficult. But with a DSM it´s easy. Look at Fig. 7: cycles appear as dependencies above the diagonal. Reading in column-row manner we find: (5,6)=Utils depends on Security, but also Security depends on Utils (6,5).
Once you decide to arrange your artifacts in a certain order from top to bottom, thereby suggesting a layering, any dependencies not following this top-to-bottom direction show up in the upper triangle. And if they don´t go away after rearranging the artifacts - e.g. by moving Utils below Security - you are truly stuck with a cycle.
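To make these reading rules tangible, here´s a tiny illustrative sketch in C# (my own toy code, nothing to do with how Lattix LDM works internally): the artifacts are ordered top to bottom, the dependency strengths sit in a matrix, and every non-zero cell above the diagonal is a dependency pointing "upwards" against the intended layering - a cycle candidate.

using System;

class DsmDemo
{
    static void Main()
    {
        // dsm[col, row] > 0 means: artifact #col depends on artifact #row (column-row reading).
        string[] artifacts = { "TopLayer", "MediumLayer", "BottomLayer" };
        int[,] dsm = new int[3, 3];
        dsm[0, 1] = 1; // TopLayer uses MediumLayer    -> below the diagonal, fine
        dsm[1, 2] = 1; // MediumLayer uses BottomLayer -> below the diagonal, fine
        dsm[2, 0] = 1; // BottomLayer uses TopLayer    -> above the diagonal, cycle candidate!

        for (int col = 0; col < artifacts.Length; col++)
            for (int row = 0; row < artifacts.Length; row++)
                if (dsm[col, row] > 0 && row < col)
                    Console.WriteLine("{0} -> {1} runs against the layering",
                                      artifacts[col], artifacts[row]);
    }
}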
Again look at Fig. 2 and zoom in on a part of the DSM showing a seemingly heavy cyclic dependency:
Fig. 8: Cyclic dependencies seem to exist within a part of Spring.Net
Almost all dependencies within the subsystem Spring.Objects.Factory are located above the diagonal. No good.
But we can rearrange the "layers" within the subsystem and arrive at a much nicer picture:
Fig. 9: By rearrangement of some artifacts almost all cyclic dependencies within the part in question of Spring.Net could be resolved
Almost all cycles are gone. And the layering sounds ok: Xml is the top layer depending on all lower layers, * is the basic layer, the "catch-all" layer servicing the others.
However, some cycles are still present: even though Support heavily relies on Config, Config in turn also depends on Support. Whether this is acceptable or not would now require closer examination. I´m not familiar enough with Spring.Net to be able to assess whether such a cyclic dependency is tenable. But the DSM easily drew my attention to this area of potential problems. Try that with a dependency graph of a system you don´t know intimately.
After these preliminary explanations let´s have a look at the .NET Framework. What´s its architectural quality? Will Microsoft tell you? No. So why not have a look for yourself? You probably have looked at the innards of the BCL many times using Lutz Roeder´s Reflector. It´s an invaluable tool and belongs in your toolchest. But Reflector just shows you the (regenerated) "source code" of the BCL. It does not tell you anything about the overall quality of the architecture. With Lattix LDM, though, you can assess the structural quality of the .NET Framework as easily as you can assess its code quality using Reflector.
Look at Fig. 3 for example: It´s a very big picture of the .NET Framework. It shows the dependencies within and between the System.* DLLs and mscorlib.dll. The overall impression might be good. System.* DLLs seem to depend heavily on mscorlib.dll. Sounds reasonable, sounds like at least two layers. Great!
But then there are some glitches: Why does [mscorlib.dll]System.Security depend on [System.dll]System.Security? Or is it really necessary to use System.Configuration.dll from mscorlib.dll if mscorlib.dll is so basic? But that might be nitpicking. Much more questionable are the seemingly heavy cyclic dependencies within subsystems like mscorlib.dll and System.Web.dll:
Fig. 10: Cyclic dependencies abound in the most fundamental library of the .NET Framework: mscorlib.dll
Fig. 10 shows so many dependencies in the upper triangle that it´s impossible to rearrange the subsystems to improve the situation much. mscorlib.dll simply contains many cyclic dependencies. The main culprits in this regard seem to be the subsystems/namespaces System.Runtime, System.Reflection and System. At the same time they provide many services to others and use many services of others.
The large dependency numbers for System (see row/column 14) suggest that Microsoft used it as a kind of "catch-all" namespace, a general bucket for all sorts of stuff. But although it´s perfectly ok to have such a bucket in your system - many give it a general name like "Util" or "Helpers" - it´s strange to see this bucket depend on so much other stuff. Because what that means is: the bucket functions as a change propagator. Not only can changes to it affect much of the system, because it´s used by many other parts of the system; changes to other parts of the system can also affect it, because the bucket depends on them - and can thereby indirectly affect yet other parts again. That does not sound good, I´d say.
mscorlib.dll cannot really be thought of as a layered system. Is that good or bad? Of course as a whole mscorlib.dll is a layer itself in any .NET software because all your programs rely on its basic services. It´s at the bottom of all .NET software. So far so good. But then, if you zoom in, the layering is gone. One of the most basic architectural principles is violated not here and there but pretty pervasively throughout the whole thing. That smells - as in "code smells" - bad. That smells like programming in a hurry. That smells like many cooks working on the same broth.
Fig. 11: System.Web.dll after some rearrangement
The second cyclic-dependency-ridden subsystem of the BCL seemed to be System.Web.dll, according to Fig. 3. Upon closer examination, though, the situation is not as bad as it first appeared. Fig. 11 shows a close-up of System.Web.dll after some rearrangement of its subsystems. Quite some layering could be established - but again there are two "change propagators": System.Web.Configuration and System.Web. System.Web, the namespace root, is once again the general bucket for all sorts of stuff.
I´m sure, each one of the cyclic dependencies at some point made some sense to someone. But looking from the outside it´s sometimes hard to think of justifications for, say, dependencies of basic utility functionality (13) on the UI (6) or compilation functionality (4) on the UI as well as the other way around or cycles between compilation and configuration (14). Why has System.Web.dll not been structured in a cleaner way?
It makes me shudder a bit to see the inner structure of such fundamental building blocks of all our software in this state. With all of Microsoft´s preaching about solution architecture and its promotion of multi-layer architectures, it´s sad to see so many basic violations of these principles.
As a counter example look at another non-trivial system: the VistaDB database engine. It´s a full blown relational database engine/server written entirely in managed code and coming as a single 600 KB assembly. It consists of two large subsystems: the internal engine, whose code got obfuscated, and the public API. Fig. 12 shows the internal engine from 100,000 feet. It´s made up of some 400 classes, but the DSM shows their dependencies are almost all below the diagonal (only visible as "shadows" in the pictures). VistaDB thus is nicely layered internally.
Fig. 12: Large scale view of the internal engine of VistaDB
When zooming in on the public part of VistaDB, the impression of attention to architecture remains. Fig. 13 shows the API to be nicely layered as well. The few lonely cyclic dependencies are negligible.
Fig. 13: The public API of VistaDB
Publishing software quality reports
There´s much talk about how the security of software systems can benefit from public reviews. This might work for some systems and not for others. I don´t know. But regardless of which way is best to find security bugs, I think the most important aspect of this discussion is publicly talking about software quality. In many other industries product quality (and I don´t mean functional correctness) is a main talking point for the sales force. They talk about details of how the production process is geared towards maximum quality; they wave official certificates of quality in front of you; they point out how important high quality ingredients (subsystems) are for the product etc. For the building industry it´s even mandatory to register blueprints and quality calculations with the local authorities.
So the whole world seems to be concerned with quality and relentlessly shows it off.
But what about software companies? They claim to have great products. But this (mostly) means they claim to offer products with a certain set of features providing value to a certain target group. Those features are assumed to be correctly implemented (which is not always true, as we all know) and assumed to perform/scale well (which is not always true either, as we all know).
However, nobody in the software business really tries to show off how they planned, implemented, and monitored quality measures other than correctness or performance. That means, except for Open Source software, we´re all using black boxes whose quality in terms of maintainability, understandability, or architectural stability we don´t know. I might be impressed by some software: it works just fine, slick GUI, high performance. I´m willing to pay the price. But I don´t have a clue whether its manufacturer is even able to maintain it over a longer period of time. I don´t know if the manufacturer attributes any importance to a clean and maintainable architecture/structure. I just see the surface of the software (its GUI/API, its performance/scalability). But is that enough?
The more I think about it, the more I get the feeling we need to discuss the quality of our software products more openly. And I suggest - as a starting point - to proactively publish DSMs (and other quality metrics) along with our software. Quite some companies invest in blogs maintained by their development staff to open themselves to the public, to gain trust. Why not add to this policy a quality report for each release of the software somewhere on the company´s website?
Or to put it the other way around: why don´t we start to demand such quality reports from the manufacturers of our tools? Think of it: the next time you evaluate a database or report generator or GUI grid component, you ask the manufacturer for a DSM. Of course they won´t know what that is, but then you explain it to them. Tell them you´re interested in the quality measures they apply. Tell them you just want to get a feeling for the state of the architecture they base their software on. You can reassure them you´re not interested in any code details. It´s just the structure of the system - which gives you a first hint at how quality conscious the manufacturer might be. So I´m not talking about Open Source, I´m talking about Open Quality or Open Architecture.
I think this is what we need to start demanding if we really mean to improve product quality in our industry. A simple DSM for each system could be a start. It´s easy to understand and provides a first rough overview. But more metrics and figures could follow: How is the testing process organized? Do they use a version control system? Is there continuous integration in place? How about code reviews?
You say this is divulging too much information? Well, I say the industry is just not used to as much openness about quality measures as other industries are. And our industry is ashamed of its low quality standards. It´s simply scared to reflect (in public) upon how it ensures quality. Because publishing quality figures determined by tools like NDepend or Sotograph or Lattix LDM is not to be confused with leaking critical information on implementation details to the competition. It´s more like publishing a financial statement, as all large corporations need to do.
To show you that I mean what I say, here´s the DSM (Fig. 14) for my currently only "product", a small Microkernel. It´s not much, but it´s a start.
Fig. 14: The structure of my small Microkernel
But I will continue to make DSMs and other metrics public whenever I develop software, be it commercial products or code to accompany a conference presentation or a series of articles (since I´m an author/consultant/conference speaker). And I will motivate my customers to do the same for their products.
I can only gain by becoming open about the internal quality of my software. I will gain trust, because trust begins by opening yourself to others. I will gain product quality, because now I´m forced to think more about it, since eventually the results of my decisions get published. And my customers - be it buyers/users of my software or readers of my articles - will gain insight, which gives them a better feeling once they need to make any decisions.
Think about it. What will you tell your customers once they start looking at your software using a tool like Lattix LDM and are taken aback by what they see? With managed code anybody can strip your software naked as I did with the .NET Framework and VistaDB and Spring.Net. Obfuscation might help a bit here, but in the end, assessing the quality of your software´s architecture relies on names only to a limited extent. You can´t obfuscate dependencies, you can´t hide spaghetti structures.
But maybe before you publish anything about your own software you want to start a more quality conscious life by looking at other people´s software. Download NDepend or Lattix LDM and point them at the assemblies of your favorite grid or application framework. Because: once in a while nothing can beat the fun of pointing fingers at others :-)
[Continue here for some replies to comments from readers of this posting.]
 Tyson R. Browning: Process Integration Using the Design Structure Matrix, Systems Engineering, Vol. 5, No. 3, 2002
 Karl Ulrich, Steven Eppinger: Product Design and Development, McGraw-Hill/Irwin, 3rd edition (July 30, 2003), ISBN 0072471468
The week before last, Neno Loje and I did a workshop at the Computer Science department of the University of Hamburg, Germany, to verify a couple of our ideas on software development. We offered this workshop to the university for free, to give students of the unfortunately notoriously underfinanced public educational institutions a chance to "get in touch with the real world". Their usual curriculum does not cover .NET much, and their approach to software development is quite different from how real project teams work in the Microsoft universe. So we wanted to introduce them to .NET in general, but our focus was on working in a team according to how we think a software project should be approached. For that we had 5 days.
Some 15 students applied for participation in the workshop. They had all gone through a one week C# training the week before our workshop, so almost all of them were pretty new to the .NET Framework. But that was not bad for our purposes. Quite the contrary: we could assume they were not "spoiled" by current practices of professional software development. They just had to overcome their university-instilled Java bias ;-)
As it turned out not all students were capable of enduring two days of theory without access to a PC.
Some brought their own laptops, some developed headaches or insurmountable sleepiness due to email deprivation ;-)
The goal of our experiment was to see how easily we could teach an audience our view of a minimal, systematic, and pragmatic approach to software production from analysis to deployable code. Our approach consisted of:
- an overall process inspired by Feature-Driven Development (FDD).
- Software Cells for guiding the modelling process
- Contract-First Design for laying the foundation for easy testing and parallel implementation
- a Microkernel for easy testing and weaving together the whole program at runtime
- FinalBuilder for continuous integration
When we do consulting, we usually don´t have the opportunity to teach a team "new tricks" in such a comprehensive way. That´s why we gladly took the opportunity of the workshop.
Day 1: On day one we first selected a problem scenario to develop software for. The students liked the idea of a pizza delivery service quite a bit. So Neno became Tonio, the owner of such a service, who wants his small company to become more efficient and open new markets by offering an online ordering service in addition to taking orders by phone or in person. Within a couple of hours we defined several user roles and together came up with a list of some 20 essential features for the PizzaTornado solution.
With Tonio´s help we arranged those features into a release plan and then started to model the solution. Since the students were not familiar with a component oriented modelling approach, we first introduced our view of the overall production phases from analysis to deployment and the modelling stages from feature list to component specification. To round off the day we gave a short presentation on how to use a Microkernel to dynamically bind components.
Our sketch of the software production process. Note the arched arrows back to previous phases: you see, the process is iterative.
(If you can´t read what´s written on the blackboard, don´t worry. Neither can I anymore ;-) That´s why the students needed to take notes themselves.)
Day 2: On day two the whole group continued modelling the solution. After finding the essential 15+ components and combining them into separately running software cells...
It´s me explaining the component architecture of PizzaTornado.
Note the larger circles: those are the six software cells of the solution, each representing a host process running its own logic.
Three of the software cells work in C/S fashion, because they run in a LAN;
two others (a mobile client and the website) use an application server software cell to synchronize their local data with the central database.
...we started to model the contracts by defining the value streams for the features of the first release (see the right flipchart sheet in the below picture).
On the right flipchart sheet you see the value streams describing the cooperation between components to "produce" a feature.
Days 3, 4 and 5 saw the students happily implementing feature after feature of the PizzaTornado solution. We started out with release 1, but later managed to also tackle features from release 2. Pairs of students were assigned the responsibility for implementing a certain component. And several of these pairs were then combined to form feature teams to implement each feature.
Each component was set up in its own Visual Studio solution together with a test client project as well as mock-ups as necessary. Each component-owning team saw only the sources of its own component. Instead of collective code ownership we bet on clear responsibilities - and everybody liked it. There were no discussions about what this or that code was supposed to mean, because component owners focused on their own code and had to deal with other people´s code only through clearly defined contracts.
To our great delight this did not slow the project down when some students left the workshop. Their components were assigned to someone else, who was able to rely on a set of unit tests once he started to change code he did not write.
Also, Contract-First Design turned out not to be a bottleneck. Since we only defined the contracts necessary for the features in the current release, the upfront effort was fairly small - but the gain was large: we were able to start working on many components in parallel without interference. Discussions between component owners focused on understanding the contracts, not each other´s code. What a relief! This way we maintained the decomposition of the solution throughout development.
Although we modelled the solution using Software Cells, in the end it turned out all subsystems (consisting of one or more components) could be nicely arranged in layers.
As predicted, though, contracts were not stable after their initial design. But that did not cause trouble. When necessary we changed them and redistributed them to the component teams. Only once, when I applied some major changes too quickly without final consent from the owners of the components relying on the contract, did I earn some angry growling. Technically, though, changing contracts during component implementation is no problem.
Once each component-owning team was satisfied with its work, they checked it into our central repository (unfortunately running on VSS). Our automatic build process then periodically checked out all Visual Studio component solutions, compiled them, ran the tests, and on success put the binaries into a second repository for integration tests. First we started out with a plain, long MSBuild "batch" to do all this - but then we discovered FinalBuilder. It was a godsend and we can only recommend you download a trial immediately. I´d say it does for the build process what VB1 did for graphical user interface programming.
Neno, our master of the continuous integration process, in front of another successful FinalBuilder run. We all quickly got addicted to the green bars of FinalBuilder, which assured us that all components compiled correctly, were tested successfully, and were now available for everybody to use.
Integration of the components developed in parallel by independent owners was a snap. It even surprised me how smoothly the parts of the solution fit together. We just selected the necessary components for the different software cells, copied them to separate integration folders, and started the host EXE. And they ran without flaw from the first complete build on - although with functionality limited to the respective release. A great help was the small Microkernel we used: it did not require the setup of any mapping file but bound interfaces to implementations automatically upon startup. That way integration of a solution without static references between components was as easy as developing a program in one large Visual Studio solution - but without the need to divulge all code to all developers.
It was such a relief: we did not run into any version control conflicts, no merges were necessary, no confusion about who was responsible for what. Using the Microkernel we were able to carry a clean model over into a clean implementation. And it was so easy for the students that they even modelled the features for release 2 all on their own. Designing the GUIs for the different user roles and designing the database could be done in parallel; afterwards the internal component contracts were defined. Everybody felt guided by an easy-to-comprehend process and capable of using the intuitive graphical notation of Software Cells.
A sketch of part of the GUI of PizzaTornado. It shows the main user interface elements and all necessary operations - without going into detail about how they get triggered. This kind of simple GUI design is enough to start the Contract-First Design process along feature oriented value streams through the mesh of components.
To monitor and verify the quality of our architecture and implementation, we invited Software Tomography to try out their new version of Sotograph for C# on our solution. They sent over Jan Kühl who continually assessed PizzaTornado´s state of quality.
Jan Kühl from Software Tomography in front of the Sotograph management view of his final quality measurement for PizzaTornado. As the green shapes at the top show: not only our model but also the actual implementation is of the highest quality according to well known architectural metrics.
And as it turned out after five days of modelling and implementation, the code was of very high quality according to the host of measurements built into Sotograph. Jan attributed this to the systematic modelling approach culminating in clear contracts and the rigorous separation of the development of the components.
Of course the scenario was comparatively small, but nevertheless the solution is not trivial. After day 3 some 12 components were under development, each with several classes. So we think there was a chance to stray from the path of virtue. But as the analysis showed, the implementation did not suffer from any of the many usual anti-patterns like "code duplication", "cyclic references", or "interface violations".
This made us very, very happy. We take it as a promising sign for being on the right path with our quiver of concepts and technologies.
Despite this satisfying bottom line we also learned a lot, e.g.:
- A domain model should be modelled at the same time as the database. The domain model is the fundamental contract.
- For components on which many others are dependent mockups (or lightweight versions) should be made available by the respective owners pretty early in the development process.
- An explicit build/integration step is not only useful to check the quality of the code, but also as a way to reflect on the architecture. When assembling the build script you sure will discover any cyclic dependencies among the components. That´s your chance to resolve them.
- Don´t take anything for granted when booking a university´s computer lab! You should specify every detail of how the computers need to be set up - from the basic development tools down to the last TCP port. Or even better: get admin rights for all computers. You´ll need them.
Thanks to all who participated in the workshop! Thanks to the University of Hamburg for letting us do the workshop. Thanks to Microsoft Academia - Markus Kobe - for supporting the workshop.
I think true component oriented programming requires Contract-First-Design (CFD) and usage of a Microkernel to bind contracts to implementations at runtime. But while CFD is a matter of your resolution and will, using a Microkernel requires a piece of technology fitting into your existing toolchest. There are of course a couple of these technologies readily available - like Spring.Net -, but they are not lightweight enough to my taste. It´s hard to enter the world of dynamic binding using them. That´s why I devised a small Microkernel of my own as an easy starting point and platform to try out a couple of ideas.
If you like, you can download Ralf´s Microkernel here. It´s written in C# and comes with a couple of unit tests (NUnit style). Feel free to use the binary in your projects and play around with the source. Although it´s not much code, I hope you will be able to gain quite a bit from its essential Microkernel functions. Here is how it works...
I don´t want to explain CFD at length here, but to set the scene for the Microkernel some hints as to what it means are in order.
CFD views software as consisting of components. To avoid any heated discussion on what this term could possibly mean, let me try a very practical explanation: Think of a component as an assembly or a set of related assemblies. This assembly (or set of assemblies) you want to use together with other components to build a software solution.
Don´t think of "components are made for reuse" or "components require a market" or "components as units of independent versioning". That´s all good and well, but can get in your way when starting out with components. So just think: Components encapsulate some functionality I want to separate from other functionality in other components.
Now, when using components there are two basic kinds of them: client components (or consumers) and service components (or producers). A client component uses one or more service components to do its work.
When using VS2005 you´d set up two projects, one for the client component and one for the service component, and then the client component would reference the service component. That´s necessary to instantiate service component classes.
Although this is how most developers do it most of the time, it´s not how you should do it. There are at least two reasons why this is hampering your work:
- To code the client component you need all service components to already exist. Only existing service components can be referenced in VS2005 projects. That´s ok for the grid component you use - but what about the components you develop yourself? You´d need to develop them strictly bottom-up. And that´s bad for your productivity.
- When testing the client component it always needs to use the real service component. That might be bad for your testing speed, because the service component might need a long time for its operations. Testing would be easier if you could exchange the service component for some stand-in or mock-up. But usage of the service component is hard wired into the client, which instantiates classes from it.
So, what can you do to overcome these hurdles in component oriented software development?
CFD changes the above picture like this:
A client component no longer references its service components, but only so called contracts. Each contract describes the services of a service component in terms of interfaces and other data types, independent of any service implementation. The contracts of one or more components can be assembled in a contract assembly. However, I prefer a one-to-one mapping: each contract gets its own assembly and each assembly contains only one contract.
Contract-First-Design then means, you define all contracts before you implement them. You create contract assemblies before their service components which implement the interfaces described in their contracts.
Then - and this is crucial to CFD! - client components only reference contracts. Client components don´t reference service components anymore! During design time client components don´t know which service components will fulfill their requests during runtime. There is no more static binding between client and service components.
That´s why you can´t instantiate classes from a service component in a client component anymore. You simply don´t know the class implementing a service contract´s interface.
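To make this concrete, here´s a minimal sketch of the three pieces involved (the names IMyService, MyServiceClass and the assembly names are taken from the mapping example below; the member DoWork() is just made up for illustration):

// Contract assembly (e.g. ServiceContract.dll): only interfaces and data types, no implementation.
public interface IMyService
{
    string DoWork(string input);
}

// Service component (e.g. AllMyServices.dll): references the contract assembly and implements it.
public class MyServiceClass : IMyService
{
    public string DoWork(string input) { return input.ToUpper(); }
}

// Client component: references ONLY the contract assembly.
// It cannot write "new MyServiceClass()" - how it gets an instance is what the Microkernel is for.
public class MyClient
{
    public string Run(IMyService service) { return service.DoWork("hello"); }
}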
This might sound grim to you - and it is, considering the support for this style of programming in VS2005.
But don´t despair! Help´s on its way.
But first: How does CFD solve the above two problems?
- Since you no longer reference a service component in your client but just its contract assembly, you are free to develop components in any order you like or in parallel. Only the contract assemblies need to be present - that´s why it´s called Contract First (!) Design.
- Since service components are no longer referenced, there are no dependencies on their types in a client component´s code anymore. As long as some component is available during testing which adheres to the contract required by the client, the client can be tested (see the little sketch below). There just needs to be a way to map the contract to a concrete implementation at runtime. Whether this implementation is just a mock-up or "the real thing" is of no concern to the client component´s code.
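For example, a mock-up for testing is nothing special - it´s just another class adhering to the contract (using the hypothetical IMyService from the sketch above):

// Stand-in for the real service during client tests: fast, predictable, no external resources.
public class MyServiceMock : IMyService
{
    public string DoWork(string input) { return "canned result"; }
}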
A Microkernel to the rescue
So if there is no static binding between client and service anymore, how and when is a client component bound to a service component? This binding happens dynamically at runtime - and is done by a Microkernel. (I prefer the term Microkernel (or the simpler Service Locator) to Dependency Injection or Inversion of Control framework, since I want to keep the concept of dynamic binding separate from the initialization of objects and am skeptical about automatically weaving large dependency networks.)
A Microkernel helps a client component overcome its ignorance regarding the service implementation. A Microkernel makes the impossible possible: it lets you instantiate an interface! Instead of
... = new MyServiceClass();
you write something doing the equivalent of
... = new IMyService();
Or to be more concrete, here´s how you´d do it using my little Microkernel:
... = DynamicBinder.GetInstance(typeof(IMyService));
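Assigned to a typed variable (with a cast to the contract interface, as in the Businesslogic example further down) that reads, for example:

IMyService myService = (IMyService)DynamicBinder.GetInstance(typeof(IMyService));
string result = myService.DoWork("hello"); // DoWork() is just an assumed sample member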
Of course a Microkernel is not smarter than VS2005 or the .NET Framework. So in order to accomplish its feat, the Microkernel needs some help. Before it can instantiate an interface it needs to know the implementing class. Please note: the Microkernel needs to know this mapping between interface (contract) and class (service implementation), not the client component. This might sound like a small difference compared to how you usually work, but in fact it is not. It´s a major difference separating would-be component orientation from real and true component orientation.
To help the Microkernel you need to provide it with the required mapping information: for each contract interface a client should be able to instantiate, you need to state the class to instantiate behind the scenes. You need to bind interfaces to implementations, e.g. "Interface ServiceContract.IMyService is implemented by class AllMyServices.MyServiceClass in assembly AllMyServices.dll."
This mapping can be done either imperatively:
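(In code the imperative variant boils down to a registration call on the DynamicBinder along these lines - the method name here is only a placeholder for illustration; the actual API of the Microkernel may differ:)

// Placeholder sketch of an imperative binding - interface name mapped to implementing type:
DynamicBinder.AddBinding("ServiceContract.IMyService",
                         "AllMyServices.MyServiceClass, AllMyServices");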
or in an XML mapping file (which can be the app.config of your program):
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <configSections>
    <section name="microkernel"
             type="ralfw.Microkernel.ConfigSectionMappingAdapter, ralfw.Microkernel" />
  </configSections>
  <microkernel>
    <objects>
      <object name="ServiceContract.IMyService"
              type="AllMyServices.MyServiceClass, AllMyServices" />
    </objects>
  </microkernel>
</configuration>
That´s it! There´s not much more you need to do:
- Design your contracts and formalize them as assemblies in their own right.
- Bind client components to contract assemblies instead of service components.
- Define the mapping between contract interfaces and service component classes implementing them.
- Use a Microkernel to instantiate service component classes through their contract interfaces.
Easy, isn´t it?
The downside of using a Microkernel
Well, you could stop reading here and start using my little Microkernel or any other such dynamic binding framework. They all work alike.
However, sooner or later you´ll find a couple of things disturbing the rosy picture just outlined:
- Setting up the mapping - even a declarative mapping like the XML above - is tedious and error prone. If you truly subscribe to component oriented programming, decompose your software solution into a larger number of components, and also keep these components in different (!) VS2005 solutions to enforce component oriented thinking and ease parallel development, then you´ll end up defining a lot of mappings. And each one, even the most straightforward, requires some setup effort.
- If your client components no longer reference service components directly, then VS2005 cannot help gather all assemblies in the execution directory of a solution´s startup project. You´ll somehow have to copy the relevant files manually before running a project.
- Components often read parameters for their work from an external file. The .NET Framework provides an easy means for this via its System.Configuration.ConfigurationManager class. Just put any parameters into the app.config of your program. This is just fine as long as you develop your software within a single large VS2005 solution. But as soon as you set up separate VS2005 solutions for each component, you´ll need to merge the settings from several app.config files whenever you integrate several components.
This is the downside of using a Microkernel: if you subscribe to CFD and want the organization of your code to reflect your component oriented design, then you lose quite a bit of the convenience provided by VS2005.
Or to put it the other way around: Microkernels are a great way to reach higher productivity and better testability - but this comes at quite a price. It´s less convenient than using VS2005 the way Microsoft promotes it.
Since I deem the Microkernel concept core to true component orientation and found it unnerving to deal with the above problems, I´m trying to overcome them with my own little Microkernel and some guidance.
Solution 1: Semi-automatic gathering of assemblies
Once you no longer reference service components but just contracts, VS2005 does not help you anymore with gathering all necessary assemblies in your EXE project´s bin directory. Fortunately it´s quite easy to "simulate" the VS behavior. Here´s my guideline:
In the VS2005 project supposed to integrate several components, add the components´ assemblies manually using "Add existing item...". Be sure to link (!) them and set the "Copy to output directory" flag to true!
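(For the curious: such a linked item with the copy flag set ends up in the .csproj file roughly like the following sketch - path and file name here are just made up for illustration:)

<ItemGroup>
  <None Include="..\SomeComponent\bin\Debug\SomeComponent.dll">
    <Link>SomeComponent.dll</Link>
    <CopyToOutputDirectory>Always</CopyToOutputDirectory>
  </None>
</ItemGroup>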
Here´s an example: Consider the following simple component oriented architecture.
Three components. The frontend is the client of the businesslogic, which in turn is the client of the DB adapter. Two contracts isolate the clients from any concrete implementations of the services they depend on.
Following CFD the contracts are defined first and then implemented as distinct VS2005 projects:
Since the DB adapter does not rely on any further service component, it´s straightforward to implement:
Put it in a VS2005 solution of its own and add a test project which could contain unit tests for the component. The whole solution I call a component test bed, because it not only contains the component itself but also any tests and possible mock-ups (as stand-ins for service components required by the component under development).
Enter the Microkernel: The client of the DB adapter is the businesslogic component. It too is put into a test bed solution of its own:
Since the businesslogic needs the service of the DB adapter, it references the respective contract as well as the Microkernel. In code it then instantiates the DB adapter using the DynamicBinder class:
public class Businesslogic : contract.IBusinesslogic {
  private microkernel.demo.dbadapter.contract.IDBAdapter db;
  public Businesslogic() {
    db = (microkernel.demo.dbadapter.contract.IDBAdapter)
         DynamicBinder.GetInstance(typeof(microkernel.demo.dbadapter.contract.IDBAdapter));
  }
}
Using the Microkernel in the businesslogic means any integrator of components including the businesslogic needs to provide an IDBAdapter implementation. In the above sample that´s the task of the test businesslogic project.
But since neither it nor the businesslogic can or should reference the DBAdapter component in the regular way, how can this be accomplished? How can the test project provide an IDBAdapter implementation at runtime? By linking the DBAdapter component into the project (see the highlighted project item above) and setting its "Copy to Output Directory" property to true. VS2005 will then see to it that the bin\debug or bin\release directory always contains the latest copy of this file. And that´s just how VS2005 behaves for ordinarily referenced assemblies.
If you want to be able to step into a component linked into a project like this, be sure to also link in its .pdb file. The same is true for any supporting files the component needs (e.g. other assemblies it depends on or configuration files).
By following this guideline you can get the same level of convenience you´re used to from using VS2005 and regular assembly references. It requires slightly more effort - but it ensures you can reap the benefits of using a Microkernel.
To sum up:
- A project implementing a contract references the contract assembly as usual. It´s exporting this contract. See the DBAdapter project above.
- A project using a contract references the contract assembly as usual. It´s importing this contract. In addition it references the Microkernel and uses it to instantiate contract interfaces. See the Businesslogic project above.
- A project hosting components which use the Microkernel at least needs to link in the assemblies of any service component required in the described way. See test project for businesslogic above.
Solution 2: Automatic mapping
Once all required service components have been gathered in a host´s runtime directory, a mapping has to be built: any contract interfaces to be instantiated by client components need to be bound to the classes in service components implementing them.
This is usually done using a declarative mapping in the app.config of the hosting project (e.g. the test project for businesslogic above) or in a separate XML mapping file. Here´s a sample app.config:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <configSections>
    <section name="microkernel"
             type="ralfw.Microkernel.ConfigSectionMappingAdapter, ralfw.Microkernel" />
  </configSections>
  <microkernel>
    <objects>
      <object name="microkernel.demo.dbadapter.contract.IDBAdapter"
              type="microkernel.demo.dbadapter.DBAdapter, microkernel.demo.dbadapter" />
    </objects>
  </microkernel>
</configuration>
Please note the config section handler defined at the top. It´s needed for the Microkernel to access the <microkernel> section. This mapping section can be named either <microkernel> or <spring>, because its <objects> element looks like the one from Spring.Net, although it´s stripped down to the essential mapping information.
Each <object> binding consists of a name for the binding and the type to be instanciated for this name. The type information is given as the fully qualified class name followed by the assembly name containing the class. It´s the usual .NET Framework format for such kind of information.
The name of the binding is arbitrary - however, I prefer to set it to the fully qualified name of the interface type to be instantiated, which is implemented by the type referenced. The Microkernel supports this convention by providing a GetInstance() overload which takes an interface type and uses its full name as the name of the binding to look up in the mapping.
In order to load the mappings from the app.config, call the Microkernel´s LoadBindings() method right after the start of the hosting code.
If you want to keep the mapping info in a separate XML file like this:
<?xml version="1.0" encoding="utf-8" ?>
<objects>
  <object name="microkernel.demo.dbadapter.contract.IDBAdapter"
          type="microkernel.demo.dbadapter.DBAdapter, microkernel.demo.dbadapter" />
</objects>
LoadBindings() will read the bindings, build a map, and load the referenced assemblies. When you later call DynamicBinder.GetInstance() the Microkernel will look up the name and instantiate the referenced implementation type. Either a new instance is created each time, or a singleton is created once. This you can determine by setting the singleton attribute of the mapping.
The default is false.
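A binding that should behave as a singleton would thus look something like this:

<object name="microkernel.demo.dbadapter.contract.IDBAdapter"
        type="microkernel.demo.dbadapter.DBAdapter, microkernel.demo.dbadapter"
        singleton="true" />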
That´s all nice and well - but it´s quite cumbersome. Especially when you start using a Microkernel you´ll hate having to get the mapping right before you can see your program running. That´s why my little Microkernel provides a third way of setting up the mapping.
When you call CompileBindings(), the Microkernel will determine the mappings all by itself. It will scan all assemblies in the runtime directory of the host and check whether they contain contract interfaces or contract interface implementations. Interfaces will automatically be bound to implementations. No explicit mapping is needed on your side. Just ensure all service component assemblies are there.
In addition there´s only one small hint you need to give the Microkernel: you need to annotate each contract interface which client components should be able to instantiate with an attribute:
public interface IDBAdapter
That´s all. The Microkernel will now be able to spot the relevant contract interfaces and find any class implementing them in the assemblies present in the runtime directory.
One word of caution: this is a feature to help you enter the realm of true component oriented development. It´s targeted at pretty small projects with maybe 10, 20, 30 assemblies in the runtime directory. So please don´t blame the Microkernel if CompileBindings() takes too long for your taste when you use it with hundreds of components. It´s not made for such large scenarios - at least not performance wise.
Solution 3: Distributed settings
Once you start developing components the way I described above, where you set up test bed solutions for each component, you´ll soon realize it´s hard to merge all the different app.config settings required by the components. In many test beds you´ll define special settings just for this component, which later have to be copied to the app.config of the integrating host project. That´s tedious and error prone work.
To make the handling of settings (or external component parameters) easier, I propose component local config files. Here´s an example:
Let´s assume the DBAdapter requires a connection string it wants to load from an external settings file. Instead of putting it into the app.config, create an XML settings file in the DBAdapter´s project with a name of your choice, e.g. microkernel.demo.dbadapter.config. It could look like this:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <add key="connectionstring" value="server=..."/>
  </appSettings>
</configuration>
Then add a reference to the Microkernel to the DBAdapter, even though it does not want to instantiate any service components. Rather, the Microkernel is used to load the setting (instead of the usual System.Configuration.ConfigurationManager):
public class DBAdapter : contract.IDBAdapter {
  private string connectionstring;
  public int LoadValues(string filename) {
    connectionstring = ralfw.Microkernel.ConfigurationManager.Settings("microkernel.demo.dbadapter.config")["connectionstring"];
    ...
  }
}
The Settings() method returns a ConfigFile object which - for now - allows access to the <appSettings> settings in the specified file.
In order to have your own config file available where it´s needed at runtime, link it into the integrating host project like a component assembly (see the highlighted file in the above picture) and set its "Copy to Output Directory" flag to true.
This way you can set up individual settings for each component in several XML files and avoid merging them into a single file for integration. Instead you just link in all necessary settings files and use a single API to access them, the Microkernel´s ConfigurationManager.
This is close to how the .NET Framework handles such settings. So it´s easy to understand and use. Currently the ConfigFile class is limited to accessing <appSettings> only, but if distributing settings in this way appears practical, then it´s easy to add support for config section handlers.
Even though my little Microkernel does not sport an array of features like Spring.Net etc., it does a decent job of dynamically binding contract implementations. But what´s most important, it´s much easier to start with than other service locator frameworks, since it can build the interface-implementation mapping automatically and lets you keep component settings separate.
The guideline for linking in components manually helps you enforce truly distributed development of components in their own test beds and still benefit from some VS2005 help during integration. That´s ok for a start, I´d say. But in the long run I want to provide some tool support for this too. But that´s a different story. Stay tuned!
For now, have fun using my little Microkernel and feel free to let me know what you think of it.