January 2006 - Posts

Next week, on 16 January 2006, the traditional OOP conference starts again in Munich: http://www.oopconference.de

On Tuesday, 17 January 2006, the .NET Day takes place there: a day of talks all around the .NET platform. This year I was once again content manager for the .NET Day, and this time I chose architecture as the topic. It was important to me, though, not to give the floor only to the by now customary SOA faction, but to span an arc from the in-proc component to the service.

Old warhorses of the .NET scene will speak (Christian Weyer and Ingo Rammer, www.thinktecture.com), but I am especially pleased to have won Prof. Johannes Siedersleben (author of the very readable book "Moderne Software-Architektur") and Gregor Hohpe (author of the very readable book "Enterprise Integration Patterns") for talks. They will take a look into the architecture of applications and at the communication between applications.

And now the free tickets: anyone interested in spontaneously visiting the .NET Day can contact me to grab one of the three remaining free tickets for the .NET Day, which the organizer SIGS-DATACOM has kindly provided. I think a trip to the OOP and to Munich is worthwhile. Maybe we will also find time for a chat.

How about it? The rule is: first come, first served. The best way is to write a comment on this posting, fill in your full name and email address, and send it. I will then receive it by email.

Unfortunately, the free tickets are already gone. I am sorry for all those who are reading this posting only now. Perhaps one or the other of you will nevertheless decide to simply drop by the .NET Day. That would be nice.

Posted by Ralf Westphal | with no comments

To take a break from obscure information models like Pile and from restless thinking about software architecture, I decided to pick up a new hobby: compiler construction ;-) Or rather, I decided to put my long-standing interest in this topic to good use. When I talked to Niklaus Wirth - the father of Pascal - at the iX conference in November, he told me he was about to release an updated version of his book "Compiler Construction" (read the complete book here: [PDF]), which happens to be one of my all-time favorite computer science textbooks.

However, when I asked him whether he intended to bring his classic introduction to the world of .NET, he declined. He said he had deliberately chosen Oberon as the implementation language for all examples in his book.

Now, since I'm a great fan of the book as well as of the .NET Framework, I had the idea of translating all code samples in the book to C# to make the text more accessible to .NET developers. This is quite an undertaking, and I don't know whether I will find the time to really complete it - but at least I want to start.

So, here is my first, very small piece of the puzzle: a set class. It's very handy when you write a parser and need to check whether a symbol belongs to a certain set of expected symbols, e.g.

IF symbol IN [number, identifier, leftParent, minusOp] THEN ...

This checks whether the parser's current symbol is a number, an identifier, etc., that is, whether it might signify the start of an expression.

A set data type exists in Pascal and Oberon, but not in C# or VB. That's a pity, because it's sometimes quite convenient. Of course some implementations exist, like [1] and [2], but I thought: why not try it again using .NET 2.0 generics? I also thought enum data types would be an ideal foundation to build a set on.

Why not be able to write:

public enum Colors
{
   red, blue, green
}

Set<Colors> sc = new Set<Colors>();
sc.Add(Colors.red);
sc.Add(Colors.green);

Set<Colors> sc2 = new Set<Colors>();
sc2.Add(Colors.blue);

sc.Add(sc2); // union

Console.WriteLine(sc); // prints: [red,blue,green]

With such a set data type, setting up a scanner and parser would be as easy as in Pascal, e.g.

public enum Symbols
{
   identifier, number, @string, leftParent, rightParent, ...
}

Set<Symbols> firstExpressions = new Set<Symbols>();
firstExpressions.Add(Symbols.identifier);
firstExpressions.Add(Symbols.number);
...

Symbols currentSymbol;
...

if(firstExpressions.Contains(currentSymbol))
{
   ...
}

I find using enums as a starting point very convenient. They look pretty much like Pascal sets, but lack any operations. So I devised a generic Set data type in C# to wrap an enum. So far it works OK for me:

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Text;

    public class Set<TEnum>
    {
        #region "Data"
        // True if TEnum is marked [Flags]; recorded, but not acted upon yet.
        bool flagsEnum;
        // One bit per declared enum value; a set bit means "is a member".
        BitArray members;
        // Maps each declared enum value to its bit position in members.
        Dictionary<TEnum, int> mapMember2Index;
        #endregion


        #region "ctors"
        public Set()
        {
            Initialize();
            members = new BitArray(mapMember2Index.Count);
        }

        private Set(BitArray members)
        {
            Initialize();
            this.members = members;
        }

        private void Initialize()
        {
            // TEnum cannot be constrained to System.Enum, so check at runtime.
            if (typeof(TEnum).BaseType != typeof(Enum))
                throw new ApplicationException(string.Format(
                    "Generic type parameter <{0}> is not an enum type!",
                    typeof(TEnum).FullName));

            flagsEnum = typeof(TEnum).GetCustomAttributes(typeof(FlagsAttribute), true).Length > 0;

            int i = 0;
            mapMember2Index = new Dictionary<TEnum, int>();
            foreach (TEnum m in Enum.GetValues(typeof(TEnum)))
                mapMember2Index.Add(m, i++);
        }
        #endregion


        #region "Working with the set"
        public void Clear()
        {
            members.SetAll(false);
        }


        public Set<TEnum> Add(TEnum member)
        {
            members.Set(mapMember2Index[member], true);
            return this;
        }

        // Union in place. ReferenceEquals is used for all null checks in this
        // class, because "x != null" would call the overloaded operator below.
        public Set<TEnum> Add(Set<TEnum> otherSet)
        {
            if (!ReferenceEquals(otherSet, null)) members.Or(otherSet.members);
            return this;
        }


        public Set<TEnum> Remove(TEnum member)
        {
            members.Set(mapMember2Index[member], false);
            return this;
        }

        // Difference in place: clears every member that is also in otherSet.
        public Set<TEnum> Remove(Set<TEnum> otherSet)
        {
            if (!ReferenceEquals(otherSet, null))
                for (int i = 0; i < members.Count; i++)
                    if (otherSet.members[i])
                        members[i] = false;
            return this;
        }


        public Set<TEnum> Intersect(Set<TEnum> otherSet)
        {
            if (!ReferenceEquals(otherSet, null)) members.And(otherSet.members);
            return this;
        }


        public bool Contains(TEnum member)
        {
            return members[mapMember2Index[member]];
        }


        public int Cardinality
        {
            get
            {
                // Count the set bits; members.Length would be the size of
                // the whole enum, not the number of members in the set.
                int count = 0;
                for (int i = 0; i < members.Count; i++)
                    if (members[i]) count++;
                return count;
            }
        }
        #endregion


        #region "Overrides"
        public override bool Equals(object obj)
        {
            // "as" yields null for foreign types instead of throwing.
            return this == (obj as Set<TEnum>);
        }

        public override int GetHashCode()
        {
            // Derived from the bits so that equal sets get equal hash codes.
            int hash = 17;
            for (int i = 0; i < members.Count; i++)
                hash = unchecked(hash * 31 + (members[i] ? 1 : 0));
            return hash;
        }


        public override string ToString()
        {
            string[] names = Enum.GetNames(typeof(TEnum));

            StringBuilder memberNames = new StringBuilder("[");
            for (int i = 0; i < members.Count; i++)
                if (members[i])
                {
                    if (memberNames.Length > 1) memberNames.Append(",");
                    memberNames.Append(names[i]);
                }
            memberNames.Append("]");

            return memberNames.ToString();
        }
        #endregion


        #region "Operators"
        // intersection
        public static Set<TEnum> operator &(Set<TEnum> left, Set<TEnum> right)
        {
            // Copy left's bits so that neither operand is modified.
            BitArray result = new BitArray(left.members);
            if (!ReferenceEquals(right, null)) result.And(right.members);
            return new Set<TEnum>(result);
        }

        // union
        public static Set<TEnum> operator |(Set<TEnum> left, Set<TEnum> right)
        {
            BitArray result = new BitArray(left.members);
            if (!ReferenceEquals(right, null)) result.Or(right.members);
            return new Set<TEnum>(result);
        }

        public static bool operator ==(Set<TEnum> left, Set<TEnum> right)
        {
            // Comparing with "right != null" here would call this operator
            // again and recurse endlessly, hence ReferenceEquals.
            if (ReferenceEquals(left, right)) return true;
            if (ReferenceEquals(left, null) || ReferenceEquals(right, null)) return false;

            for (int i = 0; i < left.members.Count; i++)
                if (left.members[i] != right.members[i]) return false;
            return true;
        }

        public static bool operator !=(Set<TEnum> left, Set<TEnum> right)
        {
            return !(left == right);
        }
        #endregion


        #region "Iterator"
        public IEnumerator<TEnum> GetEnumerator()
        {
            TEnum[] memberValues = (TEnum[])Enum.GetValues(typeof(TEnum));
            for (int i = 0; i < members.Count; i++)
                if (members[i])
                    yield return memberValues[i];
        }
        #endregion
    }

 

The only drawbacks so far: the generic type parameter TEnum cannot be constrained to the base class System.Enum, and I left out handling of bit field enumerations. (Currently, if Colors were a bit field enumeration and you passed a combined value like Colors.red | Colors.green to Add(), it would crash, because the combined value is not a declared member.) (Yes, yes, and the class is not thread-safe, I know.)
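One way the bit field case might be handled is to split a combined value into its single declared members before adding them to the set. The following is only a rough sketch under assumptions: the [Flags] enum Styles and the helper FlagSplitter.Split are made up for illustration and are not part of the Set class above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical bit field enumeration, used only for this illustration.
[Flags]
public enum Styles { None = 0, Bold = 1, Italic = 2, Underline = 4 }

public static class FlagSplitter
{
    // Hypothetical helper: returns every declared single-bit member
    // that is switched on in 'combined'.
    public static List<TEnum> Split<TEnum>(TEnum combined) where TEnum : struct
    {
        if (!typeof(TEnum).IsEnum)
            throw new ArgumentException("TEnum must be an enum type.");

        long bits = Convert.ToInt64(combined);
        List<TEnum> parts = new List<TEnum>();
        foreach (TEnum candidate in Enum.GetValues(typeof(TEnum)))
        {
            long value = Convert.ToInt64(candidate);
            // A single-bit value is non-zero and a power of two.
            bool singleBit = value != 0 && (value & (value - 1)) == 0;
            if (singleBit && (bits & value) == value)
                parts.Add(candidate);
        }
        return parts;
    }
}

public static class FlagSplitterDemo
{
    public static void Main()
    {
        // Lists the single members contained in the combined value.
        foreach (Styles s in FlagSplitter.Split(Styles.Bold | Styles.Underline))
            Console.WriteLine(s);
    }
}
```

Add() could then loop over the result of Split() and set one bit per part. Whether that is the right semantics for a flags enum with named combinations is a separate question I have not thought through.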

Enjoy!

PS: It seems there is one less excuse not to start translating the Compiler Construction sources... Let's see when I really find the time for that. Stay tuned!

[1] A Generic Set Data Structure
[2] Yet Another C# set class

I fired a little too fast with my previous posting. Although I wouldn't say what I wrote is wrong, I'd say it's not in proper shape yet. There are quite a few aspects (sic!) to take into account concerning software design, and I'm not yet satisfied with what I have. Software Cells and the Software Universe were a good starting point and help a lot in practice - but they are still limiting and need to be refocused. So for the moment you should forget about the "Classifying bubbles" part of the previous posting, where I introduced very concrete levels of abstraction for software. They are not entirely wrong, but I'd like to present them differently in the future. However, I'm still content with the new emphasis I put on edges! They will become even more important today.

Since I'm still not clear about all the details of the next evolution of the Software Universe, please regard what follows here as thinking aloud. I just need to put my thoughts down somewhere... and isn't that the purpose of a blog? ;-)

What's complexity?

The longer I try to grasp software in its entirety or its very nature, the more I'm aware of its complexity. Software is not only complicated, no, it is complex. "Complex" stems from the Latin complexus, meaning to encompass, to braid. So complexity enters wherever different aspects or contexts or viewpoints or logics or systems are interwoven.

Now, a single context or viewpoint or system can usually be characterized by a tree, a hierarchy. Such a hierarchy allows us to break up something complicated into smaller, easier to understand pieces. And in the end the tree spans the space of everything that is within the system or belongs to the context. Thus a hierarchy defines a boundary between the inside and the outside of a system. Take a formal language as an example: the language definition is a tree of productions describing which sequences of letters are valid sentences and thus belong to the language. Or take the hierarchy of software artifacts constituting an assembly: there's the assembly at the top, it contains type definitions, they in turn contain data definitions and methods, which contain statements. Or take the organization chart of your company: the boss at the top, department heads below, clerks and workers below them, etc. Or take your body: the whole is your body, but it's made up of organs, which are made up of cells, etc.

So much for describing complicated systems using hierarchies. Now enter complexity: if you take two or more complicated systems and "combine" them, you get complexity. A person, for example, is a complex entity, since a person belongs to several systems at the same time: there is a biological system (the human body), which at the same time is part of a sociological system (e.g. society), and at the same time part of an organizational system (e.g. a company), and at the same time part of a familial system, and at the same time part of the traffic system, etc. (Each of these systems can, of course, be complex in itself.)

Software is complex

Back to software: I think we all agree that software can be described by a tree or hierarchy of bubbles, as I did in my previous posting:

The exact terms for the nodes in this hierarchy are not important at the moment. My point is just this: software can be viewed as a tree of nodes - which are even interconnected.

Now comes the twist: a single software system can be viewed from several different perspectives...

...and each perspective is again defined by its own hierarchy! You determine what those views are, but the dominant one surely is the problem domain.

Whenever you set out to design a new software system, you will start by finding its hierarchy of parts that fulfill the functional requirements stated by your customer. While you do that, though, you'll stumble across aspects of the system that don't really fit into that view. An encryption component does not really add anything to the functionality, but serves a non-functional security requirement. Or what about the question of where to deploy the different parts?

Both security and deployment (and a lot of other concerns) can be described by their own hierarchies of notions, which also contain the software parts you initially set out to find. And that's where complexity is created: certain entities become part of several hierarchies at the same time. The different hierarchies are, so to speak, braided together at and by those entities, forming a poly-hierarchical mesh:

The above picture shows just two very simple hierarchies for a software system: on the left side the structural hierarchy of software artifacts is depicted. There is a Solution (in line with the Software Universe) consisting of two Applications. The right side sketches the hardware equipment to run the Solution, consisting of a LAN with three computers. These two hierarchies are woven together by mappings between Applications and computers: which software entity is going to run on which hardware entity? Software and hardware systems are related to each other to form a complex whole. The relationships transcend the hierarchical boundaries.

Now add 3 or 5 or 10 views to a software system and make each view more complicated (i.e. populate its hierarchy with more nodes). What you get is a very, very complex overall system.
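To make the braiding a bit more tangible, here is a minimal sketch of the two hierarchies described above plus the mappings across them. Everything concrete in it is an assumption for illustration: the Node class, the node names, and in particular which application runs on which computer.

```csharp
using System;
using System.Collections.Generic;

// A node in one hierarchy; Children is the "contains" relationship.
public class Node
{
    public readonly string Name;
    public readonly List<Node> Children = new List<Node>();

    public Node(string name, params Node[] children)
    {
        Name = name;
        Children.AddRange(children);
    }
}

public static class MeshDemo
{
    public static void Main()
    {
        // Hierarchy 1: software artifacts.
        Node app1 = new Node("Application 1");
        Node app2 = new Node("Application 2");
        Node solution = new Node("Solution", app1, app2);

        // Hierarchy 2: hardware.
        Node pc1 = new Node("Computer 1");
        Node pc2 = new Node("Computer 2");
        Node pc3 = new Node("Computer 3");
        Node lan = new Node("LAN", pc1, pc2, pc3);

        // Cross-hierarchy edges (made up): software entity -> hardware entity.
        List<KeyValuePair<Node, Node>> runsOn = new List<KeyValuePair<Node, Node>>();
        runsOn.Add(new KeyValuePair<Node, Node>(app1, pc1));
        runsOn.Add(new KeyValuePair<Node, Node>(app2, pc2));
        runsOn.Add(new KeyValuePair<Node, Node>(app2, pc3));

        Console.WriteLine("{0} contains {1} applications", solution.Name, solution.Children.Count);
        Console.WriteLine("{0} contains {1} computers", lan.Name, lan.Children.Count);
        foreach (KeyValuePair<Node, Node> edge in runsOn)
            Console.WriteLine("{0} runs on {1}", edge.Key.Name, edge.Value.Name);
    }
}
```

Note that the runsOn edges live outside both trees. That is exactly the point where a single hierarchy stops being enough.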

Getting rid of bubbles - or: relationships are everything you need

So far, what I said is not really new, even if you haven't thought of software like this. It's just the nature of complex systems, software being one of them. Nevertheless I thought laying the groundwork for what I'm gonna say now would be helpful. Ok, buckle up... here comes the "thinking aloud" part...

Looking at the picture above, we can see two kinds of relationships: a hierarchical relationship (some entity contains others) and a relationship across hierarchies (some entity is connected to some other entity in another hierarchy). Also, entities can of course be connected within a hierarchy (e.g. to signify communication between them).

In addition to the relationships there are entities. When you first think about entities, you surely view them as tangible "hard facts". I'd like to ask you, though, not to do so. Don't see those entities as blobs or black boxes, but as bubbles with structure. Each bubble then consists (or can consist) of smaller bubbles. And those bubbles can again consist of even smaller bubbles, etc. What you then end up with is this: bubbles or entities are not so important anymore. What's more important are the relationships between bubbles. Bubbles, so to speak, shrink to dots connected to other dots. And then you promote connections to the level of dots: you make connections as tangible as bubbles, thereby giving up any difference between vertices and edges, bubbles and connections.

What you're left with are just binary relations.

Let that sink in for a minute and read my postings on Pile, because I think Pile could help to describe a complex system like software. I'd even say it shines when put to this task.

Using Pile to describe software systems

We've now reduced software to the most basic concept: associations. (But fear not, if you read up on Pile, you'll see that your code and everything else is not lost ;-) Software with arbitrary possible viewpoints can be viewed as a complex associative system with arbitrary granularity, with any number of levels of abstraction encoded in it. The beauty of having only associations in the Pile way is that there is absolutely no redundancy, and there is natural poly-contextual interconnection of every "notion" without hitting the wall of any hierarchical system or notation.

Although the above picture simplifies the Pile describing a software system (e.g. there are only two contexts, and relations are not connected to detailing relations), I hope it becomes clear how regular and homogeneous a purely associative description of software could be. There is no limit to the detail you can model using Pile associations. You can go right down to single bits without leaving the Pile model.

Since you probably have some XML and/or RDBMS background, let me point out again the major differences between Pile and other data description models:

  • With Pile every artifact or structure or relationship already encoded in the system is reused.
  • With Pile connecting any artifact with any other is possible at all times. You need not think of that beforehand.
  • With Pile any relation can be refined in whatever way you like.

The essence of this is: if we used Pile instead of XML or an RDBMS, we could start representing software in almost any way without painting ourselves into a corner. You could start by encoding (or assimilating, as it is called) just one hierarchy in a Pile, e.g. the software artifacts. Think of doing that in a very, very, very fine grained way, e.g. representing single letters with relations and building words and files etc. on top of those relations. Then your next idea might be to augment your system with logical categories. You assimilate a hierarchy of categories into the Pile and can associate any category with any level of detail of your software artifacts. You could even assign categories to single statements (if, for example, you represented your code as an abstract syntax tree). Or you assimilate the hardware target system into the Pile - down to single processors. You could then easily relate LAN sub-systems or machines or processes or even processors to software artifacts, e.g. mapping assemblies to run in a certain container process on a certain machine.

At any time you could regenerate your views, and completely new views, from the Pile by traversing the relations - and as long as you were careful to assimilate all information in a very fine grained way, you'd not even have to try to think of all the possible systems or contexts you might someday want to store in such a "database". At any time you can attach new relations to any relation there already is. If that's not extensible, I don't know what is ;-)

I find such a unifying, homogeneous, and inherently flexible model very intriguing for describing software. Instead of juggling an ever increasing number of tables in an RDBMS or trying to keep a growing number of XML files in sync, you just manage one network of associations.
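To illustrate the three points (reuse, connect anything with anything, refine any relation), here is a toy model. It is explicitly not the Pile engine or its API, just a sketch under my own assumptions: the names Relation, RelationStore, Root and Relate are invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: every item is a relation between two other items.
public sealed class Relation
{
    public readonly int Id;
    public readonly Relation Left;   // null for a root item
    public readonly Relation Right;  // null for a root item

    internal Relation(int id, Relation left, Relation right)
    {
        Id = id;
        Left = left;
        Right = right;
    }
}

public sealed class RelationStore
{
    private readonly List<Relation> all = new List<Relation>();
    private readonly Dictionary<string, Relation> roots = new Dictionary<string, Relation>();
    private readonly Dictionary<string, Relation> byEnds = new Dictionary<string, Relation>();

    public int Count { get { return all.Count; } }

    // Returns the root registered under 'name', creating it on first use.
    public Relation Root(string name)
    {
        Relation root;
        if (!roots.TryGetValue(name, out root))
        {
            root = new Relation(all.Count, null, null);
            all.Add(root);
            roots.Add(name, root);
        }
        return root;
    }

    // Returns the relation (left, right); an existing one is reused.
    public Relation Relate(Relation left, Relation right)
    {
        string key = left.Id + ":" + right.Id;
        Relation relation;
        if (!byEnds.TryGetValue(key, out relation))
        {
            relation = new Relation(all.Count, left, right);
            all.Add(relation);
            byEnds.Add(key, relation);
        }
        return relation;
    }
}

public static class RelationDemo
{
    public static void Main()
    {
        RelationStore store = new RelationStore();
        Relation assembly = store.Root("some assembly");
        Relation machine = store.Root("some machine");

        // Connect two items from different hierarchies.
        Relation runsOn = store.Relate(assembly, machine);
        // Asking again reuses the stored relation instead of duplicating it.
        Console.WriteLine(ReferenceEquals(runsOn, store.Relate(assembly, machine)));

        // Refine: a relation can itself be one end of another relation.
        Relation security = store.Root("security category");
        Relation categorized = store.Relate(security, runsOn);
        Console.WriteLine(ReferenceEquals(categorized.Right, runsOn));
    }
}
```

The design choice worth noticing is that Relate() accepts relations as its ends, so there is no separate node type. Vertices and edges collapse into one kind of thing.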

Posted by Ralf Westphal | with no comments