April 2005 - Posts

Visiting SQL - Update
It seems like I'm not the only one who has been playing with this idea: sqlom
Visiting SQL - Part 2
In part 1 I explained how I got a "typed model" of a database schema for building SQL in code. By itself I found this useful, but I still hated the string concatenation involved (not to speak of the scattering of all this string manipulation). So I came up with the idea to build a hand-made object model for all SQL query related objects. This model would allow me to build SQL statements in a typed and database-agnostic fashion, integrating nicely with my typed database schema model. To give you a rough idea I included some schematics:

- The core interfaces -

- The core classes (and some derived ones to give you an idea) -


The main interfaces of interest are ISqlSerializable (a.k.a. IVisitable with the accept method replaced with the more appropriate WriteSql) and ISqlWriter (a.k.a. the Visitor). Basically, I've taken the visitor pattern and applied it to writing SQL, allowing for extensibility (
for each of the SQL dialects out there) and refraining developers (in theory) from polluting the code with string concatenated SQL. The not so apparent added value is the amount of reuse one can achieve with this type of model (from caching to creating dynamic views).
The only drawback (maybe there are more, but this is the major one) to using an object model is that your SQL becomes very much unreadable as complexity grows. Then again, this may be a sign that you're not doing enough in your domain model or (e.g. for performance, security reasons) it might be wise to push this logic into a stored procedure. But even this can be solved, by simply modelling each query statement as a class, which can be easily serialized into SQL to see if you get the desired SQL statement (the truly brave might even parse a SQL string and reverse engineer it into the object model).
I think there's enough food for thought in here. Sorry if I haven't gone into too much detail, but a high-level overview is what I had in mind. Feel free to inquire for more.
Visiting SQL - Part 1
Apparently, after some months of intensive work, I'm dieing to share (and reflect on) some of the things I've worked on.

Although "pluggable database backends" and "SQL abstraction" were not directly goals of the project (regular OLTP app), I kind of stumbled on them by accident (just don't call me nuts yet). First off, a lot has already been written about "pluggable database backends", achieved using the IDbXYZ and IDataZYX interfaces which any decent .NET data provider implements, so I'm not gonna talk about that.
At work, I tend to use a lot of dynamic SQL because I don't want too much of my business logic "locked" in stored procedures. The whole dynamic SQL versus stored procedures topic is not something I want to delve in now, so we're not gonna talk about that either. Dynamic SQL built in code suffers the problem that as the database schema changes, the "SQL building code" should change along with it. So how do you know which SQL building code is affected by changes in the database schema? If the SQL was somehow typed, the compiler would tell you. Making the SQL typed would mean that there should be some object model in place that, for starters, allows you to represent the database objects (think views, tables, columns, ...).
Datasets, and in particular the typed flavor, come to mind. But they're a little bloated just to represent a schema
(and probably limited as to the types of database objects they can represent). I had a look at CodeSmith's SchemaExplorer (which BTW has a slick object model) but couldn't use it because it's use was primarly to discover the database schema at runtime. What I was looking for was a static definition of the database schema in C# code at design time (which I dubbed RelationalSchema). Although I'm not a big proponent of DIY, I really don't think I had much choice at the time. I borrowed some of SchemaExplorer's object model (in redux I might add), and made it xml serializable using one of those fine WSE interfaces, namely IXmlElement(who said you can't reuse wonderful material).
Now that RelationalSchema was born, containing RelationalSchemaDatabase, RelationalSchemaTable, RelationalSchemaView and RelationSchemaColumn, I used CodeSmith to build a template that generates my xml representation of a relational schema. This way I could have an xml representation of any database schema I liked; just a matter of setting the proper connectionstring. Still, this was just a "generic" representation of a database schema. I still wouldn't detect database schema changes this way. Much like datasets definitions can be used to generate typed datasets, I needed a way to generate a typed relational schema. Again, CodeSmith to the rescue. I made a template which creates a typed object model based on the generic object model.
If I was planning on using the "building SQL using strings" approach than my story would be done here. But it isn't...

The "facets-enabled" meta model
During the course of my current project at work, I've been trying to come up with a way to describe the meta data of my domain model (which I dubbed the meta model).  Because I've taking a rather ObjectSpaces-like approach to object-relational mapping, the domain schema I created seemed like a good place to store the facets of my class members(fields and/or properties). Here's an example of what it looks like:
<?xml version="1.0" encoding="UTF-8"?>
<
DomainSchema AssemblyFullName="SomeProject, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null"
xmlns="http://www.seagile.com/domainschema-1.0/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation
="http://www.seagile.com/domainschema-1.0/
D:\WorkSpace\SomeProject\trunk\Schemas\DomainSchema.xsd"
>
    <Classes>
        <Class FullName="Seagile.SomeProject.SomeBaseClass" IdentityMember="ID">
            <Members>
                <Member Name="ID" Field="_id" Type="System.Guid"/>
            </Members>
        </Class>
        <Class FullName="Seagile.SomeProject.SomeClass" BaseClass="Seagile.SomeProject.SomeBaseClass">
            <Members>
                <Member Name="Name" Field="_name" Type="System.String">
                    <Facets>
                        <MinLength>1</MinLength>
                        <MaxLength>5</MaxLength>
                    </Facets>
                </Member>
                <Member Name="ExternalCode" Property="ExternalCode" Type="System.String"
                    Required="False" Default="MCQ000">
                    <Facets>
                        <FixedLength>6</FixedLength>
                        <Pattern>^MCQ\d{3}$</Pattern>
                    </Facets>
                </Member>
                <RelationshipMember Name="Category" Field="_category" RelationshipType="One"
                    TargetClass="Seagile.SomeProject.SomeOtherClass"/>
                <RelationshipMember Name="Materials" Field="_materials" RelationshipType="Many"
                    TargetClass="Seagile.SomeProject.Material"/>
            </Members>
        </Class>
        <Class FullName="Seagile.SomeProject.Material" BaseClass="Seagile.SomeProject.SomeBaseClass">
            <Members>
                <Member Name="Name" Field="_name" Type="System.String"/>
                <Member Name="TotalParts" Field="_totalParts" Type="System.Int32">
                    <Facets>
                        <MinInclusive>2</MinInclusive>
                        <MaxExclusive>11</MaxExclusive>
                    </Facets>
                </Member>
                <Member Name="BarCode" Field="_barCode" Type="Seagile.SomeOtherProject.BarCode, SomeOtherProject"
                    TypeConverter="Seagile.SomeOtherProject.Data.BarCodeConverter, SomeOtherProject.Data"/>
            </Members>
        </Class>
    </Classes>
</DomainSchema>
By itself the schema is worth nothing, but in conjunction with a mapping and relational schema it's "easy" for a generic mapping layer to interpret how to retrieve and store objects. Strictly speaking the mapping layer doesn't need the type information and facets. But their presence is very much appreciated by the domain, application and presentation layers. The presentation layer can use the information to constrain the input fields, and both the domain and application layer can feed these facet values to validators (LengthValidator, RegularExpressionValidator, Int32RangeValidator, ...).
The xml representation is both a blessing and a curse. On one side, it's very easy to get up and running (using an xml schema to drive the intellisense and validation). It feels like a modelling tool if the schema is what you start out with. If the domain model is already in place it's a pesky task. Maintenance wise, however, it's a little devil. Having a validator to compare the declarative xml representation against an assembly is not a luxury. Keeping the code and xml in sync is asking for trouble. I think the best way to handle this is (a) to forget this whole idea, (b) have the xml representation generate the code or (c) have the code generate (attribute-based approach) the xml representation.
But lets not forget the problem I'm trying to solve, namely preventing the scattering of the domain's meta data. Where do you store the fact a column is 5 long, how do you constrain the textbox representing that same conceptual value to only take 5 characters to have an overal nice end-user experience, and how about duplicating that value once more in the domain layer so that other "clients" of your code play by the rules? How do you deal with change and know where to look if 5 becomes 10?
Collection classes and behavior
Lately I've come up with a technique to implement custom finders/queries/filters on my collection classes. Instead of implementing these finders/queries/filters directly in the collection class itself, which gets in the way of generating these types of classes anyway (for the record, I am aware of techniques like CodeSmith's merge regions), I defined a set of new behaviors:
/// <summary>
/// Behavior for things that can be evaluated.
/// </summary>
public interface IPredicate {
    /// <summary>
    /// Evaluates the specified obj.
    /// </summary>
    /// <param name="obj">Object to be evaluated.</param>
    /// <returns><c>true</c> if obj meets the predicate; otherwise <c>false</c>.</returns>
    bool Evaluate(object obj);
}


/// <summary>
/// Behavior for collections from which items can be extracted using the given <see cref="IPredicate"/>.
/// </summary>
public interface IExtractable {
    /// <summary>
    /// Extracts the items from the collection given the specified predicate, leaving only the items that didn't meet the predicate.
    /// </summary>
    /// <param name="predicate">Predicate used to evaluate items against.</param>
    /// <returns>A collection of items that meet the given predicate.</returns>
    object Extract(IPredicate predicate);
}


/// <summary>
/// Behavior for collections that can be filtered using a given <see cref="IPredicate"/>.
/// </summary>
public interface IFilterable {
    /// <summary>
    /// Filters the items in the collection given the specified predicate, effectively removing the items that match the predicate.
    /// </summary>
    /// <param name="predicate">Predicate used to evaluate items against.</param>
    void Filter(IPredicate predicate);
}


/// <summary>
/// Behavior for collections from which items can be selected using the given <see cref="IPredicate"/>.
/// </summary>
public interface ISelectable {
    /// <summary>
    /// Selects the items in the collection given the specified predicate but leaves all items in the collection.
    /// </summary>
    /// <param name="predicate">Predicate used to evaluate items against.</param>
    /// <returns>A collection of items that meet the given predicate.</returns>
    object Select(IPredicate predicate);
}
Next I implemented the ISelectable, IExtractable, IFilterable in my collection class template (regardless of the templating mechanism you use). I even provided typed implementations of these methods (using explicit interface declaration).
Foreach rule/condition I've come across in code, I've created a class implementing IPredicate. Feeding instances of my predicate classes to my collection classes allows me to perform either select, extract or filter operations on them. I've even taking this a step further and implemented a CompoundPredicate which basically is a collection of predicates and a predicate decision combinator (IPredicateDecisionCombinator). When a compound condition is passed to one of the select/extract/filter methods, multiple conditions are evaluated at once and the way the result of these evaluations are combined is the concern of the predicate decision combinator (so you can vary from the simple boolean and/or).
There's some room for improvement, namely the Select and Extract could return ICollection or IList instead of object, but that's a matter of taste to me.
The beauty of this solution is that the condition (predicate) is totally separate from the function
(extracting, filtering, selecting) I want to perform on the collection. Conceptually it is quite similar to Predicate<T> in .NET 2.0 or to IComparer and the Sort method in .NET 1.1 and to some extent to closures. Ofcourse in 1.1 it requires a little more work on your behalf, namely passing around contextual data via the constructor of the predicate whereas in 2.0 anonymous methods will do most of the lifting for you.
More Posts