Andres Aguiar's Weblog

Right here, right now

November 2004 - Posts

XAML Viewer for Whidbey

Gaston did it again. Now he has built a great XAML Viewer for Whidbey.

 

Ward Cunningham and Jack Greenfield on Software Factories

A conversation with Jon Udell

Software Factories, the JetBrains way

JetBrains' Sergey Dmitriev writes about "Language Oriented Programming: The Next Programming Paradigm".

They seem to have an interesting prototype working on the code-generation side. Also, the idea of combining the refactoring engine of the code generation target language with the metadata editing tool is cool (for example, I can change the name of a property in the metadata and it will be changed in the code).

 

 

Handling Updates in Service Oriented Architectures

The title should be ‘updates in disconnected distributed applications’, but it will probably get more hits with ‘service oriented’ in the title ;).

I had an interesting conversation with one of the guys who coded Shadowfax about how to define messages in a SOA. It was clear to him that the right way was to write simple .NET classes that mapped to the messages. That was the ‘clean’ solution.

I agreed with his point of view, but I wondered what the right way to handle updates was if you wanted to work with simple messages and types. I know that using DataSets (diffgrams really) works well, but I see the problems with diffgrams from an interop point of view (yes, it’s just XML, but it is not very friendly to non-.NET platforms).

From a purist point of view, the order message should be as simple as possible, but to be able to handle updates correctly I need two kinds of ‘out of band’ information. First, if I want to use optimistic concurrency, I need the previous values or a timestamp (depending on the case one could be better than the other). Second, as the Order can have multiple lines, I need a way to know which lines were added, modified or deleted. Without this information I cannot have a generic way to handle updates.
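
To make the two kinds of ‘out of band’ information concrete, here is a minimal Python sketch of an order message that carries both. Every name in it (RowState, row_version, plan_update, ConcurrencyError) is an assumption made up for illustration, and it uses a version number where a timestamp or the previous values would work just as well.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class RowState(Enum):
    """Hypothetical per-line flag: what the client did to the line."""
    UNCHANGED = "unchanged"
    ADDED = "added"
    MODIFIED = "modified"
    DELETED = "deleted"


@dataclass
class OrderLine:
    line_id: int
    item_id: int
    quantity: int
    state: RowState = RowState.UNCHANGED  # out-of-band information


@dataclass
class Order:
    order_id: int
    customer_name: str
    row_version: int  # out-of-band: the version the client originally read
    lines: List[OrderLine] = field(default_factory=list)


class ConcurrencyError(Exception):
    """Raised when the stored row changed after the client read it."""


def plan_update(order: Order, stored_version: int) -> Dict[str, List[int]]:
    """Check the version, then group line ids by the operation to apply."""
    if order.row_version != stored_version:
        raise ConcurrencyError(
            f"Order {order.order_id} changed since it was read")
    plan: Dict[str, List[int]] = {"insert": [], "update": [], "delete": []}
    operation = {
        RowState.ADDED: "insert",
        RowState.MODIFIED: "update",
        RowState.DELETED: "delete",
    }
    for line in order.lines:
        if line.state in operation:  # unchanged lines need no work
            plan[operation[line.state]].append(line.line_id)
    return plan
```

Note that the message is no longer a ‘simple’ class once it carries the state flag and the version, which is exactly the tension described above.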

On the other hand, getting optimistic concurrency exceptions is really a bad thing. If the service call is synchronous and the exception goes up to the end user, then he needs to deal with that message, and he really does not have a good way to handle it (he’ll probably retry). If the call is asynchronous, then you need to redirect the message somewhere so it gets reviewed by someone/something.

Let’s see what kind of optimistic concurrency errors we can get.  Some interesting examples are discussed here.

There are some cases where the most reasonable thing to do is to take a ‘last win’ approach. If two users change the name of a user, then the ‘last win’ approach seems reasonable. If one of the users gets a concurrency exception, he’ll probably confirm his changes anyway, so the last will win.

When the fields are involved in business rules (like the Item price) or when they are fields that have aggregated values (like an Item Inventory) then things are different, but we probably don’t need the values read from the database to perform the right action.

For example, if you want to update the Inventory for an item that is sold frequently, it is very likely that you’ll get an optimistic concurrency exception, because the inventory in the database is probably different from the one you read. The fact is that you really don’t care whether the Inventory is different from the one you read. What you need to ensure is that there’s enough inventory to sell the item. In this case, we don’t need the old value of the ItemInventory to handle it.
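
A minimal sketch of that idea, using Python and an in-memory SQLite database: the update checks the inventory that is in the database right now, so the message never needs the value that was read. The table layout, the function names, and NotEnoughInventoryError are assumptions for illustration only.

```python
import sqlite3


class NotEnoughInventoryError(Exception):
    """Raised when the current stock cannot cover the requested quantity."""


def sell_item(conn: sqlite3.Connection, item_id: int, quantity: int) -> None:
    """Decrease the inventory only if enough stock exists right now."""
    cursor = conn.execute(
        "UPDATE Item SET Inventory = Inventory - ? "
        "WHERE ItemId = ? AND Inventory >= ?",
        (quantity, item_id, quantity),
    )
    if cursor.rowcount == 0:
        # Either the item does not exist or the stock is too low.
        conn.rollback()
        raise NotEnoughInventoryError(
            f"Not enough inventory for item {item_id}")
    conn.commit()


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway database with one item that has 5 units in stock."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Item (ItemId INTEGER PRIMARY KEY, "
        "Inventory INTEGER NOT NULL)")
    conn.execute("INSERT INTO Item (ItemId, Inventory) VALUES (1, 5)")
    conn.commit()
    return conn
```

Because the check and the decrement happen in a single statement, two concurrent sales cannot both succeed when only one of them can be covered by the stock.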

Another example could be if the Item price is different than the one I read when I save the invoice. If the price is less than the original, then there’s probably no reason to throw an exception. If it’s greater, then I should. But in this case I still don’t need the two copies of the ItemPrice. As the Invoice cannot change the Item Price, getting the value that was read is enough to check this.
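
The price rule can be sketched the same way. The message carries only the price that was read, and the service compares it with the current one. The names are hypothetical, and invoicing at the current (lower or equal) price is my assumption; the text above only says when an exception should be thrown.

```python
from decimal import Decimal


class PriceIncreasedError(Exception):
    """Raised when the item costs more now than when the client read it."""


def price_to_invoice(price_read: Decimal, current_price: Decimal) -> Decimal:
    """Accept a lower or equal current price, reject a higher one."""
    if current_price > price_read:
        raise PriceIncreasedError(
            f"Price went up from {price_read} to {current_price}")
    # Assumption: when the price dropped, invoice the current price.
    return current_price
```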

Intuitively I think that in most cases (I’d say in every case but I don’t dare ;), using a ‘last-win’ approach  for ‘descriptive’ attributes and applying logic for attributes involved in business logic should work.

If this is true, the good news is that we don’t need the previous value or a timestamp field in the message to be able to save the field consistently, and we’ll greatly reduce the number of concurrency exceptions. The problem is that it requires thinking harder about the concurrency scenarios that could be present in the application. If you don’t, you’ll probably get inconsistent data. It’s a dangerous approach. Having the old values or using a timestamp is a safer approach, and it works in most cases (unless you are in the high concurrency case).

The other problem we have with pure messages is that in hierarchical messages we don’t know what was added/changed/deleted. I don’t see a good way to handle this. Thinking out loud:

• I could delete all the child records and add the new ones. This has performance problems and, more importantly, deleting a row can trigger a cascading delete in a table that is not included in the message, and that row cannot be recreated.

• I can try to infer the state for each row:
    o Rows with an invalid primary key value are new rows (e.g., rows with negative ids)
    o Rows that exist in the database need to be updated (this requires checking whether every line in the message exists in the database).
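
The inference in the second option could look roughly like this in Python. The function name and the result keys are assumptions. The ‘unknown’ and ‘missing_from_message’ buckets are my addition, not part of the options above: they only report rows the two rules cannot classify, without deciding that a missing row was deleted.

```python
from typing import Dict, Iterable, List, Set


def infer_operations(message_ids: Iterable[int],
                     database_ids: Set[int]) -> Dict[str, List[int]]:
    """Guess the operation for each row from its primary key alone."""
    result: Dict[str, List[int]] = {"insert": [], "update": [], "unknown": []}
    seen: Set[int] = set()
    for row_id in message_ids:
        seen.add(row_id)
        if row_id < 0:
            # Invalid (negative) key: the client created this row.
            result["insert"].append(row_id)
        elif row_id in database_ids:
            # The row already exists, so it has to be updated.
            result["update"].append(row_id)
        else:
            # Valid-looking key that is not in the database.
            result["unknown"].append(row_id)
    # Rows stored for this parent that the message does not mention.
    result["missing_from_message"] = sorted(database_ids - seen)
    return result
```

Every existing row ends up in ‘update’ whether or not it changed, and one database lookup per message is needed to build database_ids, which is the cost mentioned above.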

So, I still don’t have an acceptable solution for this problem if I want to stick with ‘pure’ messages.

Of course this won’t be a problem if we don’t update the database. This seems to be the approach used by some of the SOA gurus. Working with a database with only inserts solves this problem, but it adds significant complexity to the application (instead of joining by foreign key values you need to join by foreign key + date), and it only works if the database was designed that way. It is not easy to apply this approach to existing databases and applications.
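
To show what ‘foreign key + date’ means in practice, here is a small Python sketch of a lookup against an insert-only table, where every change to an item is stored as a new row. ItemVersion, valid_from and item_as_of are hypothetical names, and the in-memory list stands in for a real table.

```python
from datetime import date
from typing import List, NamedTuple


class ItemVersion(NamedTuple):
    item_id: int
    valid_from: date  # date this version of the item was inserted
    price: str


def item_as_of(versions: List[ItemVersion], item_id: int,
               as_of: date) -> ItemVersion:
    """Return the version of the item that was current on a given date."""
    candidates = [v for v in versions
                  if v.item_id == item_id and v.valid_from <= as_of]
    if not candidates:
        raise LookupError(f"Item {item_id} did not exist on {as_of}")
    # The latest version inserted on or before the date wins.
    return max(candidates, key=lambda v: v.valid_from)
```

An order line would store the item id, and the order date would pick the matching item version, instead of a plain join on the item id.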

As a summary, if we stick with simple classes for the messages, we’ll need to:

• Add logic to handle potential concurrency problems that won’t be handled by optimistic concurrency
• Find a way to know what operation to apply to each row, or have a flag in each row indicating what happened to that row (and we don’t have simple classes anymore).
• Only insert in our database, adding a lot of work at the application level.

Another solution could be to build a(nother) WS* standard for serializing diffgrams and use them. Even if there are going to be scenarios where other solutions would be better, the diffgrams way makes everything easier. And that’s what we need.

 

 

RE: Doc to DB

Jimmy looks intrigued by a conversation Mats and I had and asks some questions. He already asked me to write something about it and I did not, but answering questions is much easier than writing a white paper on this, so this time I’ll do what he asks ;).

As background, the idea is that if we could find an easy way to build applications that work well with SOA, then we could apply that solution to building any kind of application, while keeping some of the SO advantages in non-SO applications.

The idea is to work with services but instead of using a domain model in the middle, map the message to the database, while making sure that the business logic is executed in the right places.

Now I’ll address Jimmy’s questions:

> The borders of the docs will be static, right? Could that become a problem for flexibility?

Yes, they will be static. It can be a problem for flexibility if changing the doc structure is difficult. If it is not, when you need a new field in the doc you just add it. Note that in SO there’s no good way to have ‘dynamic docs’, so we need to make sure it’s easy to change the doc structure.

> Do you envision overlapping docs or not?

Yes. The idea is that ‘Customer’ has a different set of fields depending on the context, so they overlap. For example, in the Order service the customer has Id/Name but in the Customer service it has all the customer fields. This is, IMHO, one of the main problems that OO has when working with SO. In OO a Customer is the same thing everywhere. In SO, it’s not.

> Would you recommend the solution for in-proc situations too?

Yes. If not, we are just adding another mapping layer, and that’s what we try to avoid.

> Do you see any drawbacks with behavior-less classes?

If we manage to find generic ways to solve all the behavior problems, then this problem does not exist. As we probably won’t (there will always be border cases that we won’t be able to handle), we’ll need to organize the code around the ‘docs’ well.

> Will there be many places to change if there is a change in the database? Or do you envision some central model which will be the only place to change? If so, how do you envision the implementation of that model?

You can do a doc-to-db mapper, and that’s one of the parts of the problem that we already solved with DeKlarit.

> There won't be anything like an Identity Map or a Unit of Work across several docs, right?

No

> How is this better than DataSets? How is it worse?

It is unrelated to DataSets ;). You can do this with or without DataSets. If you are asking how different this is from using DataSets in the ‘VS.NET’ way, then I’ll say that it is very different, because the doc structure does not map to database tables, so your DataSets don’t break when your db schema changes. That’s the ‘why it is better’. I don’t see how it’s worse. Also, we are not talking about using IDbDataAdapters, but about having a generator or a runtime mapper that knows how to load the DataSet from the database, with a mapping defined with metadata somewhere.

> If the client's platform is unknown, what to give the client then? Validation stuff anyway? If so, in what format?

Let’s say we manage to express validations in metadata and we can know which validations to apply to each document. We can build generators/interpreters for that metadata on any platform (JavaScript, T-SQL, C#/Java, etc.). The format can be a custom language, XML, a JavaScript subset, etc.
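
As a rough illustration of ‘one piece of metadata, several targets’, here is a Python sketch with a tiny interpreter and a JavaScript generator for the same rules. The metadata format (field/op/value/message), the rule contents, and the function names are all assumptions, not an existing format. It is limited to comparison operators that are spelled the same way in both languages.

```python
import json
import operator
from typing import Any, Dict, List

# Comparison operators with identical syntax in Python and JavaScript.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}

# Hypothetical validation metadata for an order line document.
ORDER_LINE_RULES: List[Dict[str, Any]] = [
    {"field": "quantity", "op": ">", "value": 0,
     "message": "Quantity must be positive"},
    {"field": "discount", "op": "<=", "value": 50,
     "message": "Discount cannot exceed 50"},
]


def validate(doc: Dict[str, Any], rules: List[Dict[str, Any]]) -> List[str]:
    """Interpreter: return the message of every rule the document breaks."""
    return [rule["message"] for rule in rules
            if not OPERATORS[rule["op"]](doc[rule["field"]], rule["value"])]


def to_javascript(rule: Dict[str, Any]) -> str:
    """Generator: emit the same check as a line of client-side JavaScript."""
    if rule["op"] not in OPERATORS:
        raise ValueError(f"Unsupported operator: {rule['op']}")
    return "if (!(doc.{field} {op} {value})) errors.push({message});".format(
        field=rule["field"],
        op=rule["op"],
        value=json.dumps(rule["value"]),
        message=json.dumps(rule["message"]),
    )
```

The server can run validate directly, while a web client can be given the output of to_javascript, so the rule is written once.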

> How do we ask for a certain doc? By an enum (or similar) parameter? Or many fine grained services so the client can aggregate the doc by itself? Or only a few coarse grained services?

Only a few coarse grained services, like OrderAdapter.GetOrder(OrderRequest request)/ OrderAdapter.UpdateOrder(Order). 

> Regarding global rules (rules that require more than a single doc), will they go to the database? Or to a global doc? Or to what?

Global rules are rules that can be applied to more than one single doc (they don’t ‘require more than a single doc’). You express them in metadata. In DeKlarit it would be something like:

error(NotEnoughInventoryException, "Not enough inventory") if Item.Inventory < 0

Then the framework should be able to know in which documents it should apply this rule (all the documents which could cause the Item.Inventory to be decreased), and generate code in those ‘document adapters’, or it could generate a trigger in the database. It could also generate code in the presentation layer for the rules that don’t require reading the database to be evaluated. We don't do this in DeKlarit (yet ;).
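
One way such a framework could decide where a rule applies is sketched below in Python: each document declares the attributes it can change, and the rule declares the attributes it depends on. The document names, the metadata shape, and documents_for_rule are hypothetical. This is not how DeKlarit works, just an illustration of the idea.

```python
from typing import Any, Dict, List, Set

# Hypothetical metadata: the attributes each document can modify.
DOCUMENT_WRITES: Dict[str, Set[str]] = {
    "Order": {"Order.Total", "Item.Inventory"},
    "Customer": {"Customer.Name"},
    "StockAdjustment": {"Item.Inventory"},
}

# The inventory rule expressed as data.
INVENTORY_RULE: Dict[str, Any] = {
    "error": "NotEnoughInventoryException",
    "message": "Not enough inventory",
    "depends_on": {"Item.Inventory"},
}


def documents_for_rule(rule: Dict[str, Any],
                       document_writes: Dict[str, Set[str]]) -> List[str]:
    """Return the documents whose adapters must evaluate the rule."""
    return sorted(name for name, writes in document_writes.items()
                  if writes & rule["depends_on"])
```

With this metadata the rule lands in the Order and StockAdjustment adapters and is skipped for Customer, which never touches Item.Inventory.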

> Would you agree if I say that a doc sounds similar to what Martin Fowler talks about in the Presentation Model pattern?

I’m not sure ;). Fowler does not focus much on the document there, but I can certainly see how having this document would make applying that pattern easy. In addition to playing well with SOA, the documents play well with UIs.

DTOs & Domain Models

Mats has an interesting post on this.

 

Local DTOs & Fowler

Fowler has some comments about Jon Tirsen's post, which I commented on a while ago.

I don't see how his approach can work unless you don't do any client-side validation. If you don't design with distribution in mind, you will couple your validation logic with your domain model, and then you can't validate in the client or you'll need to duplicate logic.

In a comment I got on my previous post someone said that adding JavaScript validations in the web layer is not that bad, and he was probably right. That code duplication is probably unavoidable unless you encode your validations in metadata and generate code both for JavaScript and for the target domain logic language.

I guess that Fowler, as a mostly-Java guy, is probably involved mainly in web applications, where people are used to writing client-side JavaScript and distribution is barely needed (and if it's needed, it is for application integration, where there's no client-side validation). But for rich client applications, I don't think anyone will be happy to rewrite the validations in Java or C#.

This is one of the reasons I think Microsoft should not push an architecture based on a 'pure' domain model. It will make building smart client applications harder.

 

 

 
