W3C XML Schema and XInclude: impossible to use together???


On a project I'm working on for MS Patterns & Practices, we were excited about the XInclude spec becoming a recommendation and eager to use it in our project, which relies heavily on XML for configuration. The modularization that XInclude could introduce transparently was very compelling. Even if it is unlikely to be implemented in .NET v2, we could still take advantage of the XInclude.NET implementation Oleg wrote for the Mvp.Xml project. But apparently it's almost impossible to use XInclude and XSD validation together :(

The problem stems from the fact that XInclude (as per the spec) adds the xml:base attribute to included elements to signal their origin, and the same can potentially happen with xml:lang. Now, the W3C XML Schema spec says:
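To make the effect concrete, here is a minimal sketch of what an XInclude processor does to an included element. The file names and element names are made up for illustration, and the exact URI written into xml:base depends on the processor (it may be absolute rather than relative):

```xml
<!-- main.xml: the including document (hypothetical file name) -->
<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="modules/logging.xml"/>
</config>

<!-- modules/logging.xml: the included document (hypothetical) -->
<logging level="verbose"/>

<!-- What the validator sees after inclusion: the included top-level
     element now carries an xml:base attribute it never had before -->
<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <logging level="verbose" xml:base="modules/logging.xml"/>
</config>
```

If the schema for logging declares only the level attribute, the extra xml:base is what trips validation.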

3.4.4 Complex Type Definition Validation Rules

Validation Rule: Element Locally Valid (Complex Type)
...

3 For each attribute information item in the element information item's [attributes] excepting those whose [namespace name] is identical to http://www.w3.org/2001/XMLSchema-instance and whose [local name] is one of type, nil, schemaLocation or noNamespaceSchemaLocation, the appropriate case among the following must be true:

It then goes on to detail that everything else needs to be declared explicitly in your schema, including xml:lang and xml:base, therefore :S:S:S.

So, either you modify all your schemas so that each and every element includes those attributes (either by inheriting from a base type or by using an attribute group reference), or your validation is bound to fail as soon as someone decides to include something. Note that even if you could modify all your schemas, sometimes that means changing their semantics too: a simple-typed element (with a type derived from xs:string, for example) now has to become a complex type with a simple content model, only to accommodate the attributes. Ouch!!! And what's worse, if you're generating your API from the schema using tools such as xsd.exe or the much better XsdCodeGen custom tool, the new API will look very different, and you may have to make substantial changes to your application code.
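As a rough sketch of that schema change, assuming a hypothetical title element that used to be a plain xs:string, the workaround would look something like this. The schemaLocation points at the W3C-published schema for the xml: namespace; a local copy would work just as well:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Import the xml: namespace so xml:base and xml:lang can be referenced -->
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/xml.xsd"/>

  <!-- Before: <xs:element name="title" type="xs:string"/> -->

  <!-- After: a complex type with simple content, only to carry the attributes -->
  <xs:element name="title">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute ref="xml:base"/>
          <xs:attribute ref="xml:lang"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The same two attribute references would have to be repeated (or pulled in through a shared attribute group or base type) on every element that might ever be the target of an include.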

This is an important issue that should be solved in .NET v2, or XInclude will be condemned to poor adoption in .NET. I don't know how other platforms will solve the W3C inconsistency, but I've logged this as a bug and I'm proposing that a property be added to the XmlReaderSettings class to specify that XML Core attributes should be ignored for validation, such as XmlReaderSettings.IgnoreXmlCoreAttributes = true. Note that there are a lot of Ignore* properties already, so it would fit in quite naturally.

Please vote for the bug if you feel it's important, and better yet, think of a better solution to it!!! ;)


2 Comments

  • Yeah, that's painful stuff. Just voted.

  • This is another case that highlights the differences between the people with a markup mentality (that free annotatability is the architectural core of XML: i.e., extensibility) and the people with a data storage mentality (that the purpose of schemas is to let you store your documents as efficiently as possible.)



    A markup person would say "Of course you want to be able to add any attribute in any other namespace if you want to" while a storage person would say "Yikes, we cannot allow that every simple type could take extra attributes: think of the storage costs!"



    The idea of xml:lang is that natural language data always belongs to a language: if you like, that language is a type-facet of prose. XML built this awareness in, and XML Schemas pushed it out in order to cater to people with the data storage requirement or, at least, that mentality.



    Another way to look at this XInclude issue is that it shows the gap in our current standards when it comes to inherited data values, for example an attribute value on a parent whose effect reaches its descendants.
