Note: this entry has moved.
Today I received a notification from eBay that they will finally be removing support for .NET Passport on January 24, 2004. One big non-MS customer leaves the field... how many more will follow now?
On a project I'm working on for MS Patterns & Practices, we were excited about the XInclude spec becoming a recommendation and eager to use it in our project, which relies heavily on XML for configuration. The modularization that XInclude could introduce transparently was very compelling. Even if it is unlikely to be implemented in .NET v2, we could still take advantage of the Mvp.Xml project's XInclude.NET implementation, which Oleg wrote. But apparently it's almost impossible to use XInclude and XSD validation together :(
The problem stems from the fact that XInclude (as per the spec) adds the xml:base attribute to included elements to signal their origin, and the same can potentially happen with xml:lang. Now, the W3C XML Schema spec says:
Validation Rule: Element Locally Valid (Complex Type)
...
3 For each attribute information item in the element information item's [attributes] excepting those whose [namespace name] is identical to http://www.w3.org/2001/XMLSchema-instance and whose [local name] is one of type, nil, schemaLocation or noNamespaceSchemaLocation, the appropriate case among the following must be true:
It then goes on to detail that every other attribute must be declared explicitly in your schema, including xml:lang and xml:base, therefore :S:S:S.
So, either you modify all your schemas so that each and every element allows those attributes (either by inheriting from a base type or by using an attribute group reference), or your validation is bound to fail as soon as someone decides to include something. Note that even if you can modify all your schemas, sometimes that also means changing their semantics: a simple-typed element (with a type derived from xs:string, for example) now has to become a complex type with a simple content model, only to accommodate the attributes. Ouch!!! And what's worse, if you're generating your API from the schema using tools such as xsd.exe or the much better XsdCodeGen custom tool, the new API will look very different, and you may have to make substantial changes to your application code.
This is an important issue that should be solved in .NET v2, or XInclude will be condemned to poor adoption in .NET. I don't know how other platforms will solve the W3C inconsistency, but I've logged this as a bug, and I'm proposing that a property be added to the XmlReaderSettings class to specify that XML Core attributes should be ignored for validation, such as XmlReaderSettings.IgnoreXmlCoreAttributes = true. Note that there are a lot of Ignore* properties already, so it would be quite natural.
Please vote for the bug if you feel it's important, and better yet, think of a better solution to it!!! ;)
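Until something like that exists, one crude workaround is to strip the XML Core attributes from the post-inclusion document before handing it to the validator. Here is a minimal sketch of that idea, written in Python with only the standard library rather than .NET, just to show the shape of it. The `strip_xml_core_attributes` helper and the sample document are made up for illustration; this is not the proposed XmlReaderSettings property, and it throws away xml:base and xml:lang information you may actually want to keep.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
# The two attributes XInclude processing may add to included elements.
CORE_ATTRS = {f"{{{XML_NS}}}base", f"{{{XML_NS}}}lang"}


def strip_xml_core_attributes(root):
    """Remove xml:base and xml:lang from every element, in place.

    Hypothetical helper: returns how many attributes were removed.
    """
    removed = 0
    for elem in root.iter():
        for name in CORE_ATTRS & set(elem.attrib):
            del elem.attrib[name]
            removed += 1
    return removed


# Made-up example of what a document might look like after inclusion.
SAMPLE = """<config>
  <section xml:base="http://example.org/included.xml" xml:lang="en" name="logging">
    <level>debug</level>
  </section>
</config>"""

root = ET.fromstring(SAMPLE)
removed = strip_xml_core_attributes(root)
print(removed, root.find("section").attrib)
```

After this pass the tree only carries the attributes the schema knows about, so it can be validated against the unmodified schema. The obvious downside is that relative URIs inside the included content can no longer be resolved against their original location.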
Regular expressions are really powerful and very cool. Most people think of them as just a validation mechanism. They are missing a big scenario enabled by regexes: parsing.
Some other people think that if you're doing any parsing, you **have** to use parser generator tools (e.g. yacc/lex, antlr, coco/r), build a formal grammar of your language, and so on. But do you really **need** to get into that? Do you want proof that you can achieve the same goal with regular expressions? The ASP.NET page parser is built with regular expressions, and not only in v1.x, but in the Whidbey version too.
Wanna confirm? Fire up Reflector, search for the TemplateParser class in the System.Web.UI namespace, and look at the ParseStringInternal method. There you will see how the BaseParser class, which contains all the regular expressions for the several pieces of a page, is used to parse the page source.
I've built a number of parsers with regexes, from simple expression parsers (e.g. a more flexible and powerful expression format than DataBinder.Eval) to full template file parsing (e.g. templates with ASP-like syntax for codegen, in the spirit of CodeSmith, NVelocity, etc.). And it works very well. And code that uses very complex regular expressions doesn't have to be a cryptic, impossible-to-read, never-ending line of near garbage that only you can understand.
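To give a feel for what I mean, here is a minimal sketch of a regex-based tokenizer for an ASP-like template syntax. It is written in Python to keep it short; in .NET the same readability trick is available through RegexOptions.IgnorePatternWhitespace. The two-construct syntax (`<%= expr %>` and `<% code %>`) and the `tokenize` function are assumptions made up for this example. This is not how the ASP.NET parser itself is written, nor one of my actual parsers.

```python
import re

# re.VERBOSE lets the pattern carry whitespace and comments, so a long
# expression can be laid out one construct per line.
TOKEN = re.compile(
    r"""
    <%= \s* (?P<expr>.*?) \s* %>     # output block: emit the value of an expression
  | <%  \s* (?P<code>.*?) \s* %>     # code block: run a statement
    """,
    re.VERBOSE | re.DOTALL,
)


def tokenize(template):
    """Split a template into (kind, text) tuples; kind is text, expr or code."""
    tokens = []
    pos = 0
    for match in TOKEN.finditer(template):
        if match.start() > pos:
            # Literal text between the previous block and this one.
            tokens.append(("text", template[pos:match.start()]))
        kind = match.lastgroup  # name of the group that matched
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    if pos < len(template):
        tokens.append(("text", template[pos:]))
    return tokens


tokens = tokenize("Hello <%= name %>!<% count += 1 %>")
for kind, text in tokens:
    print(kind, repr(text))
```

Named groups plus one commented alternative per construct keep the pattern readable, and adding a new construct (directives, comments, and so on) is mostly a matter of adding another line to the pattern.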
Bottom line: learn regular expressions. There are a lot of very real problems that you can solve SO easily with them...