High-performance XML (V): in-memory XML Schema validation without re-parsing

Friday, June 25, 2004


I may not have stressed enough one of the most important features enabled by the XPathNavigatorReader: in-memory XML Schema validation, without re-parsing, of arbitrary sources exposed as an XPathNavigator.

When a document has been edited in memory and needs to be validated again, developers typically resort to OuterXml -> new XmlTextReader -> new XmlValidatingReader -> validate (and re-parse!):

    XmlDocument doc = GetModifiedDocument(); // Get the modified doc somehow.

    // Create the reader from the XML string taken through OuterXml.
    // Serializing to a string and reading it back forces a full re-parse.
    XmlValidatingReader vr = new XmlValidatingReader(
        new XmlTextReader(new StringReader(doc.OuterXml)));

There is an absolutely unnecessary re-parsing step that degrades performance. The same scenario can be solved trivially with the XPathNavigatorReader:

    XmlDocument doc = GetModifiedDocument(); // Get the modified doc somehow.

    // Create the validating reader with the new reader over the root document navigator.
    XmlValidatingReader vr = new XmlValidatingReader(
        new XPathNavigatorReader(doc.CreateNavigator()));

That "simple" change completely bypasses the need to re-parse the document. Needless to say, the bigger the document, the higher the cost of re-parsing it. In my tests with a fairly small document (~50 KB) I could save about 30-40% of the processing time. And if you use an XPathDocument instead, the saving skyrockets to more than 60%! As usual, this shows the superiority of the XPathDocument as a generic in-memory XML store. I can't wait for the Whidbey release, when it will offer all of XmlDocument's features and more.

As I explained in my previous post, there's another interesting story for the XPathNavigatorReader, and that's document fragment validation. Because the reader treats the navigator's current position as the root node, you can validate a subset of the document against a refined schema that covers only that subset. Especially with complex documents and schemas, this can significantly improve performance too.
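Fragment validation might look roughly like the sketch below. It is only an illustration under a few assumptions: the sample document, the inline schema, and the class name FragmentValidation are made up for this example, and the Mvp.Xml.XPath namespace for XPathNavigatorReader is assumed (adjust it to match the version of the project you downloaded). The navigator is positioned on the customer element, so the reader exposes that element as if it were the document root, and only a schema describing customer is needed.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.XPath;
using Mvp.Xml.XPath; // Assumed namespace of XPathNavigatorReader.

class FragmentValidation // Hypothetical class name, for illustration only.
{
    static void Main()
    {
        // Hypothetical document; only the <customer> fragment gets validated.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<order><customer><name>Ann</name></customer><items/></order>");

        // Hypothetical "refined" schema that describes just the fragment.
        string xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
            "<xs:element name='customer'><xs:complexType><xs:sequence>" +
            "<xs:element name='name' type='xs:string'/>" +
            "</xs:sequence></xs:complexType></xs:element></xs:schema>";
        XmlSchema schema = XmlSchema.Read(new StringReader(xsd), null);

        // Position a navigator on the fragment to validate.
        XPathNodeIterator it = doc.CreateNavigator().Select("/order/customer");
        if (!it.MoveNext())
            return; // Nothing to validate.

        // The reader treats the navigator's current node as the root.
        XmlValidatingReader vr = new XmlValidatingReader(
            new XPathNavigatorReader(it.Current));
        vr.ValidationType = ValidationType.Schema;
        vr.Schemas.Add(schema);
        vr.ValidationEventHandler += new ValidationEventHandler(OnValidation);

        // Validation happens as the reader is consumed.
        while (vr.Read()) { }
    }

    // Reports each validation problem instead of throwing.
    static void OnValidation(object sender, ValidationEventArgs e)
    {
        Console.WriteLine("{0}: {1}", e.Severity, e.Message);
    }
}
```

Nothing in this sketch serializes the fragment to a string, so the validating reader works directly over the nodes already held in memory.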

The full project source code can be downloaded from SourceForge.

Enjoy and please give us feedback on the project!

Check out the Roadmap to high performance XML.
