High-performance XML (V): in-memory XML Schema validation without re-parsing
I may not have stressed enough one of the most important
features enabled by the
XPathNavigatorReader: in-memory XML Schema validation (without re-parsing) of
arbitrary sources exposed as an
XPathNavigator.
When XML editing is required, developers typically resort to OuterXml->new XmlTextReader->new XmlValidatingReader->Validate (and re-parse!):
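As a rough sketch of that typical pattern, assuming the .NET 1.x APIs of the time (XmlValidatingReader is marked obsolete in later framework versions), the re-parsing approach looks something like the following. The class name, method name, and schema path parameter are hypothetical, chosen only for illustration:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public class ReparseValidation
{
    // Hypothetical helper: validates an already loaded (and possibly edited) document.
    public static void Validate(XmlDocument doc, string schemaPath)
    {
        // OuterXml serializes the whole DOM to a string...
        string xml = doc.OuterXml;
        // ...and the XmlTextReader then parses that string all over again.
        XmlTextReader textReader = new XmlTextReader(new StringReader(xml));
        XmlValidatingReader validator = new XmlValidatingReader(textReader);
        validator.ValidationType = ValidationType.Schema;
        // A null namespace makes the collection use the schema's own targetNamespace.
        validator.Schemas.Add(null, schemaPath);
        validator.ValidationEventHandler += new ValidationEventHandler(OnValidation);

        // Validation happens as a side effect of reading to the end.
        while (validator.Read()) { }
        validator.Close();
    }

    private static void OnValidation(object sender, ValidationEventArgs e)
    {
        Console.WriteLine("{0}: {1}", e.Severity, e.Message);
    }
}
```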
There is an absolutely unnecessary re-parsing step that degrades performance. The same scenario can be solved trivially with the XPathNavigatorReader:
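A minimal sketch of the navigator-based version follows. It assumes the XPathNavigatorReader lives in the Mvp.Xml.XPath namespace and has a constructor that takes an XPathNavigator; check both against the version of the project you download. The class and method names are hypothetical:

```csharp
using System;
using System.Xml;
using System.Xml.Schema;
using System.Xml.XPath;
using Mvp.Xml.XPath; // assumed namespace of XPathNavigatorReader

public class NavigatorValidation
{
    // Hypothetical helper: works for XmlDocument, XPathDocument,
    // or any other source that can hand out an XPathNavigator.
    public static void Validate(IXPathNavigable source, string schemaPath)
    {
        XPathNavigator navigator = source.CreateNavigator();
        // The reader walks the in-memory nodes directly: no OuterXml, no second parse.
        XPathNavigatorReader reader = new XPathNavigatorReader(navigator);
        XmlValidatingReader validator = new XmlValidatingReader(reader);
        validator.ValidationType = ValidationType.Schema;
        validator.Schemas.Add(null, schemaPath);
        validator.ValidationEventHandler += new ValidationEventHandler(OnValidation);

        while (validator.Read()) { }
        validator.Close();
    }

    private static void OnValidation(object sender, ValidationEventArgs e)
    {
        Console.WriteLine("{0}: {1}", e.Severity, e.Message);
    }
}
```

Because the helper takes an IXPathNavigable, the same call accepts either an XmlDocument or an XPathDocument, which makes it easy to compare the two stores.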
That "simple" change completely bypasses the need to re-parse
the document. Needless to say, the bigger the document, the
higher the cost of re-parsing. In my tests with a fairly small document
(~50 KB) I could save about
30-40% of the processing time. And if you use an
XPathDocument instead, the
savings skyrocket to more than 60%! As
usual, this shows the superiority of the
XPathDocument as a generic in-memory XML store.
I can't wait for the Whidbey release, when it will offer all of
XmlDocument's features and more.
As I explained in my previous post, there's another interesting story for the
XPathNavigatorReader, and that's document
fragment validation. Because the reader treats the navigator's
current position as the root node, you can validate a
subset of the document against a narrower schema. Especially with complex
documents and schemas, this can significantly improve
performance too.
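Fragment validation could be sketched along these lines, under the same assumptions about the XPathNavigatorReader namespace and constructor. The XPath expression, file paths, and names are placeholders for illustration, not part of the project:

```csharp
using System;
using System.Xml;
using System.Xml.Schema;
using System.Xml.XPath;
using Mvp.Xml.XPath; // assumed namespace of XPathNavigatorReader

public class FragmentValidation
{
    // Hypothetical helper: validates only the first node matched by fragmentXPath.
    public static void Validate(string xmlPath, string fragmentXPath, string fragmentSchemaPath)
    {
        XPathDocument doc = new XPathDocument(xmlPath);
        XPathNodeIterator nodes = doc.CreateNavigator().Select(fragmentXPath);
        if (!nodes.MoveNext())
        {
            return; // nothing matched, so there is nothing to validate
        }

        // nodes.Current is positioned on the fragment; the reader treats it as the root.
        XPathNavigatorReader reader = new XPathNavigatorReader(nodes.Current);
        XmlValidatingReader validator = new XmlValidatingReader(reader);
        validator.ValidationType = ValidationType.Schema;
        // The schema only needs to describe the fragment's element, not the whole document.
        validator.Schemas.Add(null, fragmentSchemaPath);
        validator.ValidationEventHandler += new ValidationEventHandler(OnValidation);

        while (validator.Read()) { }
        validator.Close();
    }

    private static void OnValidation(object sender, ValidationEventArgs e)
    {
        Console.WriteLine("{0}: {1}", e.Severity, e.Message);
    }
}
```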
The full project source code can be downloaded from SourceForge.
Enjoy and please give us feedback on the project!
Check out the Roadmap to high performance XML.