High-performance XML (V): in-memory XML Schema validation without re-parsing
I may not have stressed enough one of the most important features enabled by
the XPathNavigatorReader: in-memory XML Schema validation
(without re-parsing) of arbitrary sources exposed as an XPathNavigator.
When XML editing is required, developers typically validate the edited document by chaining OuterXml -> new XmlTextReader -> new XmlValidatingReader -> Validate (and re-parse!).
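The chain above can be sketched roughly as follows. This is a minimal illustration and not the original listing: the class name `ReparsingValidator` and its members are hypothetical, while the framework types are the .NET 1.x-era ones named in the text.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, for illustration only.
public class ReparsingValidator
{
    private bool valid;

    // Validates an edited XmlDocument by serializing it and parsing it again.
    public bool Validate(XmlDocument doc, XmlSchema schema)
    {
        valid = true;

        // OuterXml serializes the whole in-memory tree to a string...
        XmlTextReader textReader = new XmlTextReader(new StringReader(doc.OuterXml));

        // ...and the validating reader then parses that string from scratch.
        XmlValidatingReader validator = new XmlValidatingReader(textReader);
        validator.ValidationType = ValidationType.Schema;
        validator.Schemas.Add(schema);
        validator.ValidationEventHandler += new ValidationEventHandler(OnValidationEvent);

        // Validation happens as a side effect of reading to the end.
        while (validator.Read()) { }
        validator.Close();

        return valid;
    }

    private void OnValidationEvent(object sender, ValidationEventArgs e)
    {
        // Only schema errors (not warnings) mark the document as invalid.
        if (e.Severity == XmlSeverityType.Error) valid = false;
    }
}
```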
That re-parsing step is absolutely unnecessary and degrades performance. The same scenario can be solved trivially with the XPathNavigatorReader, which reads straight from the in-memory nodes.
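A sketch of the XPathNavigatorReader variant, under a few assumptions: the `Mvp.Xml.XPath` namespace, a constructor that takes an `XPathNavigator`, and the reader being accepted by `XmlValidatingReader` are taken from my recollection of the project and should be checked against the version you download. The class name `NavigatorValidator` is hypothetical.

```csharp
using System.Xml;
using System.Xml.Schema;
using System.Xml.XPath;
using Mvp.Xml.XPath; // assumed namespace of XPathNavigatorReader; adjust to your version

// Hypothetical helper, for illustration only.
public class NavigatorValidator
{
    private bool valid;

    // Validates any IXPathNavigable source (XmlDocument, XPathDocument, ...)
    // without serializing it to a string first.
    public bool Validate(IXPathNavigable source, XmlSchema schema)
    {
        valid = true;

        // Assumption: XPathNavigatorReader exposes a constructor taking an XPathNavigator
        // and can be handed to XmlValidatingReader in place of an XmlTextReader.
        XPathNavigatorReader navReader = new XPathNavigatorReader(source.CreateNavigator());

        XmlValidatingReader validator = new XmlValidatingReader(navReader);
        validator.ValidationType = ValidationType.Schema;
        validator.Schemas.Add(schema);
        validator.ValidationEventHandler += new ValidationEventHandler(OnValidationEvent);

        // Reading to the end walks the in-memory nodes; no OuterXml, no second parse.
        while (validator.Read()) { }
        validator.Close();

        return valid;
    }

    private void OnValidationEvent(object sender, ValidationEventArgs e)
    {
        if (e.Severity == XmlSeverityType.Error) valid = false;
    }
}
```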
That "simple" change completely bypasses the need to re-parse the document.
Needless to say, the bigger the document, the higher
the cost of re-parsing. In my tests with a fairly small document (~50 KB) I
could save about 30-40% of the processing time. And if you use
an XPathDocument instead, the saving
skyrockets to more than 60%! As usual, this shows the superiority of
the XPathDocument as a generic in-memory XML store. I can't
wait for the Whidbey release, when it will offer all of XmlDocument's features and
more.
As I explained in
my previous post, there's another interesting story for the XPathNavigatorReader,
and that's document fragment validation. Because the reader treats the
navigator's current position as the root node, you can
validate a subset of the document against a more specific schema. Especially with complex documents
and schemas, this can significantly improve performance too.
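Fragment validation could look roughly like this. Again a sketch under assumptions: the `Mvp.Xml.XPath` namespace and the `XPathNavigatorReader(XPathNavigator)` constructor are assumed, and `FragmentValidator`, `xpath` and `fragmentSchema` are hypothetical names introduced for illustration.

```csharp
using System.Xml;
using System.Xml.Schema;
using System.Xml.XPath;
using Mvp.Xml.XPath; // assumed namespace of XPathNavigatorReader; adjust to your version

// Hypothetical helper, for illustration only.
public class FragmentValidator
{
    private bool valid;

    // Validates only the first node matched by 'xpath' against a schema
    // that describes just that fragment.
    public bool Validate(IXPathNavigable source, string xpath, XmlSchema fragmentSchema)
    {
        valid = true;

        XPathNodeIterator matches = source.CreateNavigator().Select(xpath);
        if (!matches.MoveNext()) return false; // nothing to validate

        // The reader treats the navigator's current position as its root,
        // so only the selected subtree is read and validated.
        XPathNavigator fragment = matches.Current.Clone();
        XmlValidatingReader validator =
            new XmlValidatingReader(new XPathNavigatorReader(fragment));
        validator.ValidationType = ValidationType.Schema;
        validator.Schemas.Add(fragmentSchema);
        validator.ValidationEventHandler += new ValidationEventHandler(OnValidationEvent);

        while (validator.Read()) { }
        validator.Close();

        return valid;
    }

    private void OnValidationEvent(object sender, ValidationEventArgs e)
    {
        if (e.Severity == XmlSeverityType.Error) valid = false;
    }
}
```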
The full project source code can be downloaded from SourceForge.
Enjoy and please give us feedback on the project!
Check out the Roadmap to high performance XML.