High-performance XML (V): in-memory XML Schema validation without re-parsing
I may not have stressed enough one of the most important features enabled by the XPathNavigatorReader: in-memory XML Schema validation (without re-parsing) of arbitrary sources exposed as an XPathNavigator.
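When XML editing is required, developers typically keep the data in an XmlDocument and, to validate it, resort to the chain OuterXml -> new XmlTextReader -> new XmlValidatingReader -> Validate. That serializes the whole document to a string and parses it again.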
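A minimal sketch of that chain, using the .NET 1.x-era classes named above. The inline schema and document are made up for illustration, and the class and method names are mine, not part of the project:

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

class ReparseValidation
{
    static void Main()
    {
        // Hypothetical schema and document, for illustration only.
        string xsd =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
            "<xs:element name='order'><xs:complexType><xs:sequence>" +
            "<xs:element name='id' type='xs:int'/>" +
            "</xs:sequence></xs:complexType></xs:element></xs:schema>";
        XmlSchema schema = XmlSchema.Read(new XmlTextReader(new StringReader(xsd)), null);

        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<order><id>42</id></order>");

        // The costly part: OuterXml serializes the whole DOM to a string,
        // and the XmlTextReader then parses that string a second time.
        XmlTextReader tr = new XmlTextReader(new StringReader(doc.OuterXml));
        XmlValidatingReader vr = new XmlValidatingReader(tr);
        vr.ValidationType = ValidationType.Schema;
        vr.Schemas.Add(schema);
        vr.ValidationEventHandler += new ValidationEventHandler(OnValidationError);

        // Validation happens as a side effect of reading to the end.
        while (vr.Read()) { }
        vr.Close();
    }

    static void OnValidationError(object sender, ValidationEventArgs e)
    {
        Console.WriteLine("Validation error: " + e.Message);
    }
}
```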
That re-parsing step is entirely unnecessary and degrades performance. The same scenario can be solved trivially by wrapping the document's navigator in an XPathNavigatorReader and handing that reader to the XmlValidatingReader.
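A sketch of what that can look like. It assumes the XPathNavigatorReader lives in a `Mvp.Xml.XPath` namespace and has a constructor taking an XPathNavigator; both are assumptions to check against the project source. The helper name `Validate` is hypothetical:

```csharp
using System.Xml;
using System.Xml.Schema;
using System.Xml.XPath;
using Mvp.Xml.XPath; // assumed namespace of XPathNavigatorReader

class InMemoryValidation
{
    // Validates any IXPathNavigable store (XmlDocument, XPathDocument, ...)
    // against an already loaded schema, without serializing it first.
    public static void Validate(IXPathNavigable source, XmlSchema schema)
    {
        // Assumed constructor: exposes the navigator's nodes as an XmlReader,
        // so the validating reader walks the in-memory tree directly.
        XPathNavigatorReader nr = new XPathNavigatorReader(source.CreateNavigator());
        XmlValidatingReader vr = new XmlValidatingReader(nr);
        vr.ValidationType = ValidationType.Schema;
        vr.Schemas.Add(schema);

        // With no ValidationEventHandler attached, the first validation
        // error is thrown as an XmlSchemaException.
        while (vr.Read()) { }
        vr.Close();
    }
}
```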
That "simple" change completely bypasses the need to re-parse the document.
Needless to say, the bigger the document, the higher the cost. In my tests with a fairly small document (~50 KB) I could save about 30-40% of the processing time. And if you use an XPathDocument instead, the savings skyrocket to more than 60%! As usual, this shows the superiority of the XPathDocument as a generic in-memory XML store. I can't wait for the Whidbey release, when it will offer all of XmlDocument's features and more.
As I explained in my previous post, there's another interesting story for the XPathNavigatorReader, and that's document fragment validation. Because the reader treats the navigator's current position as the root node, you can validate a subset of the document against a refined schema. Especially with complex documents and schemas, this can significantly improve performance too.
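The fragment case might look like the following sketch. As before, the `Mvp.Xml.XPath` namespace and the XPathNavigatorReader constructor are assumptions, and `ValidateFragment`, the XPath expression and the fragment schema are hypothetical names supplied by the caller:

```csharp
using System;
using System.Xml;
using System.Xml.Schema;
using System.Xml.XPath;
using Mvp.Xml.XPath; // assumed namespace of XPathNavigatorReader

class FragmentValidation
{
    // Validates only the first node matched by xpath, against a schema
    // that describes just that fragment.
    public static void ValidateFragment(
        IXPathNavigable source, string xpath, XmlSchema fragmentSchema)
    {
        XPathNodeIterator it = source.CreateNavigator().Select(xpath);
        if (!it.MoveNext())
            throw new ArgumentException("No node matches: " + xpath);

        // The reader treats the navigator's current position as the root,
        // so only this subtree is read and validated.
        XPathNavigatorReader nr = new XPathNavigatorReader(it.Current.Clone());
        XmlValidatingReader vr = new XmlValidatingReader(nr);
        vr.ValidationType = ValidationType.Schema;
        vr.Schemas.Add(fragmentSchema);

        // No handler attached: the first error surfaces as XmlSchemaException.
        while (vr.Read()) { }
        vr.Close();
    }
}
```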
The full project source code can be downloaded from SourceForge.
Enjoy and please give us feedback on the project!
Check out the Roadmap to high performance XML.