Subset document loading and transformation with XPathNavigatorReader
Thanks to Tom Smalley, who pointed it out, I fixed a bug that prevented the XPathNavigatorReader from being used to load a new XmlDocument or XPathDocument. This feature is very useful when you have to apply a transformation to a subset of a document. For this purpose, the MSDN documentation suggests using XmlDocument both for loading the entire document and for loading the subset, which is the most inefficient way of performing transformations in .NET.
The code suggested for this scenario is the following (I modified it to print each child of the root, which is more useful):
XslTransform xslt = new XslTransform();
xslt.Load("print_root.xsl");
// Load the entire doc.
XmlDocument doc = new XmlDocument();
doc.Load("library.xml");
// Create a new document for each child
foreach (XmlNode testNode in doc.DocumentElement.ChildNodes)
{
    XmlDocument tmpDoc = new XmlDocument();
    // OuterXml serializes the node to a string, which LoadXml parses again.
    tmpDoc.LoadXml(testNode.OuterXml);
    // Transform the subset.
    xslt.Transform(tmpDoc, null, Console.Out, null);
}
Note that each node to be transformed is parsed twice, because the temporary document is loaded from the raw string returned by the OuterXml property. With the XPathNavigatorReader you can avoid this parsing cost altogether, and work with the XSLT-optimized XPathDocument using the following code:
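XslTransform xslt = new XslTransform();
// Always pass evidence! The Evidence overloads of Load take a reader or navigator rather than a URL string.
xslt.Load(new XmlTextReader("print_root.xsl"), null, this.GetType().Assembly.Evidence);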
// Load the entire doc.
XPathDocument doc = new XPathDocument("library.xml");
// Create a new document for each child
XPathNodeIterator books = doc.CreateNavigator().Select("/library/book");
while (books.MoveNext())
{
    // Load a doc from the current navigator using a reader over it.
    XPathDocument tmpDoc = new XPathDocument(
        new XPathNavigatorReader(books.Current));
    // Transform the subset.
    xslt.Transform(tmpDoc, null, Console.Out, null);
}
Note that XML parsing happens only once, when the full document is loaded. With a relatively large (300 KB) dump of the dsPubs database and a slightly less trivial stylesheet, the latter approach yields a 2X performance increase (you already know that you gain about 30% from using XPathDocument alone).
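If you want to try the same pattern without pulling in Mvp.Xml, here is a minimal sketch that relies only on classes built into .NET 2.0 and later. It swaps XPathNavigatorReader for the built-in XPathNavigator.ReadSubtree() and XslTransform for XslCompiledTransform, so it is an approximation of the approach above, not the Mvp.Xml implementation. The SubsetTransform class, the TransformEach method and the inline sample XML and stylesheet are hypothetical names and data made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Xsl;

// Hypothetical helper, for illustration only (not part of Mvp.Xml).
public static class SubsetTransform
{
    // Transforms every node selected by 'xpath' as if it were a standalone
    // document and returns one output string per selected node.
    public static List<string> TransformEach(string xml, string stylesheet, string xpath)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(stylesheet)))
        {
            xslt.Load(styleReader);
        }

        // The XML text is parsed exactly once, here.
        XPathDocument doc;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            doc = new XPathDocument(reader);
        }

        List<string> results = new List<string>();
        XPathNodeIterator nodes = doc.CreateNavigator().Select(xpath);
        while (nodes.MoveNext())
        {
            // ReadSubtree exposes the current element and its descendants as an
            // XmlReader over the in-memory tree; no string is serialized or re-parsed.
            using (XmlReader subtree = nodes.Current.ReadSubtree())
            {
                XPathDocument tmpDoc = new XPathDocument(subtree);
                StringWriter output = new StringWriter();
                xslt.Transform(tmpDoc, null, output);
                results.Add(output.ToString());
            }
        }
        return results;
    }

    public static void Main()
    {
        // Made-up sample data standing in for library.xml and print_root.xsl.
        string xml =
            "<library>" +
            "<book><title>First</title></book>" +
            "<book><title>Second</title></book>" +
            "</library>";
        string stylesheet =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
            "<xsl:output method='text'/>" +
            "<xsl:template match='/book'><xsl:value-of select='title'/></xsl:template>" +
            "</xsl:stylesheet>";

        foreach (string result in TransformEach(xml, stylesheet, "/library/book"))
        {
            Console.WriteLine(result);
        }
    }
}
```

Because each subset becomes the root of its own temporary document, the stylesheet matches /book rather than /library/book. The subtree is still copied into a new XPathDocument for every node, so this sketch only removes the string round trip, not the cost of building the temporary document.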
XPathNavigatorReader is part of the open-source Mvp.Xml project. The full project source code can be downloaded from SourceForge.
Enjoy and please give us feedback on the project!
Update: this technique does incur the cost of an additional parse step. Check High-performance XML (III): subtree transformations without re-parsing for a better approach.
Check out the Roadmap to high performance XML.