XmlNodes from XPathNodeIterator
Note: this entry has moved.
Every now and then I receive complaints about XPathNodeIterator. You know, it
allows iteration where each Current element is an XPathNavigator. Not too
useful if you're looking for OuterXml, or are too dependent on the
XmlNode-based API. The most worrying issue is that people use this argument
against using compiled XPath expressions, which are known to significantly
improve performance (see the Performant XML (I) and Performant XML (II)
articles). The reason is that in order to get an XmlNodeList, you have to use
the SelectNodes method of XmlNode (and therefore XmlDocument), whose
signatures are as follows:
    public XmlNodeList SelectNodes(string xpath);
    public XmlNodeList SelectNodes(string xpath, XmlNamespaceManager nsmgr);
Both overloads take the XPath as a string. This means that most developers
won't compile their expressions, simply because in order to use an
XPathExpression they have to explicitly create a navigator for the
node/document and work against the cursor-style API:
    // Statically compile and cache the expression.
    // Init and load a document.
    // Create navigator, clone expression and execute query.
    XPathNodeIterator it = document.CreateNavigator().Select(expr.Clone());
    while (it.MoveNext())
    {
        // Do something with it.Current which is an XPathNavigator.
    }
This approach generally means that in order to optimize the code by compiling
the expression, you actually have to refactor significant pieces of your code.
And you don't have any other choice if you need to sort the query by using
XPathExpression.AddSort.
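To make the cursor-style pattern concrete, here is a minimal, self-contained sketch of a cached compiled expression being cloned, given a sort key with XPathExpression.AddSort, and executed through a navigator. The XPath, the sample XML, and the names CompiledQueryDemo and SortedItemValues are placeholders made up for this illustration, and the static XPathExpression.Compile call assumes .NET 2.0 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

public static class CompiledQueryDemo
{
    // Compiled once and cached. The XPath is a placeholder for this sketch.
    private static readonly XPathExpression ItemsQuery =
        XPathExpression.Compile("/root/item");

    public static List<string> SortedItemValues(XmlDocument document)
    {
        // Clone the cached expression so the sort key stays local to this call.
        XPathExpression expr = ItemsQuery.Clone();

        // Sort the results by each node's own string value, ascending, as text.
        expr.AddSort(".", XmlSortOrder.Ascending, XmlCaseOrder.None,
                     string.Empty, XmlDataType.Text);

        List<string> values = new List<string>();
        XPathNodeIterator it = document.CreateNavigator().Select(expr);
        while (it.MoveNext())
        {
            // it.Current is an XPathNavigator positioned on the current result.
            values.Add(it.Current.Value);
        }
        return values;
    }

    public static void Main()
    {
        // Hypothetical sample document.
        XmlDocument document = new XmlDocument();
        document.LoadXml(
            "<root><item>pear</item><item>apple</item><item>fig</item></root>");

        // Prints: apple, fig, pear
        Console.WriteLine(string.Join(", ", SortedItemValues(document)));
    }
}
```

Note that everything inside the loop works against XPathNavigator, which is exactly the part that forces the refactoring described above.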
There's a solution to this problem, as usual :).
You know that XPathNavigator is an abstract class that allows multiple
underlying implementations to offer the same cursor-style API and gain the
instant benefit of XPath querying. Aaron Skonnard has some interesting
implementations showing this concept. Therefore, when you're iterating the
results of the query and asking for the current element, you're actually using
something that is dependent on the implementation. This object (that is, the
XPathNodeIterator.Current property), besides being an XPathNavigator, can also
implement other interfaces as part of the underlying implementation. As such,
queries executed against an XmlNode-based element will have each Current
element implementing IHasXmlNode, whereas XPathDocument-based ones will
implement IXmlLineInfo.
And what is this useful for? Well, just to get access to additional information
beyond the standard XPathNavigator API that depends on the concrete
implementation. So, inside the while loop above, we can ask:
    if (it.Current is IHasXmlNode)
    {
        XmlNode node = ((IHasXmlNode)it.Current).GetNode();
        // Work with your beloved DOM API ;)
    }
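A small self-contained sketch of this check, run against both kinds of source, shows the difference. The class name, method name, and sample XML are invented for the illustration. The same query is executed over an XmlDocument and over an XPathDocument, counting how many results expose an underlying XmlNode through IHasXmlNode.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public static class NavigatorInterfacesDemo
{
    // Hypothetical sample input, used only for this illustration.
    private const string SampleXml =
        "<root><item>a</item><item>b</item></root>";

    // Counts the query results whose navigator exposes an underlying XmlNode.
    public static int CountDomBackedResults(IXPathNavigable source, string xpath)
    {
        int count = 0;
        XPathNodeIterator it = source.CreateNavigator().Select(xpath);
        while (it.MoveNext())
        {
            // Only XmlNode-based navigators implement IHasXmlNode.
            if (it.Current is IHasXmlNode hasNode && hasNode.GetNode() != null)
            {
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        XmlDocument dom = new XmlDocument();
        dom.LoadXml(SampleXml);
        XPathDocument readOnly = new XPathDocument(new StringReader(SampleXml));

        Console.WriteLine(CountDomBackedResults(dom, "/root/item"));      // 2
        Console.WriteLine(CountDomBackedResults(readOnly, "/root/item")); // 0
    }
}
```

The second count is zero because an XPathDocument has no DOM nodes behind its navigators, which is why the cast has to be guarded.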
Still, this doesn't solve the problem that you have to iterate differently than
you're used to, and that significant rewrites are still needed when you use
compiled expressions. The solution is to use the knowledge about the underlying
implementation (i.e. you KNOW you're querying against an XmlDocument) and get
an easier API to it. This can be achieved by creating an IEnumerable
implementation that provides iteration over the XPathNodeIterator but exposes
XmlNode. Also, a helper method returning an array of XmlNodes is useful. It
would be used as follows:
    XPathNodeIterator it = doc.CreateNavigator().Select(expr.Clone());
    XmlNodesEnumerable nodes = new XmlNodesEnumerable(it);
    foreach (XmlNode node in nodes)
    {
        // Work with each XmlNode here.
    }
    // Or use the array directly:
    XmlNode[] list = nodes.ToArray();
Complete code for the custom enumerable object and its internal enumerator
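As a rough idea of what such a class involves, here is a minimal sketch. It is not the original listing. It assumes the iterator comes from a query against an XmlNode-based source, and it swaps the hand-written internal enumerator for a C# iterator block (yield return) and the generic IEnumerable<XmlNode>, both of which need .NET 2.0 or later. The demo class and sample XML are placeholders.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Xml;
using System.Xml.XPath;

// Sketch only: wraps an XPathNodeIterator and yields the XmlNode behind each result.
public class XmlNodesEnumerable : IEnumerable<XmlNode>
{
    private readonly XPathNodeIterator iterator;

    public XmlNodesEnumerable(XPathNodeIterator iterator)
    {
        if (iterator == null) throw new ArgumentNullException("iterator");
        this.iterator = iterator;
    }

    public IEnumerator<XmlNode> GetEnumerator()
    {
        // Clone so that each enumeration starts from the wrapped iterator's position.
        XPathNodeIterator it = iterator.Clone();
        while (it.MoveNext())
        {
            IHasXmlNode hasNode = it.Current as IHasXmlNode;
            if (hasNode == null)
            {
                throw new InvalidOperationException(
                    "The query was not executed against an XmlNode-based source.");
            }
            yield return hasNode.GetNode();
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    // Helper that copies all nodes into an array.
    public XmlNode[] ToArray()
    {
        return new List<XmlNode>(this).ToArray();
    }
}

public static class XmlNodesEnumerableDemo
{
    public static void Main()
    {
        // Hypothetical sample document and query.
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><item>a</item><item>b</item></root>");
        XPathExpression expr = XPathExpression.Compile("/root/item");

        XmlNodesEnumerable nodes =
            new XmlNodesEnumerable(doc.CreateNavigator().Select(expr));
        foreach (XmlNode node in nodes)
        {
            Console.WriteLine(node.OuterXml); // <item>a</item>, then <item>b</item>
        }
    }
}
```

Throwing on a non-DOM navigator is one possible design choice; silently skipping such results would also work, at the cost of hiding a query run against the wrong kind of document.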
Update: check an even better approach here.
Check out the Roadmap to high performance XML.