XML in PowerShell: XPath Queries and Namespaces

XPath, a query language standardized by the W3C, is used to extract information from XML documents. It supports far more complex queries than PowerShell's dot notation, but it becomes cumbersome as soon as namespaces are involved.

XML Path Language (XPath) lets you select elements based on attribute values, address sibling nodes by position, and navigate with an abbreviated syntax and wildcards (see the XPath examples at W3C).
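As a minimal sketch of these three features, the following snippet runs a few XPath expressions against a small document. The XML content and the variable names are hypothetical and serve only as an illustration.

```powershell
# Hypothetical sample document without namespaces, used only for illustration.
[xml]$doc = @'
<Inventory>
  <Disk id="d1" format="vmdk"><Size>40</Size></Disk>
  <Disk id="d2" format="vhd"><Size>80</Size></Disk>
  <Disk id="d3" format="vmdk"><Size>120</Size></Disk>
</Inventory>
'@

# Select elements by attribute value.
$vmdkDisks = $doc.SelectNodes("//Disk[@format='vmdk']")

# Address a sibling by position (XPath positions start at 1).
$secondDisk = $doc.SelectSingleNode('/Inventory/Disk[2]')

# Abbreviated syntax (//) plus wildcard (*): every child element of every Disk.
$children = $doc.SelectNodes('//Disk/*')

$vmdkDisks.Count                # 2
$secondDisk.GetAttribute('id')  # d2
$children.Count                 # 3
```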

XPath in the methods of XmlDocument

PowerShell also supports XPath expressions through the methods SelectNodes() and SelectSingleNode(). They are available on objects of the types System.Xml.XmlDocument and System.Xml.XmlElement. The first returns all nodes that match the query, the second only the first match.

For the following explanations, we use a fragment from an .ovf file, a format used to exchange VMs across platforms. After you have read the file into an XML object, you could select all Address elements below Item like this:
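A minimal sketch of both steps follows. It assumes a simplified, namespace-free stand-in for the OVF fragment; the file name, the element values, and the variable names are hypothetical.

```powershell
# Simplified, namespace-free stand-in for an OVF fragment (hypothetical data).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-no-ns.ovf'
@'
<Envelope>
  <VirtualSystem>
    <VirtualHardwareSection>
      <Item><Address>0</Address><ElementName>SCSI Controller</ElementName></Item>
      <Item><Address>1</Address><ElementName>IDE Controller</ElementName></Item>
    </VirtualHardwareSection>
  </VirtualSystem>
</Envelope>
'@ | Set-Content -Path $path -Encoding UTF8

# Read the file into an XmlDocument.
[xml]$ovf = Get-Content -Path $path -Raw

# SelectNodes returns every match, SelectSingleNode only the first one.
$allAddresses = $ovf.SelectNodes('//Item/Address')
$firstAddress = $ovf.SelectSingleNode('//Item/Address')

$allAddresses | ForEach-Object { $_.InnerText }   # 0 and 1
$firstAddress.InnerText                           # 0
```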

In practice, almost all XML documents today contain at least one namespace declaration; the OVF format used here as an example contains several.

In this case, every call to SelectSingleNode() or SelectNodes() that follows the pattern above silently returns an empty result.
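The effect can be reproduced with a shortened, hypothetical OVF-style fragment. The two namespace URIs are the ones commonly found in OVF descriptors; everything else in this sketch is an assumption for illustration.

```powershell
# Hypothetical, shortened OVF-style fragment with a default namespace.
[xml]$ovf = @'
<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"
          xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData">
  <VirtualSystem>
    <VirtualHardwareSection>
      <Item><rasd:Address>0</rasd:Address></Item>
    </VirtualHardwareSection>
  </VirtualSystem>
</Envelope>
'@

# Unprefixed names in XPath 1.0 match only elements in no namespace,
# so neither call finds anything and neither reports an error.
$nodes  = $ovf.SelectNodes('//Item')
$single = $ovf.SelectSingleNode('//Item')

$nodes.Count        # 0
$null -eq $single   # True
```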

The remedy is to create an XmlNamespaceManager object and pass it when calling either of the two methods:
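A minimal sketch of the two commands, assuming a one-line stand-in document and the hypothetical variable names $ovf and $nsManager:

```powershell
# Hypothetical one-line stand-in for the OVF document.
[xml]$ovf = '<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"><Item /></Envelope>'

# 1) Create the manager from the document's name table.
$nsManager = New-Object System.Xml.XmlNamespaceManager -ArgumentList $ovf.NameTable

# 2) Register the default namespace under the freely chosen prefix "ns".
$nsManager.AddNamespace('ns', $ovf.DocumentElement.NamespaceURI)

# Check the registration; this returns the URI bound to "ns".
$nsManager.LookupNamespace('ns')
```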

The first command creates the object and the second adds the default namespace. Interestingly, the document does not define a prefix for the default namespace, but XmlNamespaceManager needs one. This example uses "ns", which we then need in order to refer to elements that carry no prefix in the document (unprefixed attributes are not in the default namespace and need no prefix):
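A sketch of such a query, again with a hypothetical sample document; only the envelope namespace URI is taken from the OVF format:

```powershell
# Hypothetical sample with a default namespace.
[xml]$ovf = @'
<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1">
  <VirtualHardwareSection>
    <Item><ElementName>SCSI Controller</ElementName></Item>
    <Item><ElementName>IDE Controller</ElementName></Item>
  </VirtualHardwareSection>
</Envelope>
'@

$nsManager = New-Object System.Xml.XmlNamespaceManager -ArgumentList $ovf.NameTable
$nsManager.AddNamespace('ns', 'http://schemas.dmtf.org/ovf/envelope/1')

# Every step of the path carries the prefix; the manager is the second argument.
$names = $ovf.SelectNodes('//ns:Item/ns:ElementName', $nsManager)
$first = $ovf.SelectSingleNode('//ns:Item', $nsManager)

$names.Count                               # 2
$names | ForEach-Object { $_.InnerText }   # SCSI Controller, IDE Controller
```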

The XPath expression now uses the prefix for the default namespace, and we pass the XmlNamespaceManager defined above as the second argument. To make all elements reachable, you would of course have to add every namespace that the document declares.

If you want to save yourself the typing, you can download this script from GitHub. It reads all namespace definitions and creates a new XmlNamespaceManager from them.
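The idea can be sketched as follows. This is not the script mentioned above: the function name New-NamespaceManager and its parameters are hypothetical, and the sketch only looks at declarations on the root element, so namespaces declared deeper in the document would be missed.

```powershell
function New-NamespaceManager {
    param(
        [Parameter(Mandatory)] [xml]$Document,
        [string]$DefaultPrefix = 'ns'   # prefix to use for the default namespace
    )
    $manager = New-Object System.Xml.XmlNamespaceManager -ArgumentList $Document.NameTable
    foreach ($attr in $Document.DocumentElement.Attributes) {
        if ($attr.Prefix -eq 'xmlns') {
            # xmlns:rasd="..." registers the prefix "rasd".
            $manager.AddNamespace($attr.LocalName, $attr.Value)
        } elseif ($attr.Name -eq 'xmlns') {
            # xmlns="..." is the default namespace and needs an artificial prefix.
            $manager.AddNamespace($DefaultPrefix, $attr.Value)
        }
    }
    # XmlNamespaceManager is enumerable; the comma keeps PowerShell from unrolling it.
    return ,$manager
}

# Hypothetical sample document.
[xml]$ovf = @'
<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"
          xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData">
  <Item><rasd:Address>0</rasd:Address></Item>
</Envelope>
'@

$ns = New-NamespaceManager -Document $ovf
$addresses = $ovf.SelectNodes('//ns:Item/rasd:Address', $ns)
$addresses.Count   # 1
```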

XPath queries with Select-Xml

For XPath queries, PowerShell offers its own cmdlet, Select-Xml, which simplifies access to XML files. The files do not have to be read explicitly into an XML object; instead, you simply pass the file path to the Path parameter:
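A minimal sketch of such a call, assuming a hypothetical temporary file that contains a shortened OVF-style fragment with a default namespace:

```powershell
# Write a hypothetical sample file; a real .ovf file would be used instead.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-ns.ovf'
@'
<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1">
  <VirtualHardwareSection>
    <Item><ElementName>SCSI Controller</ElementName></Item>
  </VirtualHardwareSection>
</Envelope>
'@ | Set-Content -Path $path -Encoding UTF8

# No [xml] cast and no Get-Content: the cmdlet opens the file itself.
$result = @(Select-Xml -Path $path -XPath '//Item')

# The sample declares a default namespace, so the unprefixed query matches nothing.
$result.Count   # 0
```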

Since the OVF file contains several namespace declarations, this call also returns nothing and gives no hint as to why. The solution here, however, is not to define an XmlNamespaceManager. Instead, you create a hash table that maps a prefix to the namespace URI and pass it to the Namespace parameter:
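A sketch of the hash table and the call, using a hypothetical temporary file and the OVF envelope URI as the default namespace:

```powershell
# Write a hypothetical sample file; a real .ovf file would be used instead.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-ns-hashtable.ovf'
@'
<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1">
  <VirtualHardwareSection>
    <Item><ElementName>SCSI Controller</ElementName></Item>
    <Item><ElementName>IDE Controller</ElementName></Item>
  </VirtualHardwareSection>
</Envelope>
'@ | Set-Content -Path $path -Encoding UTF8

# Key = prefix used in the XPath expression, value = namespace URI.
$namespace = @{ ns = 'http://schemas.dmtf.org/ovf/envelope/1' }

$result = @(Select-Xml -Path $path -XPath '//ns:Item' -Namespace $namespace)
$result.Count   # 2
```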

The prefix "ns" again refers to the default namespace. You can choose any other prefix in the hash table, as long as you use the same one in the XPath query.

A peculiarity of Select-Xml is that it does not return the nodes directly but SelectXmlInfo objects with the properties Node, Path, and Pattern (the XPath expression). To get at the content of the nodes, use Select-Object:
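As a sketch, with a hypothetical XML string passed through the Content parameter instead of a file:

```powershell
# Hypothetical sample, passed as a string via -Content.
$xml = @'
<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1">
  <Item><ElementName>SCSI Controller</ElementName></Item>
  <Item><ElementName>IDE Controller</ElementName></Item>
</Envelope>
'@
$namespace = @{ ns = 'http://schemas.dmtf.org/ovf/envelope/1' }

# Each result wraps the matched node together with its source and the query.
$info = @(Select-Xml -Content $xml -XPath '//ns:Item' -Namespace $namespace)
$info | Select-Object Node, Path, Pattern

# Unwrap the XmlElement objects.
$nodes = @($info | Select-Object -ExpandProperty Node)
$nodes.Count   # 2
```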

Alternatively, you can use the dot operator to access the Node property:
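A short sketch of this variant, assuming a hypothetical one-line document with a single matching element:

```powershell
# Hypothetical one-line sample with exactly one Item.
$xml = '<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"><Item><ElementName>SCSI Controller</ElementName></Item></Envelope>'
$namespace = @{ ns = 'http://schemas.dmtf.org/ovf/envelope/1' }

# Parentheses run the cmdlet first; .Node then unwraps the result.
$node = (Select-Xml -Content $xml -XPath '//ns:Item' -Namespace $namespace).Node

$node.GetType().FullName   # System.Xml.XmlElement
```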

The advantage of this approach is that you get an object of type XmlElement and can use all of its properties and methods. These include, for example, InnerText, which outputs only the text nodes, and InnerXml, which returns the contained subtree as markup:
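A minimal sketch of both properties, with a hypothetical one-line document:

```powershell
# Hypothetical one-line sample; Item has two child elements.
$xml = '<Envelope xmlns="http://schemas.dmtf.org/ovf/envelope/1"><Item><ElementName>SCSI Controller</ElementName><InstanceID>3</InstanceID></Item></Envelope>'
$namespace = @{ ns = 'http://schemas.dmtf.org/ovf/envelope/1' }

$node = (Select-Xml -Content $xml -XPath '//ns:Item' -Namespace $namespace).Node

# InnerText concatenates the text of all descendant nodes.
$text = $node.InnerText     # SCSI Controller3

# InnerXml returns the child elements as markup.
$markup = $node.InnerXml

$text
$markup
```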