Introduction
This article explains what XPath expressions are and why they can be extremely useful to a C# programmer.
Background
When I first started .NET programming, I was immediately exposed to the use of XML. It was everywhere, and I had hardly any exposure to it previously. After understanding why XML documents were used so widely, I decided to incorporate XML files into my next application. So off I went, added the System.Xml namespace to my project, and created an XmlDocument object. I called the LoadXml() method and passed in a valid XML string. "Great!" I thought, but soon discovered I had no idea how I would get the data I wanted out of the XmlDocument object.
Using XPath in C#
Throughout this guide, I will refer to the following XML file:
<?xml version="1.0" encoding="utf-8"?>
<books>
<book>
<title>A beginners guide to XPath</title>
<author>Gary Francis</author>
<description>A book that explains XPath for beginners</description>
<data type="Price">12.00</data>
<data type="ISBN">1234567890</data>
</book>
<book>
<title>Advanced C# Programming</title>
<author>A. Uther</author>
<description>Advanced applied C# techniques.</description>
<data type="Price">47.00</data>
</book>
<book>
<title>Understanding C# for beginners</title>
<author>Any body</author>
<description>How to get started with C# and .NET</description>
<data type="Price">12.00</data>
<data type="Comment">This was a great book... It helped Loads.</data>
<data type="Comment">Excellent material if you new to C#.</data>
</book>
</books>
The above XML file contains information about books. As you can imagine, this file could be a lot more complex, but for the sake of simplicity, we will leave it like this.
Before we can do anything useful with this data, we need to create an XmlDocument object and load the data into it. From within Visual Studio, create a new C# Windows Forms application. On the form, drop a button, and double-click the button to bring up its event handler. The following code loads the XML data from a file:
XmlDocument document = null;
XmlNodeList nodeList = null;
XmlNode node = null;
try
{
document = new XmlDocument();
document.Load("Data.xml");
}
catch (Exception ex)
{
MessageBox.Show("Error loading 'Data.xml'. Exception: " + ex.Message);
}
In order for the above code to compile, you will need to import the System.Xml namespace with a using directive at the top of your code file. This example also assumes that you have a file called "Data.xml" in the output directory of your project. The easiest way to do this is to add a new XML file into your project, insert the above XML data into the file, and name the file Data.xml. Finally, you should set the "Copy to Output Directory" property of the XML file to "Copy always".
Now that we have successfully loaded the data, we need to get at some of it. To do this, we will use an XPath expression. The data we are going to retrieve initially is an XmlNodeList of all book elements within the XML file. The XPath expression to achieve this is:
/books/book
Don't worry too much that you don't know what this means as all will be revealed shortly. So we add the following code to the method:
nodeList = document.SelectNodes("/books/book");
This will populate our XmlNodeList with all of the book elements. Within each of these elements, we know there is going to be a <title> element, so we can use another XPath query to access that. The following code will achieve this for each of the book elements we just retrieved:
foreach (XmlNode book in nodeList)
{
MessageBox.Show(book.SelectSingleNode("title").InnerText);
}
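The two steps above can also be tried outside a Windows Forms project. The following is a minimal sketch, assuming the XML arrives as a string rather than being read from Data.xml. The class name BookTitles and the method name GetTitles are made up for this illustration and are not part of the original project.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class BookTitles
{
    // Returns the text of every <title> found directly under /books/book.
    // (BookTitles and GetTitles are hypothetical names used for illustration.)
    public static List<string> GetTitles(string xml)
    {
        XmlDocument document = new XmlDocument();
        document.LoadXml(xml); // LoadXml parses a string; Load reads from a file

        List<string> titles = new List<string>();
        foreach (XmlNode book in document.SelectNodes("/books/book"))
        {
            // Relative expression: evaluated against the current book node.
            XmlNode title = book.SelectSingleNode("title");
            if (title != null) // SelectSingleNode returns null when nothing matches
            {
                titles.Add(title.InnerText);
            }
        }
        return titles;
    }
}
```

Checking the result of SelectSingleNode for null is worth the extra line, because a book element without a title would otherwise cause a NullReferenceException.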
That was simple, right? So, right now, I bet you can imagine all sorts of ways you could use XPath in your applications, and you would be right. There is still the slight problem of the XPath syntax.
Building XPath Expressions
As this is just a beginner's tutorial, I will go over only the basic XPath expressions and what they mean. As time goes on, you might want to try more complex XPath queries such as reading XML data in reverse (going from a child node to a parent node). This is outside the scope of this document, but there are plenty of other references on the internet that should be able to help.
The first important thing you should remember when using XPath is the context you are in when you try to use the expression. A relative expression (one that does not start with a slash) is evaluated from the node you call it on, whereas an expression that starts with / always begins at the root of the document. For example, if we had used the books expression whilst we were trying to select the title of the book, we would have in fact been looking for a books node within the book node we were already in. For this to have worked, our XML document would have had to look something like:
<books>
<book>
<books>
<book></book>
</books>
</book>
</books>
From the above example, I hope you can understand the importance of context as this will become apparent when you try to put XPath to use in your applications.
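To see the effect of context for yourself, the following sketch evaluates an expression from the first book node and reports how many nodes it matched. ContextDemo and CountFromFirstBook are hypothetical names chosen for this example, and the XML is again assumed to be supplied as a string.

```csharp
using System.Xml;

public static class ContextDemo
{
    // Evaluates the expression with the first <book> element as the context
    // node and returns the number of matches.
    // (ContextDemo and CountFromFirstBook are illustrative names only.)
    public static int CountFromFirstBook(string xml, string expression)
    {
        XmlDocument document = new XmlDocument();
        document.LoadXml(xml);

        // SelectSingleNode returns the first match in document order.
        XmlNode firstBook = document.SelectSingleNode("/books/book");
        return firstBook.SelectNodes(expression).Count;
    }
}
```

With the book data used throughout this guide, "title" matches one node, "books" matches none (a book has no books child), and "/books/book" matches all three books, because the leading slash restarts the search at the document root regardless of the node the call was made on.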
The first thing you need to know is how to select nodes with XPath. Here are some examples of how you can do this:
Expression | Description | Example |
nodename | Selects all child elements of the current node that have the given name | books selects the books elements that are children of the current node |
/ | At the start of an expression, begins the selection at the document root; between two names, selects children of the node on its left | books/book selects all of the book elements that are direct children of a books element |
// | Selects matching nodes at any depth below the node on its left, or anywhere in the document when it starts the expression | books//book selects every book element nested inside the books element, however deeply. A book element outside of the books element would not be selected; //book would select it |
@ | Selects attributes, or tests them inside a [ ] predicate | books/book/data[@type='Price'] selects all data elements whose type attribute has the value Price |
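The expressions in the table can be tried against an XmlDocument that has already been loaded. The following is a small sketch rather than a definitive implementation; SelectionDemo and its method names are invented for this example.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class SelectionDemo
{
    // Returns the text of every <data type="Price"> element, in document order.
    public static List<string> GetPrices(XmlDocument document)
    {
        List<string> prices = new List<string>();
        foreach (XmlNode price in document.SelectNodes("/books/book/data[@type='Price']"))
        {
            prices.Add(price.InnerText);
        }
        return prices;
    }

    // Returns the value of the type attribute of every <data> element.
    // Here @type selects the attribute nodes themselves.
    public static List<string> GetDataTypes(XmlDocument document)
    {
        List<string> types = new List<string>();
        foreach (XmlNode attribute in document.SelectNodes("//data/@type"))
        {
            types.Add(attribute.Value);
        }
        return types;
    }

    // Counts every <book> element anywhere in the document.
    public static int CountAllBooks(XmlDocument document)
    {
        return document.SelectNodes("//book").Count;
    }
}
```

Note the two different uses of @: inside square brackets it filters the data elements, whereas at the end of a path it returns the attributes.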
It is also important to note that you can use indexes when you want to select a particular node. For example, /books[1] will return the first "books" element. Notice that indexes are not zero-based: the first element has index 1.
Note: This is the behaviour defined by the W3C specification. In some browsers, this is implemented incorrectly and the indexes are treated as zero-based. This should only be a concern if you are copying code that was originally written for certain browsers. I believe IE5 and IE6 fall foul of this, but I have never investigated it, so this may or may not be the case.
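The one-based indexing can be checked with a few lines of code. This is a sketch under the assumption that the document has the books/book/title layout used in this guide; IndexDemo and GetTitleAt are hypothetical names.

```csharp
using System.Xml;

public static class IndexDemo
{
    // Returns the title of the book at the given one-based position,
    // or null when no book exists at that position.
    // (IndexDemo and GetTitleAt are illustrative names only.)
    public static string GetTitleAt(XmlDocument document, int position)
    {
        XmlNode title = document.SelectSingleNode("/books/book[" + position + "]/title");
        return title == null ? null : title.InnerText;
    }
}
```

When run with the System.Xml classes, position 1 returns the first book's title, and position 0 matches nothing, so the method returns null.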
Recommended
I would seriously recommend that you download the source files for this project. It will allow you to play around with different XML data scenarios and XPath expressions easily and will help you to learn. I always say that it is better to learn from your mistakes than it is to not bother trying.
Conclusion
Well, that is about it for my first contribution to CodeProject. If you find this useful, or you would like me to write a follow up article which goes into some more detail on some of the more complex XPath expressions that might be needed, drop me an e-mail. My address can be found on my CodeProject profile page.
I can be contacted via e-mail at francisg04@gmail.com.
My blog can be found at http://csharpcollection.spaces.msn.com.