Manipulate XML data with XPath and XmlDocument (C#)

4 Feb 2005 · CPOL · 4 min read
An article about manipulating XML source data.

Introduction

Using a short, easy-to-read piece of XML source data, I'll show you how to select XML nodes and navigate through them with XPathNavigator and XPathNodeIterator. I'll provide a few straightforward XPath expressions that you should be able to follow without difficulty. The last part contains sample code to update, insert, and remove XML nodes.

Some Concepts

  • XML - Extensible Markup Language describes data structures in text format using your own vocabulary: it has no predefined tags, so the meaning of each tag is defined by whoever designs the document rather than by the language itself.
  • XSL - Extensible Stylesheet Language is designed for expressing stylesheets for XML documents. XSL is to XML as CSS is to HTML.
  • XML Transformation - a user-defined algorithm that transforms a given XML document into another format, such as XML, HTML, or XHTML. The algorithm is described in XSL.
  • XSLT - the part of XSL that transforms an XML document into another XML document, or into another type of document that a browser recognizes, such as HTML or XHTML. XSLT uses XPath.
  • XPath - a syntax for addressing (selecting) parts of an XML document.

To keep this article simple and clear, I'll split the material into two parts and leave XSL and XSLT for my next article.

Using the code

Here is the source XML data:

XML
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd country="USA">
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <price>10.90</price>
  </cd>
  <cd country="UK">
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
    <price>10.0</price>
  </cd>
  <cd country="USA">
    <title>Greatest Hits</title>
    <artist>Dolly Parton</artist>
    <price>9.90</price>
  </cd>
</catalog>

If you want to select all of the price elements, here is the code:

C#
using System.Xml;
using System.Xml.XPath;
....
string fileName = "data.xml";
XPathDocument doc = new XPathDocument(fileName);
XPathNavigator nav = doc.CreateNavigator();

listBox1.Items.Clear();
try
{
  // Compile a standard XPath expression; an invalid expression throws here,
  // so the compile step belongs inside the try block
  XPathExpression expr = nav.Compile("/catalog/cd/price");
  XPathNodeIterator iterator = nav.Select(expr);

  // Iterate over the node set
  while (iterator.MoveNext())
  {
     // Current is repositioned on every MoveNext, so read its value right away
     listBox1.Items.Add("price: " + iterator.Current.Value);
  }
}
catch (Exception ex)
{
   Console.WriteLine(ex.Message);
}

In the above code, we used "/catalog/cd/price" to select all the price elements. If you just want to select all the cd elements with price greater than 10.0, you can use "/catalog/cd[price>10.0]". Here are some more examples of XPath expressions:

Expression                                Result
/catalog                                  selects the root element
/catalog/cd                               selects all the cd elements of the catalog element
/catalog/cd/price                         selects all the price elements of all the cd elements of the catalog element
/catalog/cd[price>10.0]                   selects all the cd elements with a price greater than 10.0
starts with a slash (/)                   represents an absolute path to an element
starts with two slashes (//)              selects matching elements anywhere in the document, at any depth
//cd                                      selects all cd elements in the document
/catalog/cd/title | /catalog/cd/artist    selects all the title and artist elements of the cd elements of catalog
//title | //artist                        selects all the title and artist elements in the document
/catalog/cd/*                             selects all the child elements of all cd elements of the catalog element
/catalog/*/price                          selects all the price elements that are grandchildren of catalog
/*/*/price                                selects all price elements that have exactly two ancestor elements
//*                                       selects all elements in the document
/catalog/cd[1]                            selects the first cd child of catalog
/catalog/cd[last()]                       selects the last cd child of catalog
/catalog/cd[price]                        selects all the cd elements that have a price child
/catalog/cd[price=10.90]                  selects cd elements with a price of 10.90
/catalog/cd[price=10.90]/price            selects the price elements whose value is 10.90
//@country                                selects all "country" attributes
//cd[@country]                            selects cd elements that have a "country" attribute
//cd[@*]                                  selects cd elements that have any attribute
//cd[@country='UK']                       selects cd elements whose "country" attribute equals 'UK'
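
To try a few of these expressions without the demo application, the following minimal sketch counts the nodes each one selects. It assumes the sample catalog is inlined as a string instead of being loaded from data.xml; the class name XPathSamples and the helper Count are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

public static class XPathSamples
{
    // The article's sample catalog, inlined so the sketch needs no data.xml
    public const string Xml =
        "<catalog>" +
        "<cd country=\"USA\"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>" +
        "<cd country=\"UK\"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>10.0</price></cd>" +
        "<cd country=\"USA\"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>" +
        "</catalog>";

    // Hypothetical helper: returns how many nodes an XPath expression selects
    public static int Count(string xpath)
    {
        XPathDocument doc = new XPathDocument(new StringReader(Xml));
        XPathNavigator nav = doc.CreateNavigator();
        return nav.Select(xpath).Count;
    }

    public static void Main()
    {
        string[] expressions =
        {
            "/catalog/cd[price>10.0]",
            "/catalog/cd[last()]",
            "//title | //artist",
            "//cd[@country='UK']",
            "/catalog/cd/*"
        };
        foreach (string xpath in expressions)
        {
            Console.WriteLine(xpath + " -> " + Count(xpath) + " node(s)");
        }
    }
}
```

Against the sample data, the five expressions select 1, 1, 6, 1, and 9 nodes respectively. Note that price>10.0 is a strict comparison, so the cd priced at exactly 10.0 is not selected.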

To update a cd node, first find the node to update with SelectSingleNode, then create a new cd element. After setting the InnerXml of the new node, call the ReplaceChild method of the root XmlElement to swap the new node in for the old one. The code is as follows:

C#
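// Load the document; the using block closes the reader even if Load throws
XmlDocument doc = new XmlDocument();
using (XmlTextReader reader = new XmlTextReader(FILE_NAME))
{
    doc.Load(reader);
}

// Select the cd node with the matching title
// (this predicate fails if oldTitle itself contains a single quote)
XmlElement root = doc.DocumentElement;
XmlNode oldCd = root.SelectSingleNode("/catalog/cd[title='" + oldTitle + "']");

// SelectSingleNode returns null when nothing matches, so guard before replacing
if (oldCd != null)
{
    XmlElement newCd = doc.CreateElement("cd");
    newCd.SetAttribute("country", country.Text);

    // Escape user text so characters such as & and < cannot break the markup
    newCd.InnerXml =
        "<title>" + System.Security.SecurityElement.Escape(this.comboBox1.Text) + "</title>" +
        "<artist>" + System.Security.SecurityElement.Escape(artist.Text) + "</artist>" +
        "<price>" + System.Security.SecurityElement.Escape(price.Text) + "</price>";

    root.ReplaceChild(newCd, oldCd);

    // Save the output to a file
    doc.Save(FILE_NAME);
}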

Similarly, use InsertAfter and RemoveChild to insert and remove a node; see the demo for details. When you run the application, make sure that "data.xml" is in the same directory as the EXE file.
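
The insert and remove operations can be sketched roughly as follows. This is not the demo's code: it assumes a shortened two-cd catalog built in memory with LoadXml, and the class name InsertRemoveSketch and the method Run are hypothetical names used only for illustration.

```csharp
using System;
using System.Xml;

public static class InsertRemoveSketch
{
    // Builds a two-cd catalog in memory, inserts a cd after the first one,
    // removes the UK cd, and returns the resulting XML
    public static string Run()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<catalog>" +
            "<cd country=\"USA\"><title>Empire Burlesque</title></cd>" +
            "<cd country=\"UK\"><title>Hide your heart</title></cd>" +
            "</catalog>");
        XmlElement root = doc.DocumentElement;

        // Insert: create the new cd and place it after an existing sibling
        XmlElement newCd = doc.CreateElement("cd");
        newCd.SetAttribute("country", "USA");
        XmlElement title = doc.CreateElement("title");
        title.InnerText = "Greatest Hits"; // InnerText escapes special characters
        newCd.AppendChild(title);
        XmlNode first = root.SelectSingleNode("cd[1]");
        root.InsertAfter(newCd, first);

        // Remove: find the node, then ask its parent to drop it
        XmlNode uk = root.SelectSingleNode("cd[@country='UK']");
        if (uk != null)
        {
            root.RemoveChild(uk);
        }
        return doc.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Both methods are called on the parent of the node being inserted or removed, which is why the sketch goes through root rather than through the cd node itself.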

Points of Interest

XmlDocument is an in-memory (cached) tree representation of an XML document, which makes it somewhat resource-intensive. If you have a large XML document and limited memory, use XmlReader and XmlWriter instead for better performance.
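
As a rough illustration of the streaming alternative, the sketch below collects every price value with XmlReader and never builds a tree. It assumes price elements contain plain text only; the class name StreamingSketch and the method ReadPrices are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class StreamingSketch
{
    // Reads every <price> value from the input, one node at a time
    public static List<string> ReadPrices(TextReader input)
    {
        List<string> prices = new List<string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                bool isPrice = reader.NodeType == XmlNodeType.Element
                               && reader.Name == "price"
                               && !reader.IsEmptyElement;
                if (isPrice)
                {
                    reader.Read();            // step onto the text inside <price>
                    prices.Add(reader.Value); // empty string if there is no text
                }
            }
        }
        return prices;
    }

    public static void Main()
    {
        string xml = "<catalog><cd><price>10.90</price></cd><cd><price>9.90</price></cd></catalog>";
        foreach (string price in ReadPrices(new StringReader(xml)))
        {
            Console.WriteLine("price: " + price);
        }
    }
}
```

Because the reader only moves forward, memory use stays roughly constant regardless of document size. The trade-off is that there is no random access and no XPath querying over the stream.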

History

Version 1.0. This is my first article on CP, so I expect there are many flaws. The XML source data and the background knowledge come from the web and MSDN; I just wrote a demo app to show them. No copyright reserved.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
