
XMLLib for PUGXML with XPath

29 Oct 2009 · CPOL · 5 min read
A library for PugXML which implements XPath

Introduction

The CPugXMLNode class wraps the xml_node class and provides additional XPath capability.

Background

The PugXML parser is very useful in lightweight applications which use XML. However, it does not readily compile in VC++ 6.0, and it does not provide the ability to perform XPath queries. I am a huge fan of XML and XPath, but I didn't want to write an XPath query analyzer if I could find a good one on the web. I found an excellent implementation of XPath in the Blue Library, by Josh Harler. However, Josh implemented his own XML parser, and by the time I found it, I had too much time invested in PugXML. So, I modified his code to reference PugXML instead of his XML parser, rearranged it so that it builds into a single library, and changed some of his class names to avoid name conflicts. Other than that, I left his code untouched and left the copyright notices in place.

XPath is a query language for accessing nodes in an XML document. There is an excellent tutorial located here. I used many of the examples for my test cases.

I had a few other technical problems to solve, which I will discuss in detail below.

I am sure these classes could do with additional enhancements. However, they meet my needs at the moment, and I am providing them in the hope that you will find them useful as well. I am deeply indebted to all the contributors to CodeProject and SourceForge who make our job easier every day.

Using the Code

The download includes a VC++ 6.0 workspace to build XMLLib and link it with a small unit tester. You can use this as a sample. I have used XMLLib in MFC applications and in COM objects written with ATL. However, XMLLib itself has no dependencies on MFC or ATL. The one exception is a small function I have written to convert the XML text to a CString, which I find very useful for debugging.

Add the following #include to the top of your module:

C++
#include "PugXMLNode.h"

Sample: Adding Existing Nodes onto the Main Document Tree

The following code creates a document object. PugXML provides the append_child() function to allocate a new instance of xml_node and append it to the parent node. But, what if you have created a separate XML tree and just want to graft it onto the original tree? PugXML provides no way to do this, but I have added a new function called append_child_noalloc() which appends an existing node without allocating new memory.

In the following example, I create a new node pNode using new_node(), then I add it to the main tree using append_child_noalloc(). Note that you can wrap pNode up as an xml_node and manipulate the former orphan node, which is now a part of the main XML tree. I found this feature useful in a data-collection app which I wrote, where I needed to collect XML data from a number of different sources and merge the results into a single tree.

C++
//
// Test PugXML
//
void TestXML()
{
    xml_parser* xml = new xml_parser(); // Construct.
    xml->create();
    xml_node root = xml->document();
    xml_node node = root.append_child(node_element); // Add a child element.
    node.append_attribute("111", "222");
    node.name("original");
    DebugNode(node, "node");

    // Correct way to construct a new node outside of a document and append it
    // No exceptions. No memory leaks.
    xml_node_struct* pNode = new_node();    // Allocate node
    xml_node Node = xml_node(pNode);        // Wrap it in a xml_node class
    Node.name("allocNode");
    DebugNode(Node, "Node");
    root.append_child_noalloc(pNode);       // Append stand-alone node to main document
    Node.append_attribute("attr1", "val1");
    Node.append_attribute("attr2", "val2");
    DebugNode(Node, "Node with attribute");

    // Modify first node after appending second node
    node.append_attribute("333", "444");

    DebugNode(root, "root");
    DebugNode(node, "original node with changes intact");
    xml->clear(); // Clear for the next test.
    delete xml;

    // Test creating and destroying a stand-alone node
    xml_node_struct* pNode1 = new_node();    // Allocate node
    Node = xml_node(pNode1);                 // Wrap it in a xml_node class
    Node.name("Node1");
    Node.attribute("one") = "two";
    DebugNode(Node, "Node 1");
    free_node_recursive(pNode1);
}
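
The difference between grafting an existing node and copying it can be sketched without PugXML at all. The following is a minimal illustration written in standard C++11, not the library's code: ToyNode, graft_child(), copy_tree() and free_tree() are hypothetical names invented for this sketch, and the real xml_node_struct has a different layout. It only shows the ownership idea behind append_child_noalloc() (attach a pointer, allocate nothing) next to a deep copy of the kind AppendTree() is described as performing.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parser node; NOT PugXML's xml_node_struct.
struct ToyNode {
    std::string name;
    ToyNode* parent;
    std::vector<ToyNode*> children;
    explicit ToyNode(const std::string& n) : name(n), parent(nullptr) {}
};

// Graft an existing, parentless node onto 'parent' without allocating.
// After the call the tree owns 'child', so the caller must not free it
// separately.
void graft_child(ToyNode* parent, ToyNode* child) {
    assert(parent != nullptr && child != nullptr && child->parent == nullptr);
    child->parent = parent;
    parent->children.push_back(child);
}

// Deep-copy 'source' and everything below it. If 'parent' is not null the
// copy is attached to it. The source tree is left untouched and remains
// owned by the caller.
ToyNode* copy_tree(ToyNode* parent, const ToyNode* source) {
    ToyNode* copy = new ToyNode(source->name);
    if (parent != nullptr) graft_child(parent, copy);
    for (std::size_t i = 0; i < source->children.size(); ++i)
        copy_tree(copy, source->children[i]);
    return copy;
}

// Free a node and all of its descendants.
void free_tree(ToyNode* node) {
    for (std::size_t i = 0; i < node->children.size(); ++i)
        free_tree(node->children[i]);
    delete node;
}
```

In this toy model, a grafted node is released when the tree it joined is freed, whereas a node that was only copied must still be freed by whoever allocated the original.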

Sample: XPath Queries

So far, I haven't shown any use for CPugXMLNode. Everything has been implemented in xml_node. This example shows how to run XPath queries using CPugXMLNode.

CPugXMLNode implements four XPath query functions: FindNodes returns a set of zero or more nodes which match the search criteria; FindNode returns the first matching node; FindValues returns a set of values from the nodes which match the search criteria; and FindValue returns the value from the first node which matches the search criteria. A strict XPath implementation would also return attributes from matching nodes, but I have not implemented that feature. A workaround is to use FindNodes to get a list of matching nodes and then access their attributes.

The following example creates an in-memory XML document and then wraps the nodes with CPugXMLNode. It then calls CPugXMLNode::FindNodes to return a set of matching nodes in an Array. Array and BString (renamed from String) are classes implemented in the Blue Library. Since they are used so heavily in the XPath classes, I decided to keep them. They are fairly simple to use if all you are using them for is running XPath queries. If you want to use them for other purposes, the source is included in the download.

C++
//
// Build an XML document on the fly and perform an XPath query on it
//
void TestXPath()
{
    xml_parser* xml = new xml_parser(); // Construct.
    xml->create();
    xml_node root = xml->document();
    CPugXMLNode Node = root.append_child(node_element);

    Node.name("allocNode");
    Node.append_attribute("attr1", "val1");
    Node.append_attribute("attr2", "val2");

    CPugXMLNode sub1 = Node.append_child(node_element);
    sub1.name("sub1");
    CPugXMLNode sub2 = Node.append_child(node_element);
    sub2.name("sub2");
    CPugXMLNode sub3 = Node.append_child(node_element);
    sub3.name("sub3");

    // Display what we have
    DebugNode(Node, "Node with attribute");

    // XPath queries
    Array<xml_node_struct*> nodes = Node.FindNodes("//sub2 | //sub3");
    for (int i = 0; i < nodes.getSize(); i++)
    {
        CString strMsg;
        strMsg.Format("Node %d", i);
        DebugNode(xml_node(nodes[i]), strMsg);
    }

    xml->clear(); // Release the document, as TestXML() does.
    delete xml;
}
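
To make clear what a query such as "//sub2 | //sub3" is expected to select, here is a small stand-alone sketch in standard C++11. It is not the Blue Library's XPath engine and does not parse XPath at all: ToyElement, collect_named() and first_named() are hypothetical names, and the sketch simply hard-codes the meaning of a union of two descendant name tests, namely every element anywhere in the tree whose name is one of the listed names. It assumes results are wanted in document order, which is common but is an assumption here.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy element; it stands in for a parsed XML node and is
// NOT the xml_node_struct used by PugXML.
struct ToyElement {
    std::string name;
    std::vector<std::unique_ptr<ToyElement>> children;

    explicit ToyElement(std::string n) : name(std::move(n)) {}

    // Append a named child element and return a non-owning pointer to it.
    ToyElement* add(const std::string& childName) {
        assert(!childName.empty()); // XML element names are never empty
        children.push_back(
            std::unique_ptr<ToyElement>(new ToyElement(childName)));
        return children.back().get();
    }
};

// Depth-first, pre-order walk: append to 'out' every element at or below
// 'node' whose name is in 'names'. Pre-order traversal yields document order.
void collect_named(const ToyElement& node,
                   const std::set<std::string>& names,
                   std::vector<const ToyElement*>& out) {
    if (names.count(node.name) != 0) out.push_back(&node);
    for (const auto& child : node.children)
        collect_named(*child, names, out);
}

// Toy analogue of keeping only the first hit; returns nullptr if none match.
const ToyElement* first_named(const ToyElement& node,
                              const std::set<std::string>& names) {
    std::vector<const ToyElement*> hits;
    collect_named(node, names, hits);
    return hits.empty() ? nullptr : hits.front();
}
```

For a tree shaped like the one built in TestXPath(), calling collect_named() with the names sub2 and sub3 gathers those two elements and skips sub1 and the parent element.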

Unit Tester

I have included a demo app/unit tester which is statically linked with XMLLib. Here is a list of test functions provided in the UnitTest demo app:

  • TestXML() Illustrates append_child_noalloc()
  • TestSample() Excerpt from original PugXML article
  • TestXPath() Builds an XML document in memory and does a simple XPath query
  • TestXPathFile() Reads an XML file, parses it, and performs an XPath query
  • TestFindValues() Tests FindValues()

Changes Made to PugXML

  • Modified to compile under VC++ 6.0 as noted in comments to original PugXML article
  • Fixed numerous bugs
  • Made the functions strcmpwild_astr() and strcmpwild_impl() static
  • Added node_attribute type - See CPugXMLNode
  • Added append_node_noalloc() and append_child_noalloc() to give the ability to append pre-allocated nodes

Enhancements Provided in the CPugXMLNode Wrapper Class

  • XPath query functions: FindNodes(), FindNode(), FindValues(), FindValue()
  • AppendTree(): Similar to append_child_noalloc(), except that it performs a deep copy instead of inserting a pointer
  • append_child_element(): Creates a node_element node and names it in a single function call
  • CAddAttributeNodes and CDelAttributeNodes: The XPath functions treat attributes the same as nodes so these classes (derived from xml_tree_walker) are used before and after running the XPath queries to add and delete attribute pseudo-child nodes.
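
The add-then-delete idea behind CAddAttributeNodes and CDelAttributeNodes can be illustrated with a toy tree. This is only a sketch of the technique under my own assumptions, written in standard C++11: ToyItem, make_element(), add_attribute_nodes() and remove_attribute_nodes() are hypothetical names, and the real classes derive from xml_tree_walker and operate on PugXML nodes instead. Before a query, every attribute is mirrored as a pseudo-child so that a node-only matcher can see it; afterwards, the pseudo-children are stripped again.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy node; NOT a PugXML type.
struct ToyItem {
    enum Kind { Element, AttributePseudo };
    Kind kind = Element;
    std::string name;
    std::string value; // only used by attribute pseudo-nodes
    std::vector<std::pair<std::string, std::string>> attributes;
    std::vector<ToyItem> children;
};

ToyItem make_element(const std::string& name) {
    ToyItem e;
    e.name = name;
    return e;
}

// Before a query: mirror each attribute of every element as a pseudo-child.
// Call it once per query; calling it twice would duplicate the pseudo-nodes.
void add_attribute_nodes(ToyItem& node) {
    assert(node.kind == ToyItem::Element);
    // Visit the existing element children first, so the pseudo-nodes
    // appended below are never visited themselves.
    const std::size_t realChildren = node.children.size();
    for (std::size_t i = 0; i < realChildren; ++i)
        if (node.children[i].kind == ToyItem::Element)
            add_attribute_nodes(node.children[i]);
    for (const auto& attr : node.attributes) {
        ToyItem pseudo;
        pseudo.kind = ToyItem::AttributePseudo;
        pseudo.name = attr.first;
        pseudo.value = attr.second;
        node.children.push_back(pseudo);
    }
}

// After a query: strip every pseudo-child again, restoring the original tree.
void remove_attribute_nodes(ToyItem& node) {
    node.children.erase(
        std::remove_if(node.children.begin(), node.children.end(),
                       [](const ToyItem& c) {
                           return c.kind == ToyItem::AttributePseudo;
                       }),
        node.children.end());
    for (auto& child : node.children)
        remove_attribute_nodes(child);
}
```

Tagging the pseudo-nodes with their own kind is what makes the cleanup step safe in this sketch: only nodes that were added for the query are removed, and real child elements are left alone.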

Changes Made to Blue Library XPath Code

  • Combined code in one directory
  • Combined all libraries into one codebase
  • Changed some class names for compatibility
  • Rewrote XPath modules to work with PugXML

History

  • 10/28/09
    • Fixed memory leaks
    • Modified code to compile under VS 2005 and VS 2008
    • Cleaned up test data
  • 12/01/03
    • Fixed typos and added copyright information requested by original author
  • 08/03/03
    • Original release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


