XMLLib for PUGXML with XPath
A library for PugXML which implements XPath
Introduction
The CPugXMLNode class wraps the xml_node class and provides additional XPath capability.
Background
The PugXML parser is very useful in lightweight applications which use XML. However, it does not readily compile in VC++ 6.0, and it does not provide the ability to perform XPath queries. I am a huge fan of XML and XPath, but I didn't want to write an XPath query analyzer if I could find a good one on the web. I found an excellent implementation of XPath in the Blue Library, by Josh Harler. However, Josh implemented his own XML parser, and by the time I found it, I had too much time invested in PugXML. So, I modified his code to reference PugXML instead of his XML parser. I also rearranged his code so that it would build into a single library, and I had to change some of his class names to avoid name conflicts. Other than that, I left his code untouched and left the copyright notices in place.
XPath is a query language for accessing nodes in an XML document. There is an excellent tutorial located here. I used many of the examples for my test cases.
I had a few other technical problems to solve, which I will discuss in detail below.
I am sure these classes could do with additional enhancements. However, they meet my needs at the moment, and I am providing them in the hope that you will find them useful as well. I am deeply indebted to all the contributors to CodeProject and SourceForge who make our job easier every day.
Using the Code
The download includes a VC++ 6.0 workspace to build XMLLib and link it with a small unit tester. You can use this as a sample. I have used XMLLib in MFC applications and in COM objects written with ATL. However, the library itself has no dependencies on MFC or ATL. The one exception is a small function I wrote to convert the XML text to a CString, which I find very useful for debugging.
Add the following #include to the top of your module:
#include "PugXMLNode.h"
Sample: Adding Existing Nodes onto the Main Document Tree
The following code creates a document object. PugXML provides the append_child() function to allocate a new instance of xml_node and append it to the parent node. But what if you have created a separate XML tree and just want to graft it onto the original tree? PugXML provides no way to do this, but I have added a new function called append_child_noalloc() which appends an existing node without allocating new memory.
In the following example, I create a new node pNode using new_node(), then I add it to the main tree using append_child_noalloc(). Note that you can wrap pNode up as an xml_node and manipulate the former orphan node, which is now a part of the main XML tree. I found this feature useful in a data-collection app which I wrote, where I needed to collect XML data from a number of different sources and merge the results into a single tree.
//
// Test PugXML
//
void TestXML()
{
    xml_parser* xml = new xml_parser(); // Construct.
    xml->create();
    xml_node root = xml->document();
    xml_node node = root.append_child(node_element); // Add a child element.
    node.append_attribute("111", "222");
    node.name("original");
    DebugNode(node, "node");

    // Correct way to construct a new node outside of a document and append it.
    // No exceptions. No memory leaks.
    xml_node_struct* pNode = new_node(); // Allocate node
    xml_node Node = xml_node(pNode); // Wrap it in an xml_node class
    Node.name("allocNode");
    DebugNode(Node, "Node");
    root.append_child_noalloc(pNode); // Append stand-alone node to main document
    Node.append_attribute("attr1", "val1");
    Node.append_attribute("attr2", "val2");
    DebugNode(Node, "Node with attribute");

    // Modify first node after appending second node
    node.append_attribute("333", "444");
    DebugNode(root, "root");
    DebugNode(node, "original node with changes intact");
    xml->clear(); // Clear for the next test.
    delete xml;

    // Test creating and destroying a stand-alone node
    xml_node_struct* pNode1 = new_node(); // Allocate node
    Node = xml_node(pNode1); // Wrap it in an xml_node class
    Node.name("Node1");
    Node.attribute("one") = "two";
    DebugNode(Node, "Node 1");
    free_node_recursive(pNode1);
}
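To make the difference between grafting and copying concrete, here is a minimal, library-independent sketch. It does not use PugXML at all: ToyNode, graft_child(), append_copy() and free_tree() are hypothetical names invented for this illustration, and the real append_child_noalloc() and AppendTree() may well be implemented differently. The sketch only shows the idea that grafting links an existing orphan node into the tree (so the tree now owns it), while a deep copy allocates fresh nodes and leaves the source untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a tree node; this is NOT PugXML's xml_node_struct.
struct ToyNode {
    std::string name;
    ToyNode* parent;
    std::vector<ToyNode*> children;
    explicit ToyNode(const std::string& n) : name(n), parent(nullptr) {}
};

// Graft an existing orphan node under `parent` without allocating anything.
// After the call the tree owns `child`.
void graft_child(ToyNode* parent, ToyNode* child) {
    assert(parent != nullptr && child != nullptr);
    assert(child->parent == nullptr); // only orphan nodes may be grafted
    child->parent = parent;
    parent->children.push_back(child);
}

// Deep-copy `source` and all of its descendants under `parent`.
// The source tree is left untouched and must still be freed by its owner.
ToyNode* append_copy(ToyNode* parent, const ToyNode* source) {
    assert(parent != nullptr && source != nullptr);
    ToyNode* copy = new ToyNode(source->name);
    graft_child(parent, copy);
    for (std::size_t i = 0; i < source->children.size(); ++i)
        append_copy(copy, source->children[i]);
    return copy;
}

// Free a node and everything below it.
void free_tree(ToyNode* node) {
    if (node == nullptr) return;
    for (std::size_t i = 0; i < node->children.size(); ++i)
        free_tree(node->children[i]);
    delete node;
}
```

With this model, a grafted node must not be freed separately once it is part of the tree, whereas the source of a deep copy still has to be freed by whoever created it.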
Sample: XPath Queries
So far, I haven't shown any use for CPugXMLNode. Everything has been implemented in xml_node. This example shows how to run XPath queries using CPugXMLNode.
CPugXMLNode implements four XPath query functions: FindNodes returns a set of zero or more nodes which match the search criteria; FindNode returns the first matching node; FindValues returns a set of values from the nodes which match the search criteria; FindValue returns the value from the first node which matches the search criteria. A strict XPath implementation would also return attributes from matching nodes, but I have not implemented that feature. A workaround is to use FindNodes to get a list of matching nodes and then access their attributes.
The following example creates an in-memory XML document and then wraps the nodes with CPugXMLNode. It then calls CPugXMLNode::FindNodes to return a set of matching nodes in an Array. Array and BString (renamed from String) are classes implemented in the Blue Library. Since they are used so heavily in the XPath classes, I decided to keep them. They are fairly simple to use if all you are using them for is running XPath queries. If you want to use them for other purposes, the source is included in the download.
//
// Build an XML document on the fly and perform an XPath query on it
//
void TestXPath()
{
    xml_parser* xml = new xml_parser(); // Construct.
    xml->create();
    xml_node root = xml->document();
    CPugXMLNode Node = root.append_child(node_element);
    Node.name("allocNode");
    Node.append_attribute("attr1", "val1");
    Node.append_attribute("attr2", "val2");
    CPugXMLNode sub1 = Node.append_child(node_element);
    sub1.name("sub1");
    CPugXMLNode sub2 = Node.append_child(node_element);
    sub2.name("sub2");
    CPugXMLNode sub3 = Node.append_child(node_element);
    sub3.name("sub3");

    // Display what we have
    DebugNode(Node, "Node with attribute");

    // XPath queries
    Array<xml_node_struct*> nodes = Node.FindNodes("//sub2 | //sub3");
    for (int i = 0; i < nodes.getSize(); i++)
    {
        CString strMsg;
        strMsg.Format("Node %d", i);
        DebugNode(xml_node(nodes[i]), strMsg);
    }
}
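For readers new to XPath, the query "//sub2 | //sub3" above asks for every sub2 element and every sub3 element anywhere in the document, combined into one result set. The following toy model shows roughly what that selection means. It is only a sketch: ToyElement, collect_named() and select_descendants_union() are hypothetical names, the code does not parse XPath, and it is not how the Blue Library evaluator works internally. It assumes the result is wanted in document order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical element type for illustration; not a PugXML or Blue Library class.
struct ToyElement {
    std::string name;
    std::vector<ToyElement> children;
};

// Walk the tree in document order and record every element whose name
// matches one of `names`.
void collect_named(const ToyElement& node,
                   const std::vector<std::string>& names,
                   std::vector<const ToyElement*>& out) {
    for (std::size_t i = 0; i < names.size(); ++i) {
        if (node.name == names[i]) {
            out.push_back(&node);
            break; // record each element at most once
        }
    }
    for (std::size_t i = 0; i < node.children.size(); ++i)
        collect_named(node.children[i], names, out);
}

// Toy equivalent of "//a | //b | ...": all elements with any of the given names.
// The returned pointers stay valid only while the tree is not modified.
std::vector<const ToyElement*> select_descendants_union(
        const ToyElement& root, const std::vector<std::string>& names) {
    assert(!names.empty());
    std::vector<const ToyElement*> out;
    collect_named(root, names, out);
    return out;
}
```

Applied to a tree shaped like the one built in TestXPath() (an allocNode element with children sub1, sub2 and sub3), asking for the names sub2 and sub3 selects those two children and skips sub1.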
Unit Tester
I have included a demo app/unit tester which is statically linked with XMLLib. Here is a list of test functions provided in the UnitTest demo app:
- TestXML(): Illustrates append_child_noalloc()
- TestSample(): Excerpt from the original PugXML article
- TestXPath(): Builds an XML document in memory and does a simple XPath query
- TestXPathFile(): Reads an XML file, parses it, and performs an XPath query
- TestFindValues(): Tests FindValues()
Changes Made to PugXML
- Modified to compile under VC++ 6.0 as noted in comments to original PugXML article
- Fixed numerous bugs
- Made the functions strcmpwild_astr() and strcmpwild_impl() static
- Added node_attribute type (see CPugXMLNode)
- Added append_node_noalloc() and append_child_noalloc() to give the ability to append pre-allocated nodes
Enhancements Provided in the CPugXMLNode Wrapper Class
- XPath query functions: FindNodes(), FindNode(), FindValues(), FindValue()
- AppendTree(): Similar to append_child_noalloc() except it performs a deep copy as opposed to inserting a pointer
- append_child_element(): Creates an element_node and names it in a single function call
- CAddAttributeNodes and CDelAttributeNodes: The XPath functions treat attributes the same as nodes, so these classes (derived from xml_tree_walker) are used before and after running the XPath queries to add and delete attribute pseudo-child nodes.
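The attribute pseudo-node idea behind CAddAttributeNodes and CDelAttributeNodes can be sketched without the library. In the toy model below, every attribute is temporarily mirrored as a flagged child node so that code which only understands nodes can see it, and the mirrors are stripped again afterwards. AttrToyNode, add_attribute_nodes() and del_attribute_nodes() are hypothetical names made up for this sketch; the real classes are tree walkers derived from xml_tree_walker and their details may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type for illustration; not PugXML's xml_node_struct.
struct AttrToyNode {
    std::string name;
    std::string value;
    bool is_attribute = false; // true only for temporary pseudo-nodes
    std::vector<std::pair<std::string, std::string> > attributes;
    std::vector<AttrToyNode> children;
};

// Before a query: mirror every attribute as a flagged pseudo-child node.
// Expects a tree that does not already contain pseudo-nodes.
void add_attribute_nodes(AttrToyNode& node) {
    assert(!node.is_attribute);
    // Visit the real children first, so the pseudo-nodes appended below
    // are never visited themselves.
    for (AttrToyNode& child : node.children)
        add_attribute_nodes(child);
    for (const auto& attr : node.attributes) {
        AttrToyNode pseudo;
        pseudo.name = attr.first;
        pseudo.value = attr.second;
        pseudo.is_attribute = true;
        node.children.push_back(pseudo);
    }
}

// After a query: remove every pseudo-node again, restoring the original tree.
void del_attribute_nodes(AttrToyNode& node) {
    node.children.erase(
        std::remove_if(node.children.begin(), node.children.end(),
                       [](const AttrToyNode& c) { return c.is_attribute; }),
        node.children.end());
    for (AttrToyNode& child : node.children)
        del_attribute_nodes(child);
}
```

The intended calling pattern in this sketch is add_attribute_nodes(root), then the query, then del_attribute_nodes(root), so the tree is back in its original shape once the query has finished.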
Changes Made to Blue Library XPath Code
- Combined code in one directory
- Combined all libraries into one codebase
- Changed some class names for compatibility
- Rewrote XPath modules to work with PugXML
History
- 10/28/09
  - Fixed memory leaks
  - Modified code to compile under VS 2005 and VS 2008
  - Cleaned up test data
- 12/01/03
  - Fixed typos and added copyright information requested by original author
- 08/03/03
  - Original release