Introduction
The CPugXMLNode class wraps the xml_node class and provides additional XPath capability.
Background
The PugXML parser is very useful in lightweight applications that use XML. However, it does not readily compile in VC++ 6.0, and it does not support XPath queries. I am a huge fan of XML and XPath, but I didn't want to write an XPath query analyzer if I could find a good one on the web. I found an excellent implementation of XPath in the Blue Library, by Josh Harler. However, Josh implemented his own XML parser, and by the time I found it, I had too much time invested in PugXML. So I modified his code to reference PugXML instead of his XML parser, rearranged it so that it builds into a single library, and changed some of his class names to avoid name conflicts. Other than that, I left his code untouched and left the copyright notices in place.
XPath is a query language for accessing nodes in an XML document. There is an excellent tutorial located here. I used many of the examples for my test cases.
I had a few other technical problems to solve, which I will discuss in detail below.
I am sure these classes could do with additional enhancements. However, they meet my needs at the moment, and I am providing them in the hope that you will find them useful as well. I am deeply indebted to all the contributors to CodeProject and SourceForge who make our jobs easier every day.
Using the Code
The download includes a VC++ 6.0 workspace that builds XMLLib and links it with a small unit tester. You can use this as a sample. I have used XMLLib in MFC applications and in COM objects written with ATL. However, there are no dependencies on MFC or ATL, although I have written a small function that converts the XML text to a CString, which I find very useful for debugging.
Add the following #include to the top of your module:
#include "PugXMLNode.h"
Sample: Adding Existing Nodes onto the Main Document Tree
The following code creates a document object. PugXML provides the append_child() function to allocate a new instance of xml_node and append it to the parent node. But what if you have created a separate XML tree and just want to graft it onto the original tree? PugXML provides no way to do this, so I have added a new function called append_child_noalloc(), which appends an existing node without allocating new memory.
In the following example, I create a new node pNode using new_node(), then I add it to the main tree using append_child_noalloc(). Note that you can wrap pNode up as an xml_node and manipulate the former orphan node, which is now a part of the main XML tree. I found this feature useful in a data-collection app that I wrote, where I needed to collect XML data from a number of different sources and merge the results into a single tree.
void TestXML()
{
    // Create an empty document and get its root node.
    xml_parser* xml = new xml_parser();
    xml->create();
    xml_node root = xml->document();

    // Append a child the usual way; append_child() allocates the node.
    xml_node node = root.append_child(node_element);
    node.append_attribute("111", "222");
    node.name("original");
    DebugNode(node, "node");

    // Allocate an orphan node and wrap it so that it can be manipulated.
    xml_node_struct* pNode = new_node();
    xml_node Node = xml_node(pNode);
    Node.name("allocNode");
    DebugNode(Node, "Node");

    // Graft the orphan onto the main tree without allocating a new node.
    root.append_child_noalloc(pNode);
    Node.append_attribute("attr1", "val1");
    Node.append_attribute("attr2", "val2");
    DebugNode(Node, "Node with attribute");
    node.append_attribute("333", "444");
    DebugNode(root, "root");
    DebugNode(node, "original node with changes intact");

    // Release the parser and its document.
    xml->clear();
    delete xml;

    // A node that never joins a tree is freed explicitly.
    xml_node_struct* pNode1 = new_node();
    Node = xml_node(pNode1);
    Node.name("Node1");
    Node.attribute("one") = "two";
    DebugNode(Node, "Node 1");
    free_node_recursive(pNode1);
}
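The difference between grafting an existing node and copying it can be illustrated outside PugXML. What follows is only a minimal sketch, assuming a C++14 or later compiler. ToyNode, graft(), and append_copy() are hypothetical names invented for this illustration and are not part of PugXML or XMLLib. graft() mirrors the idea behind append_child_noalloc(): the parent adopts the existing node, so a pointer to that node stays valid. append_copy() shows the deep-copy alternative for contrast.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an XML element. This is NOT PugXML's xml_node_struct.
struct ToyNode {
    std::string name;
    ToyNode* parent = nullptr;
    std::vector<std::unique_ptr<ToyNode>> children;

    explicit ToyNode(std::string n) : name(std::move(n)) {}
};

// Graft: the parent takes ownership of the existing node. No new node is
// allocated, so pointers to the grafted node remain valid.
ToyNode* graft(ToyNode& parent, std::unique_ptr<ToyNode> orphan) {
    orphan->parent = &parent;
    parent.children.push_back(std::move(orphan));
    return parent.children.back().get();
}

// Deep copy: allocate a fresh copy of the whole subtree under the parent.
// The source subtree is left untouched and stays independent of the copy.
ToyNode* append_copy(ToyNode& parent, const ToyNode& source) {
    ToyNode* copy = graft(parent, std::make_unique<ToyNode>(source.name));
    for (const auto& child : source.children) {
        append_copy(*copy, *child);
    }
    return copy;
}
```

In this sketch, a node passed to graft() keeps its address, so later edits made through the old pointer show up in the main tree. Edits to a subtree passed to append_copy() do not affect the copy.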
Sample: XPath Queries
So far, I haven't shown any use for CPugXMLNode. Everything has been implemented in xml_node. This example shows how to run XPath queries using CPugXMLNode.
CPugXMLNode implements four XPath query functions: FindNodes returns a set of zero or more nodes that match the search criteria; FindNode returns the first matching node; FindValues returns a set of values from the nodes that match the search criteria; FindValue returns the value from the first node that matches the search criteria. A strict XPath implementation would also return attributes from matching nodes, but I have not implemented that feature. A workaround is to use FindNodes to get a list of matching nodes and then access their attributes.
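The relationship between the four query functions, and the attribute workaround, can be sketched with a toy model. This is only an illustration, assuming a C++17 compiler. ToyElement and the find_* functions are hypothetical names, and a plain name match stands in for a real XPath expression such as "//item". None of this is the CPugXMLNode implementation.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy element. This is NOT PugXML's xml_node; all names are illustrative.
struct ToyElement {
    std::string name;
    std::string value;
    std::map<std::string, std::string> attributes;
    std::vector<ToyElement> children;
};

// Helper: collect every descendant with the given name, in document order.
void collect(const ToyElement& node, const std::string& name,
             std::vector<const ToyElement*>& out) {
    for (const ToyElement& child : node.children) {
        if (child.name == name) out.push_back(&child);
        collect(child, name, out);
    }
}

// Zero or more matching nodes.
std::vector<const ToyElement*> find_nodes(const ToyElement& root,
                                          const std::string& name) {
    std::vector<const ToyElement*> out;
    collect(root, name, out);
    return out;
}

// The first matching node, or nullptr when nothing matches.
const ToyElement* find_node(const ToyElement& root, const std::string& name) {
    std::vector<const ToyElement*> nodes = find_nodes(root, name);
    return nodes.empty() ? nullptr : nodes.front();
}

// The values of all matching nodes.
std::vector<std::string> find_values(const ToyElement& root,
                                     const std::string& name) {
    std::vector<std::string> values;
    for (const ToyElement* node : find_nodes(root, name)) {
        values.push_back(node->value);
    }
    return values;
}

// Attribute workaround: query for the elements first, then read the
// attribute from each match. Elements without the attribute are skipped.
std::vector<std::string> find_attribute_values(const ToyElement& root,
                                               const std::string& name,
                                               const std::string& attr) {
    std::vector<std::string> values;
    for (const ToyElement* node : find_nodes(root, name)) {
        auto it = node->attributes.find(attr);
        if (it != node->attributes.end()) values.push_back(it->second);
    }
    return values;
}
```

The single-result functions are just the first element of the corresponding set-returning functions, which is why a caller should check for the "no match" case before using the result.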
The following example creates an in-memory XML document and then wraps the nodes with CPugXMLNode. It then calls CPugXMLNode::FindNodes to return a set of matching nodes in an Array. Array and BString (renamed from String) are classes implemented in the Blue Library. Since they are used so heavily in the XPath classes, I decided to keep them. They are fairly simple to use if all you are using them for is running XPath queries. If you want to use them for other purposes, the source is included in the download.
void TestXPath()
{
    // Build a small document in memory.
    xml_parser* xml = new xml_parser();
    xml->create();
    xml_node root = xml->document();
    CPugXMLNode Node = root.append_child(node_element);
    Node.name("allocNode");
    Node.append_attribute("attr1", "val1");
    Node.append_attribute("attr2", "val2");
    CPugXMLNode sub1 = Node.append_child(node_element);
    sub1.name("sub1");
    CPugXMLNode sub2 = Node.append_child(node_element);
    sub2.name("sub2");
    CPugXMLNode sub3 = Node.append_child(node_element);
    sub3.name("sub3");
    DebugNode(Node, "Node with attribute");

    // Select every sub2 and sub3 element with a union query.
    Array<xml_node_struct*> nodes = Node.FindNodes("//sub2 | //sub3");
    for (int i = 0; i < nodes.getSize(); i++)
    {
        CString strMsg;
        strMsg.Format("Node %d", i);
        DebugNode(xml_node(nodes[i]), strMsg);
    }
    // Release the parser and its document, as TestXML() does.
    xml->clear();
    delete xml;
}
Unit Tester
I have included a demo app/unit tester which is statically linked with XMLLib. Here is a list of test functions provided in the UnitTest demo app:
- TestXML(): Illustrates append_child_noalloc()
- TestSample(): Excerpt from the original PugXML article
- TestXPath(): Builds an XML document in memory and does a simple XPath query
- TestXPathFile(): Reads an XML file, parses it, and performs an XPath query
- TestFindValues(): Tests FindValues()
Changes Made to PugXML
- Modified to compile under VC++ 6.0 as noted in comments to original PugXML article
- Fixed numerous bugs
- Made the functions strcmpwild_astr() and strcmpwild_impl() static
- Added node_attribute type - See CPugXMLNode
- Added append_node_noalloc() and append_child_noalloc() to give the ability to append pre-allocated nodes
Enhancements Provided in the CPugXMLNode Wrapper Class
- XPath query functions: FindNodes(), FindNode(), FindValues(), FindValue()
- AppendTree(): Similar to append_child_noalloc() except that it performs a deep copy as opposed to inserting a pointer
- append_child_element(): Creates a node_element node and names it in a single function call
- CAddAttributeNodes and CDelAttributeNodes: The XPath functions treat attributes the same as nodes, so these classes (derived from xml_tree_walker) are used before and after running the XPath queries to add and delete attribute pseudo-child nodes.
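The add-then-delete pattern for attribute pseudo-child nodes can be sketched on a toy tree. This is one plausible reading of the idea, assuming a C++14 or later compiler. It is not the actual CAddAttributeNodes or CDelAttributeNodes code, and PseudoNode, Kind, add_attribute_nodes(), del_attribute_nodes(), and count_named() are hypothetical names used only here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

enum class Kind { Element, Attribute };

// Toy node. This is NOT PugXML's xml_node_struct.
struct PseudoNode {
    Kind kind = Kind::Element;
    std::string name;
    std::string value;
    std::map<std::string, std::string> attributes;
    std::vector<std::unique_ptr<PseudoNode>> children;
};

// Before a query: give every element one extra child per attribute, so that
// a node-only query engine can see attributes as ordinary nodes.
void add_attribute_nodes(PseudoNode& node) {
    // Recurse first so the pseudo-children added below are not revisited.
    for (auto& child : node.children) add_attribute_nodes(*child);
    for (const auto& attr : node.attributes) {
        auto pseudo = std::make_unique<PseudoNode>();
        pseudo->kind = Kind::Attribute;
        pseudo->name = attr.first;
        pseudo->value = attr.second;
        node.children.push_back(std::move(pseudo));
    }
}

// After a query: remove every pseudo-child so the tree is back to normal.
void del_attribute_nodes(PseudoNode& node) {
    auto& kids = node.children;
    kids.erase(std::remove_if(kids.begin(), kids.end(),
                              [](const std::unique_ptr<PseudoNode>& child) {
                                  return child->kind == Kind::Attribute;
                              }),
               kids.end());
    for (auto& child : kids) del_attribute_nodes(*child);
}

// Stand-in for a query: count descendants of a given kind and name.
std::size_t count_named(const PseudoNode& node, Kind kind,
                        const std::string& name) {
    std::size_t count = 0;
    for (const auto& child : node.children) {
        if (child->kind == kind && child->name == name) ++count;
        count += count_named(*child, kind, name);
    }
    return count;
}
```

A caller would run add_attribute_nodes(), perform the query, and then run del_attribute_nodes(). The real attributes are never modified, so the document looks the same before and after.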
Changes Made to Blue Library XPath Code
- Combined code in one directory
- Combined all libraries into one codebase
- Changed some class names for compatibility
- Rewrote XPath modules to work with PugXML
History
- 10/28/09
  - Fixed memory leaks
  - Modified code to compile under VS 2005 and VS 2008
  - Cleaned up test data
- 12/01/03
  - Fixed typos and added copyright information requested by original author
- 08/03/03