XmlBind: putting PugXML on steroids!

Jun 8, 2003

A mutant XML parser using IoBind, EDOM and PugXML.

Introduction

XmlBind is an XML I/O helper class. It is based on three ingredients:

  • PugXML, a great XML DOM parser available on CP (see [1]),
  • EDOM, presented by firstobject.com, a simplified approach for creating and exploring XML files (see [2]),
  • IoBind, a library for serializing objects to/from strings (see [3]).

The goal of XmlBind is to make it easier to add and retrieve information in XML:

  • Use PugXML, a powerful, fast and complete XML class,
  • Avoid lengthy DOM operations: add a node and its value in one function call,
  • Easy traversal of the DOM tree,
  • Do not worry about conversion to/from string; let IoBind take care of it for you:
    • STL container serialization,
    • base64 conversion,
    • XML reserved characters escaping,
    • etc...
  • Intelligent skipping of nodes.

Quick example

Suppose that you have a small structure that you want to be serialized in XML:

struct data_set
{
    string name;
    vector< pair<float,float> > points;
};

The XML output file would look like this:

<data_set>
    <name>the name</name>
    <points>(0,1),(2,3),...</points>
</data_set>

Classic solution

If you use DOM and do not have any helper to transform points into a string, this can be a lengthy job:

data_set d;
xml_node node;

// creating data set node

xml_node data_set_node=node.append_child( node_element );
data_set_node.name("data_set");

// adding name

xml_node name_node=data_set_node.append_child( node_element );
name_node.name("name");
xml_node name_value_node=name_node.append_child( node_pcdata );
name_value_node.value( d.name );

// converting points to string

ostringstream output;
for (size_t i = 0; i < d.points.size(); ++i)
{
    // separate entries with a comma, except before the first one
    if (i != 0)
        output<<",";
    output<<"("<<d.points[i].first<<":"<<d.points[i].second<<")";
}

// adding points to xml

xml_node points_node=data_set_node.append_child( node_element );
points_node.name("points");
xml_node points_value_node=points_node.append_child( node_pcdata );
points_value_node.value( output.str() );

This is already long, and it does not even cover parsing the data back.

XmlBind solution

XmlBind provides some handy and very customizable wrappers for writing data to XML:

xml_bind xb;
xb.add_child_elem("data_set");
{
    // going deeper into the xml tree 

    // when si is destroyed, we go back

    scoped_into si(xb);
    // adding a child element + value

    xb.add_child_elem("name",d.name);
    // adding a child element + rendering values to string

    xb.add_child_elem_p(
         "points",
         d.points.begin(),
         d.points.end(),
         sequence_to_string_p << pair_to_string_p );
}

As you can see, the XmlBind solution is much shorter and more intuitive. There are two reasons for that:

  • EDOM simplifies the creation of XML files,
  • IoBind is used to transform the points into a string.
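
The idea of combining a sequence policy with an element policy can be illustrated without IoBind. The sketch below uses only the standard library; pair_to_string and sequence_to_string are hypothetical names chosen for this illustration (they are not IoBind's actual classes), and the "(first:second)" format is simply borrowed from the classic example above.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element policy: renders a pair as "(first:second)".
struct pair_to_string
{
    template <typename A, typename B>
    std::string operator()(std::pair<A, B> const& p) const
    {
        std::ostringstream out;
        out << "(" << p.first << ":" << p.second << ")";
        return out.str();
    }
};

// Hypothetical sequence policy: applies an element policy to every item
// of [begin, end) and joins the results with commas.
template <typename ElementPolicy>
struct sequence_to_string
{
    ElementPolicy element;

    template <typename Iterator>
    std::string operator()(Iterator begin, Iterator end) const
    {
        std::string result;
        for (Iterator it = begin; it != end; ++it)
        {
            if (it != begin)
                result += ",";
            result += element(*it);
        }
        return result;
    }
};
```

With these two pieces, a sequence_to_string<pair_to_string> object can be called on any iterator range of pairs, including the elements of a std::map. Swapping the element policy changes the rendering of each item without touching the joining logic.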

Note also that XmlBind gives you the tools to read the data back from the created XML document:

xml_bind xb(xml_string);

if(!xb.find_child_elem("data_set"))
    return false;

{ 
    scoped_into si(xb);
    xb.find_get_child_data("name",d.name);
    xb.find_get_child_data_p(
       "points", 
       d.points,
       sequence_from_string_p 
         << (
             pair_from_string_p
               << from_string<float>()
               >> from_string<float>()
            )
    );
}

This concludes this small example. The sections below explain how XmlBind works and how you can use it.

XmlBind overview

XmlBind comes as a single wrapper class around PugXML's xml_parser:

class xml_bind  : public xml_parser

xml_bind contains a "cursor" that stores the current node and the current child node. This cursor is used to:

  • Add new nodes after the current node,
  • Get the data of the current node,
  • Traverse the XML tree.

xml_bind provides several helper methods to append and retrieve data.

Tree traversal

Going up and down in the tree

As stated above, xml_bind contains a cursor that can be used to explore the XML tree. This is the EDOM idea.

To go one level down into the tree, use into_elem(); to come back up, use out_of_elem(). As in the example, you can also use the helper class scoped_into, which takes care of that for you. Here is a piece of XML and the code to explore it:

<node>
   <child>
      <subchild/>
   </child>
   <child2/>
</node>

xml_bind xb(xml_string);
xb.find_child_elem("node");
{
    scoped_into si(xb);
    xb.find_child_elem("child");
    {
        scoped_into si(xb);
        xb.find_child_elem("subchild");
    }
    xb.find_child_elem("child2");
}

As you see, you can intuitively traverse your XML tree using the cursor.
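
The scoped_into helper follows the usual RAII pattern: go down in the constructor, come back up in the destructor. The sketch below shows that pattern on a toy cursor that only tracks its depth; toy_cursor and toy_scoped_into are hypothetical stand-ins written for this illustration, not the actual xml_bind code.

```cpp
#include <cassert>

// Toy stand-in for the xml_bind cursor: it only tracks the current depth.
class toy_cursor
{
public:
    toy_cursor() : depth_(0) {}

    void into_elem()   { ++depth_; }
    void out_of_elem() { if (depth_ > 0) --depth_; }
    int  depth() const { return depth_; }

private:
    int depth_;
};

// RAII guard: goes one level down on construction and
// one level back up on destruction, even if an exception is thrown.
class toy_scoped_into
{
public:
    explicit toy_scoped_into(toy_cursor& cursor) : cursor_(cursor)
    {
        cursor_.into_elem();
    }

    ~toy_scoped_into() { cursor_.out_of_elem(); }

    // Copying a guard would leave the scope twice, so forbid it.
    toy_scoped_into(toy_scoped_into const&) = delete;
    toy_scoped_into& operator=(toy_scoped_into const&) = delete;

private:
    toy_cursor& cursor_;
};
```

The benefit of the guard is that every into_elem() is matched by exactly one out_of_elem(), so nested blocks in the source code mirror the nesting of the XML document.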

Iterating elements

You can iterate over the elements or child elements using while(next) semantics, similar to .NET enumerators:

// resetting the child cursor

xb.reset_child();
// next_child returns false if the end was reached

while( xb.next_child())
{
    // getting the current node

    xml_node n( xb.get_current_child() );
}

Of course, you can do the same for elements, and the iteration can be restricted to a given node name:

// iterating over the child nodes named "data"

while( xb.next_child("data"))
...
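
To make the while(next) semantics concrete, here is a minimal sketch of such an enumerator over a plain list of child names. toy_children is a hypothetical class written for this illustration; it only borrows the method names used above and says nothing about how xml_bind is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy enumerator over child names; not the real xml_bind implementation.
class toy_children
{
public:
    explicit toy_children(std::vector<std::string> names)
        : names_(std::move(names)), pos_(0), started_(false) {}

    // Puts the cursor back before the first child.
    void reset_child() { pos_ = 0; started_ = false; }

    // Advances to the next child; returns false when the end is reached.
    bool next_child()
    {
        if (started_ && pos_ < names_.size())
            ++pos_;
        started_ = true;
        return pos_ < names_.size();
    }

    // Advances to the next child with the given name.
    bool next_child(std::string const& name)
    {
        while (next_child())
            if (names_[pos_] == name)
                return true;
        return false;
    }

    // Only valid after next_child() has returned true.
    std::string const& get_current_child() const { return names_[pos_]; }

private:
    std::vector<std::string> names_;
    std::size_t pos_;
    bool started_;
};
```

The cursor starts before the first child, so the first call to next_child() selects the first item. This is what allows the loop to be written as a plain while without a separate "is empty" test.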

Skipping nodes

When traversing the tree, you might want to skip some node types, such as comments or DTD declarations. To do so, you can specify which node types should be skipped:

xb.skip_pi(); // skip pi declarations
xb.skip_dtd(); // skip dtd declarations
xb.skip_comment(false); // do not skip comments
xb.skip( node_cdata, true ); // skip cdata nodes

By default, no types are skipped.
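
One simple way to keep track of such settings is a set of per-type flags. The sketch below shows this with std::bitset; the toy_node_type enumeration and the toy_skip_filter class are assumptions made for this illustration and do not reflect the actual PugXML node types or the xml_bind internals.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical node types, loosely named after those mentioned in the article.
enum toy_node_type
{
    toy_element,
    toy_pcdata,
    toy_cdata,
    toy_comment,
    toy_pi,
    toy_dtd,
    toy_type_count // number of node types, used to size the bitset
};

// Remembers, for each node type, whether traversal should skip it.
class toy_skip_filter
{
public:
    void skip(toy_node_type type, bool enabled = true)
    {
        skipped_.set(type, enabled);
    }

    bool is_skipped(toy_node_type type) const
    {
        return skipped_.test(type);
    }

private:
    std::bitset<toy_type_count> skipped_; // all flags start cleared
};
```

A traversal loop would then call is_skipped() on each node's type and move on to the next node when it returns true. Since all flags start cleared, nothing is skipped by default.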

Saving and restoring the cursor state

You can save and restore the cursor state:

xml_bind::state st( xb.get_state() );
// doing stuff on xb...

xb.set_state(st);
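
The save/restore idea amounts to copying a small snapshot of the cursor and assigning it back later. The sketch below illustrates this on a toy cursor whose state is just two indices; toy_stateful_cursor and the contents of its state structure are assumptions made for this illustration, since the article does not describe what xml_bind::state actually stores.

```cpp
#include <cassert>
#include <cstddef>

// Toy cursor whose position is a (node, child) pair of indices.
class toy_stateful_cursor
{
public:
    // Snapshot of everything needed to come back to a position.
    struct state
    {
        std::size_t node;
        std::size_t child;
    };

    toy_stateful_cursor() : current_{0, 0} {}

    // Stand-in for any operation that moves the cursor.
    void move_to(std::size_t node, std::size_t child)
    {
        current_.node = node;
        current_.child = child;
    }

    state get_state() const { return current_; }        // returns a copy
    void set_state(state const& s) { current_ = s; }    // restores a copy

    std::size_t node() const { return current_.node; }
    std::size_t child() const { return current_.child; }

private:
    state current_;
};
```

Because get_state() returns a copy, later moves of the cursor do not alter the saved snapshot.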

Appending data

Methods

Adding data at the current cursor position is done using the append_child_elem methods. These methods apply to a single value or to an iterator range:

  • bool append_child_elem(LPCTSTR name_)

    Adds an empty node, named name_. Note that all methods accept LPCTSTR or STL string.

  • template<typename T>
    bool append_child_elem(
             LPCTSTR name_, 
             T const& value_, 
             bool as_cdata_ = false)

    Adds a node named name_ with value value_. If as_cdata_ is true, the value is added in a CDATA node. Note that T must support << with ostream.

  • template<typename T, typename Policy>
    bool append_child_elem_p(
             LPCTSTR name_,
             T const& value_,
             Policy const& policy_,
             bool as_cdata_ = false)

    Adds a node named name_ whose value is value_ converted to a string by policy_ (policies are discussed below).

  • template<typename ContainerIterator>
    bool append_child_elem(
        LPCTSTR name_, 
        ContainerIterator begin_,
        ContainerIterator end_,
        bool as_cdata_ = false)

    Adds a node named name_ whose data is the range described by begin_ and end_, converted to a string.

  • template<typename ContainerIterator, typename Policy>
    bool append_child_elem_p(
        LPCTSTR name_, 
        ContainerIterator begin_,
        ContainerIterator end_,
        Policy const& policy_,
        bool as_cdata_ = false)

    Adds a node named name_ whose data is the range described by begin_ and end_, converted to a string using policy_.
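
The requirement that T supports << with ostream suggests a stream-based conversion. The sketch below shows what such a default conversion could look like for a single value and for an iterator range; value_to_string and range_to_string are hypothetical helpers written for this illustration, and the comma separator is an assumption, not necessarily what IoBind produces.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Converts any type that supports operator<< on std::ostream.
template <typename T>
std::string value_to_string(T const& value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}

// Converts the range [begin, end) to a comma-separated string.
// The separator is an arbitrary choice for this sketch.
template <typename Iterator>
std::string range_to_string(Iterator begin, Iterator end)
{
    std::ostringstream out;
    for (Iterator it = begin; it != end; ++it)
    {
        if (it != begin)
            out << ",";
        out << *it;
    }
    return out.str();
}
```

Any user-defined type becomes usable with such helpers as soon as it provides its own operator<<.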

A few remarks on these methods:

  • CDATA: For each method, you can choose whether to add the data in a CDATA section using the as_cdata_ parameter (the default is false). If the data is not added in a CDATA section, it is escaped (< becomes &lt;, etc.); otherwise it is added unchanged.
  • Policies: The conversion of the data to a string is done using policies. You can provide your own policy to suit your needs. See IoBind ([3]) for more details.
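
For reference, escaping the XML reserved characters can be sketched as follows. escape_xml is a hypothetical stand-alone function written for this illustration; it replaces the five characters that have predefined XML entities and is not the routine used by XmlBind or IoBind.

```cpp
#include <cassert>
#include <string>

// Replaces the five XML reserved characters with their predefined entities.
std::string escape_xml(std::string const& text)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text)
    {
        switch (c)
        {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += c;        break;
        }
    }
    return out;
}
```

Data written in a CDATA section does not need this step, which is why the as_cdata_ flag and the escaping are mutually exclusive.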

Examples

  • Add a string

    xml.add_child_elem("name","string");
  • Add a double

    double d;
    xml.add_child_elem("name",d);
  • Add a string in base64

    xml.add_child_elem_p("name","string", to_base64_p );
  • Add a string as CDATA

    xml.add_child_elem("name","string", true);
  • Add a container of ints

    vector<int> v;
    xml.add_child_elem_p("name",v.begin(),v.end(), sequence_to_string_p);
  • Add a map

    map<int, string> m;
    xml.add_child_elem_p("name",m.begin(),m.end(), 
        sequence_to_string_p << pair_to_string_p);

Reading back data

Methods

Data can be retrieved using the get_child_data methods that take and transform the data from the current node:

  • template<typename T>
    bool get_child_data(T& value_)

    Reads the data of the current child node, transforms it to T and stores it into value_.

  • template<typename T, typename Policy>
    bool get_child_data_p(T& value_, Policy const& policy_)

    Reads the data of the current child node, transforms it to T using policy_ and stores it into value_.

Note that all these methods return true if successful, false otherwise. If you are looking for a specific node, you can use find_get_child_data:

  • template<typename T>
    bool find_get_child_data(LPCTSTR name_,T& value_)

    Reads the data of the first node named name_, transforms it to T and stores it into value_.

  • template<typename T, typename Policy>
    bool find_get_child_data_p(LPCTSTR name_, T& value_, 
                                           Policy const& policy_)

    Reads the data of the first node named name_, transforms it to T using policy_ and stores it into value_.
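
The "transform to T and report success" behaviour can be sketched with a string stream. value_from_string below is a hypothetical helper written for this illustration; it assumes T is default-constructible and supports >> with istream, and it leaves value_ untouched on failure, which is a design choice of this sketch rather than documented XmlBind behaviour.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Parses text into value; returns false (and leaves value unchanged)
// if the text is not a complete, valid representation of T.
template <typename T>
bool value_from_string(std::string const& text, T& value)
{
    std::istringstream in(text);

    T parsed;
    if (!(in >> parsed))
        return false;            // nothing usable at the start of the text

    char leftover;
    if (in >> leftover)
        return false;            // trailing non-whitespace characters

    value = parsed;
    return true;
}
```

Checking for leftover characters makes inputs such as "12x" fail instead of being silently read as 12.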

Examples

  • Get value as string

    string s;
    xml.get_child_data(s);
  • Get value as float

    float f;
    xml.get_child_data(f);
  • Get value as pair<int,float>

    pair<int,float> p;
    xml.get_child_data_p(
        p, 
        pair_from_string_p 
            << from<int>()
            >> from<float>()
        );
  • Get value as vector<float>

    vector<float> v;
    xml.get_child_data_p(
        v,
        sequence_from_string_p 
           << from<float>()
        );
  • Find an element and retrieve its value

    float f;
    xml.find_get_child_data("name",f);
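
To show what reading a sequence of pairs back involves, here is a hand-written parser for the "(a:b),(c:d)" format used in the classic example. points_from_string is a hypothetical function written for this illustration using only the standard library; it is the kind of code that the from-string policies are meant to replace, not a description of how they work.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Parses "(a:b),(c:d),..." into points. Returns false on malformed input,
// in which case points is left unchanged.
bool points_from_string(std::string const& text,
                        std::vector<std::pair<float, float>>& points)
{
    std::istringstream in(text);
    std::vector<std::pair<float, float>> parsed;
    bool expect_more = false;   // true right after a comma
    char open;

    while (in >> open)          // fails cleanly at the end of the input
    {
        char sep, close;
        float first, second;
        if (open != '(' || !(in >> first >> sep >> second >> close)
            || sep != ':' || close != ')')
            return false;
        parsed.push_back(std::make_pair(first, second));

        expect_more = false;
        char comma;
        if (in >> comma)
        {
            if (comma != ',')
                return false;
            expect_more = true;
        }
    }

    if (expect_more)
        return false;           // trailing comma without a pair

    points.swap(parsed);
    return true;
}
```

Parsing into a temporary vector and swapping at the end means a failed parse never leaves the caller with half-filled data.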

Compiling XmlBind

You will need Boost and IoBind to use XmlBind.

You need to compile IoBind. The declaration of xml_bind is in iobind/xml/xml_bind.hpp. Note that it is in the iobind::xml namespace.

History

  • 2-10-2003, big update:
    • integrated PugXML in IoBind, split the .h into .cpp and .h,
    • added skipping,
    • added STL string methods,
    • fixed tree traversal problems.
  • 05-07-2003, initial release.

Reference

  1. PugXML - A Small, Pugnacious XML Parser, by Kristen Wegner
  2. XML class for processing and building simple XML documents, by Ben Bryant
  3. IoBind, a serializer code factory, by Jonathan de Halleux
  4. CMarkupArchive, an extension to CMarkup, by Jonathan de Halleux