XmlBind: putting PugXML on steroids!

2 Oct 2003, 5 min read
A mutant XML parser using IoBind, EDOM and PugXML.

Introduction

XmlBind is an XML I/O helper class. It is based on three ingredients:

  • PugXML, a great XML DOM parser available on CP (see [1]),
  • EDOM, presented by firstobject.com, a simplified approach for creating and exploring XML files (see [2]),
  • IoBind, a library for serializing objects to/from strings (see [3]).

The goal of XmlBind is to make adding and retrieving information in XML easier:

  • Use PugXML, a powerful, fast and complete XML class,
  • Avoid lengthy DOM operations: add a node and its value in one function call,
  • Easy traversal of the DOM tree,
  • Do not worry about conversion to/from string: let IoBind take care of it for you:
    • STL container serialization,
    • base64 conversion,
    • XML reserved characters escaping,
    • etc...
  • Intelligent skipping of nodes.

Quick example

Suppose that you have a small structure that you want to be serialized in XML:

C++
struct data_set
{
    string name;
    vector< pair<float,float> > points;
};

The XML output file would look like this:

XML
<data_set>
    <name>the name</name>
    <points>(0,1),(2,3),...</points>
</data_set>

Classic solution

If you use DOM and do not have any helper to transform points into a string, this can be a lengthy job:

C++
data_set d;
xml_node node;

// creating the data set node
xml_node data_set_node=node.append_child( node_element );
data_set_node.name("data_set");

// adding the name
xml_node name_node=data_set_node.append_child( node_element );
name_node.name("name");
xml_node name_value_node=name_node.append_child( node_pcdata );
name_value_node.value( d.name );

// converting the points to a string of the form "(x,y),(x,y),..."
ostringstream output;
for (size_t i = 0; i < d.points.size(); ++i)
{
    // separator between pairs, none before the first one
    if (i != 0)
        output<<",";
    output<<"("<<d.points[i].first<<","<<d.points[i].second<<")";
}

// adding the points to the xml
xml_node points_node=data_set_node.append_child( node_element );
points_node.name("points");
xml_node points_value_node=points_node.append_child( node_pcdata );
points_value_node.value( output.str() );

This is already long, and it does not even cover parsing the data back.
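The string-building step is the part of the classic version that is easiest to get wrong (separators, empty containers). As a minimal sketch, it can be isolated in a small helper; points_to_string is a hypothetical name used for illustration only and is not part of PugXML, IoBind or XmlBind:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a library function): renders a sequence of pairs
// as "(a,b),(c,d),...". An empty sequence yields an empty string.
std::string points_to_string(const std::vector<std::pair<float, float>>& points)
{
    std::ostringstream output;
    for (std::size_t i = 0; i < points.size(); ++i)
    {
        if (i != 0)
            output << ',';  // separator between pairs, none before the first
        output << '(' << points[i].first << ',' << points[i].second << ')';
    }
    return output.str();
}
```

This is only the conversion; the DOM calls to create the element and its PCDATA child are still needed around it.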

XmlBind solution

XmlBind provides some handy and very customizable wrappers for writing data to XML:

C++
xml_bind xb;
xb.add_child_elem("data_set");
{
    // going deeper into the xml tree 
    // when si is destroyed, we go back
    scoped_into si(xb);
    // adding a child element + value
    xb.add_child_elem("name",d.name);
    // adding a child element + rendering values to string
    xb.add_child_elem_p(
         "points",
         d.points.begin(),
         d.points.end(),
         sequence_to_string_p << pair_to_string_p );
}

As you can see, the XmlBind solution is much shorter and more intuitive. There are two reasons for that:

  • EDOM simplifies the creation of XML files,
  • IoBind is used to transform the points into a string.

Note also that XmlBind gives you the tools to read the data back from the created XML document:

C++
xml_bind xb(xml_string);

if(!xb.find_child_elem("data_set"))
    return false;

{ 
    scoped_into si(xb);
    xb.find_get_child_data("name",d.name);
    xb.find_get_child_data_p(
       "points", 
       d.points,
       sequence_from_string_p 
         << (
             pair_from_string_p
               << from_string<float>()
               >> from_string<float>()
            )
       );
}
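To give an idea of the work that the from-string policies take off your hands, here is a hand-written parser for the "(a,b),(c,d),..." format, using only the standard library. It is a sketch under assumptions: points_from_string is a hypothetical helper, and the exact text format produced by IoBind may differ.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parses "(a,b),(c,d),..." into a vector of pairs.
// Returns false and leaves `points` untouched if the input is malformed.
bool points_from_string(const std::string& text,
                        std::vector<std::pair<float, float>>& points)
{
    std::vector<std::pair<float, float>> parsed;
    std::istringstream input(text);

    // Empty or whitespace-only input is treated as an empty sequence.
    if ((input >> std::ws).eof())
    {
        points.swap(parsed);
        return true;
    }

    for (;;)
    {
        char open = 0, comma = 0, close = 0;
        float first = 0.0f, second = 0.0f;
        if (!(input >> open >> first >> comma >> second >> close))
            return false;
        if (open != '(' || comma != ',' || close != ')')
            return false;
        parsed.emplace_back(first, second);

        // After a complete pair, expect either the end of input or a ','.
        char separator = 0;
        if (!(input >> separator))
            break;
        if (separator != ',')
            return false;
    }

    points.swap(parsed);
    return true;
}
```

The function follows the same convention as the XmlBind readers: it returns true on success and false otherwise.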

This concludes this small example. The rest of the article explains how XmlBind works and how it can work for you.

XmlBind overview

XmlBind comes as a single wrapper around PugXML xml_parser:

class xml_bind  : public xml_parser

xml_bind contains a "cursor" that stores the current node and the current child node. This cursor is used to:

  • Add new nodes after the current node,
  • Get the data of the current node,
  • Traverse the XML tree.

xml_bind provides several helper methods to append and retrieve data.

Tree traversal

Going up and down in the tree

As stated above, xml_bind contains a cursor that can be used to explore the XML tree. This is the EDOM idea.

To go one level down into the tree, use into_elem(); to come back up, use out_of_elem(). As in the example, you can also use the helper class scoped_into, which takes care of that for you. Here is a piece of XML and the code to explore it:

XML
<node>
   <child>
      <subchild/>
   </child>
   <child2/>
</node>

C++
xml_bind xb(xml_string);
xb.find_child_elem("node");
{
    scoped_into si(xb);
    xb.find_child_elem("child");
    {
        scoped_into si(xb);
        xb.find_child_elem("subchild");
    }
    xb.find_child_elem("child2");
}

As you see, you can intuitively traverse your XML tree using the cursor.
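The scoped helper is an application of the RAII idiom: go into the element in the constructor, come out in the destructor. The sketch below illustrates the idea on a toy cursor that only tracks its depth; toy_cursor and toy_scoped_into are hypothetical stand-ins and do not reproduce the actual xml_bind or scoped_into implementation.

```cpp
#include <cassert>

// Toy stand-in for the xml_bind cursor: it only tracks the current depth.
class toy_cursor
{
public:
    void into_elem() { ++depth_; }
    void out_of_elem()
    {
        assert(depth_ > 0);  // cannot go above the root
        --depth_;
    }
    int depth() const { return depth_; }

private:
    int depth_ = 0;
};

// RAII guard in the spirit of scoped_into: goes into the element on
// construction and comes back out on destruction, even on early return.
class toy_scoped_into
{
public:
    explicit toy_scoped_into(toy_cursor& cursor) : cursor_(cursor)
    {
        cursor_.into_elem();
    }
    ~toy_scoped_into() { cursor_.out_of_elem(); }

    toy_scoped_into(const toy_scoped_into&) = delete;
    toy_scoped_into& operator=(const toy_scoped_into&) = delete;

private:
    toy_cursor& cursor_;
};

// Returns the depth seen inside two nested scopes; both guards are
// destroyed when the function returns, restoring the cursor.
int depth_inside_two_scopes(toy_cursor& cursor)
{
    toy_scoped_into outer(cursor);
    toy_scoped_into inner(cursor);
    return cursor.depth();
}
```

The benefit of the guard is that every into_elem() is paired with an out_of_elem() without having to write the second call by hand.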

Iterating elements

You can iterate the elements or child elements using a while(next) semantic similar to .NET enumerators:

C++
// resetting the child cursor
xb.reset_child();
// next_child returns false when the end is reached
while( xb.next_child())
{
    // getting the current node
    xml_node n( xb.get_current_child() );
}

You can iterate over elements in the same way, and the iteration can be restricted to nodes with a given name:

C++
// iterating over the child nodes named "data"
while( xb.next_child("data"))
...
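The while(next) pattern can be illustrated without any XML at all. The following sketch uses a toy cursor over a flat list of child names; toy_children and count_children are hypothetical names, and the real xml_bind cursor walks PugXML nodes rather than strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy child cursor over a flat list of child names.
class toy_children
{
public:
    explicit toy_children(std::vector<std::string> names)
        : names_(std::move(names)), next_(0) {}

    // Puts the cursor back before the first child.
    void reset_child() { next_ = 0; }

    // Moves to the next child; returns false when the end is reached.
    bool next_child()
    {
        if (next_ >= names_.size())
            return false;
        ++next_;
        return true;
    }

    // Moves to the next child with the given name, skipping the others.
    bool next_child(const std::string& name)
    {
        while (next_child())
            if (current_child() == name)
                return true;
        return false;
    }

    // Valid only after next_child() has returned true.
    const std::string& current_child() const
    {
        assert(next_ > 0 && next_ <= names_.size());
        return names_[next_ - 1];
    }

private:
    std::vector<std::string> names_;
    std::size_t next_;  // index of the next child to visit
};

// Counts the children with a given name using the while(next) pattern.
std::size_t count_children(toy_children& children, const std::string& name)
{
    std::size_t count = 0;
    children.reset_child();
    while (children.next_child(name))
        ++count;
    return count;
}
```

As with .NET enumerators, the cursor starts before the first item, so the first call to next_child() moves onto it.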

Skipping nodes

When traversing the tree, you might want to skip some node types, such as comments or DTD declarations. To do so, you can specify which node types should be skipped:

C++
xb.skip_pi(); // skips processing instructions
xb.skip_dtd(); // skips DTD declarations
xb.skip_comment(false); // do not skip comments
xb.skip( node_cdata, true ); // skip cdata nodes

By default, no types are skipped.
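One simple way to implement this kind of filter is a set of per-type flags that the traversal consults before stopping on a node. The sketch below is an assumption about how such a filter could look, not the XmlBind implementation; the toy_node_type enumeration and toy_skip_filter class are hypothetical and do not match the PugXML node type values.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node types, for illustration only.
enum toy_node_type
{
    toy_element = 0,
    toy_pcdata,
    toy_cdata,
    toy_comment,
    toy_pi,
    toy_dtd,
    toy_type_count  // number of types, used to size the flag set
};

// One flag per node type; all flags start cleared, so nothing is skipped
// by default.
class toy_skip_filter
{
public:
    void skip(toy_node_type type, bool enabled = true)
    {
        skipped_.set(static_cast<std::size_t>(type), enabled);
    }
    bool is_skipped(toy_node_type type) const
    {
        return skipped_.test(static_cast<std::size_t>(type));
    }

private:
    std::bitset<toy_type_count> skipped_;
};

// Counts the nodes a traversal would stop on with the given filter.
std::size_t count_visible(const std::vector<toy_node_type>& nodes,
                          const toy_skip_filter& filter)
{
    std::size_t visible = 0;
    for (toy_node_type type : nodes)
        if (!filter.is_skipped(type))
            ++visible;
    return visible;
}
```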

Saving, restoring the cursor state

You can save and restore the cursor state:

C++
xml_bind::state st( xb.get_state() );
// doing stuff on xb...
xb.set_state(st);

Appending data

Methods

Adding data at the current cursor position is done using the append_child_elem methods. These methods apply to a single value or to a range of iterators:

  • bool append_child_elem(LPCTSTR name_)

    Adds an empty node, named name_. Note that all methods accept LPCTSTR or STL string.

  • template<typename T>
    bool append_child_elem(
             LPCTSTR name_, 
             T const& value_, 
             bool as_cdata_ = false)

    Adds a node named name_ with value value_. If as_cdata_ is true, the value is added in a CDATA node. Note that T must support << with ostream.

  • template<typename T, typename Policy>
    bool append_child_elem_p(
             LPCTSTR name_,
             T const& value_,
             Policy const& policy_,
             bool as_cdata_ = false)

    Adds a node named name_ whose value is value_ converted to a string by policy_ (policies are discussed in the remarks below).

  • template<typename ContainerIterator>
    bool append_child_elem(
        LPCTSTR name_, 
        ContainerIterator begin_,
        ContainerIterator end_,
        bool as_cdata_ = false)

    Adds a node named name_ whose data is the range described by begin_ and end_, converted to a string.

  • template<typename ContainerIterator, typename Policy>
    bool append_child_elem_p(
        LPCTSTR name_, 
        ContainerIterator begin_,
        ContainerIterator end_,
        Policy const& policy_,
        bool as_cdata_ = false)

    Adds a node named name_ whose data is the range described by begin_ and end_, converted to a string using policy_.

There are a few remarks to make about these methods:

  • CDATA: For each method, you can choose whether to add the data in a CDATA section, using the as_cdata_ parameter (the default is not to). If the data is not added in a CDATA section, it is escaped (< becomes &lt;, etc.); otherwise it is added unchanged,
  • Policies: The conversion of the data to a string is done using policies. You can provide your own policy to suit your needs. See IoBind ([3]) for more details.
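The difference between the escaped and the CDATA form can be sketched with the standard library alone. The helpers escape_xml and render_value below are hypothetical names for illustration; XmlBind delegates this work to IoBind, whose actual implementation is not shown here.

```cpp
#include <cassert>
#include <string>

// Replaces the five XML reserved characters by their predefined entities.
std::string escape_xml(const std::string& text)
{
    std::string escaped;
    escaped.reserve(text.size());
    for (char c : text)
    {
        switch (c)
        {
        case '&':  escaped += "&amp;";  break;
        case '<':  escaped += "&lt;";   break;
        case '>':  escaped += "&gt;";   break;
        case '"':  escaped += "&quot;"; break;
        case '\'': escaped += "&apos;"; break;
        default:   escaped += c;        break;
        }
    }
    return escaped;
}

// Renders a value either inside a CDATA section (unchanged) or escaped.
// Caveat: a value containing "]]>" cannot be stored in a single CDATA
// section; this sketch does not handle that case.
std::string render_value(const std::string& value, bool as_cdata)
{
    return as_cdata ? "<![CDATA[" + value + "]]>" : escape_xml(value);
}
```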

Examples

  • Add a string

    C++
    xml.add_child_elem("name","string");
  • Add a double

    C++
    double d = 1.5;
    xml.add_child_elem("name",d);
  • Add a string in base64

    C++
    xml.add_child_elem_p("name","string", to_base64_p );
  • Add a string as CDATA

    C++
    xml.add_child_elem("name","string", true);
  • Add a container of ints

    C++
    vector<int> v;
    xml.add_child_elem_p("name",v.begin(),v.end(), sequence_to_string_p);
  • Add a map

    C++
    map<int, string> m;
    xml.add_child_elem_p("name",m.begin(),m.end(), 
        sequence_to_string_p << pair_to_string_p);

Reading back data

Methods

Data can be retrieved using the get_child_data methods that take and transform the data from the current node:

  • template<typename T>
    bool get_child_data(T& value_)

    Reads the data of the current child node, transforms it to T and stores it into value_.

  • template<typename T, typename Policy>
    bool get_child_data_p(T& value_, Policy const& policy_)

    Reads the data of the current child node, transforms it to T using policy_ and stores it into value_.

Note that all these methods return true if successful, false otherwise. If you are looking for a specific node, you can use the find_get_child_data methods:

  • template<typename T>
    bool find_get_child_data(LPCTSTR name_,T& value_)

    Reads the data of the first node named name_, transforms it to T and stores it into value_.

  • template<typename T, typename Policy>
    bool find_get_child_data_p(LPCTSTR name_, T& value_, 
                                           Policy const& policy_)

    Reads the data of the first node named name_, transforms it to T using policy_ and stores it into value_.
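For types that support >> with istream, the default "transform it to T" step can be pictured as a stream extraction that reports success or failure. The following is a minimal sketch of that idea, not the XmlBind code; value_from_string is a hypothetical name, and it assumes T is default-constructible.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Converts text to a T using operator>>. Returns true on success; on
// failure (bad format or trailing characters), value is left untouched.
template <typename T>
bool value_from_string(const std::string& text, T& value)
{
    std::istringstream input(text);
    T parsed{};
    if (!(input >> parsed))
        return false;
    // Reject trailing non-whitespace characters such as "12abc".
    if (!(input >> std::ws).eof())
        return false;
    value = parsed;
    return true;
}
```

Returning a bool instead of throwing matches the convention of the get_child_data methods, which return true if successful and false otherwise.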

Examples

  • Get value as string

    string s;
    xml.get_child_data(s);
  • Get value as float

    float f;
    xml.get_child_data(f);
  • Get value as pair<int,float>

    pair<int,float> p;
    xml.get_child_data_p(
        p, 
        pair_from_string_p 
            << from<int>()
            >> from<float>()
        );
  • Get value as vector<float>

    vector<float> v;
    xml.get_child_data_p(
        v,
        sequence_from_string_p 
           << from<float>()
        );
  • Find an element and retrieve its value

    float f;
    xml.find_get_child_data("name",f);

Compiling XmlBind

You will need Boost and IoBind to use XmlBind.

IoBind must be compiled first. The declaration of xml_bind is in iobind/xml/xml_bind.hpp. Note that it is in the iobind::xml namespace.

History

  • 2-10-2003, big update:
    • integrated PugXML into IoBind, split the .h into .cpp and .h,
    • added skipping,
    • added STL string methods,
    • fixed tree traversal problems.
  • 05-07-2003, initial release.

Reference

  1. PugXML - A Small, Pugnacious XML Parser, by Kristen Wegner
  2. XML class for processing and building simple XML documents, by Ben Bryant
  3. IoBind, a serializer code factory, by Jonathan de Halleux
  4. CMarkupArchive, an extension to CMarkup, by Jonathan de Halleux

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


Written By
Engineer
United States
Jonathan de Halleux is a Civil Engineer in Applied Mathematics. He finished his PhD in 2004 in the rainy country of Belgium. After 2 years in the Common Language Runtime (i.e. .NET), he is now working at Microsoft Research on Pex (http://research.microsoft.com/pex).
