
XmlBind: putting PugXML on steroids!

By Jonathan de Halleux, 2 Oct 2003
 

Introduction

XmlBind is an XML I/O helper class. It is based on three ingredients:

  • PugXML, a great XML DOM parser available on CodeProject (see [1]),
  • EDOM, presented by firstobject.com, a simplified approach for creating and exploring XML files (see [2]),
  • IoBind, a library for serializing objects to/from strings (see [3]).

The goal of XmlBind is to make it easier to add and retrieve information in XML:

  • Use PugXML, a powerful, fast and complete XML class,
  • Avoid lengthy DOM operations: add a node and its value in one function call,
  • Easy traversal of the DOM tree,
  • Do not worry about conversion to/from string; let IoBind take care of it for you:
    • STL container serialization,
    • base64 conversion,
    • XML reserved characters escaping,
    • etc...
  • Intelligent skipping of nodes.

Quick example

Suppose that you have a small structure that you want to be serialized in XML:

struct data_set
{
    string name;
    vector< pair<float,float> > points;
};

The XML output file would look like this:

<data_set>
    <name>the name</name>
    <points>(0,1),(2,3),...</points>
</data_set>

Classic solution

If you use DOM and do not have any helper to transform points into a string, this can be a lengthy job:

data_set d;
xml_node node;

// creating the data set node
xml_node data_set_node=node.append_child( node_element );
data_set_node.name("data_set");

// adding name
xml_node name_node=data_set_node.append_child( node_element );
name_node.name("name");
xml_node name_value_node=name_node.append_child( node_pcdata );
name_value_node.value( d.name );

// converting points to a string of the form "(x,y),(x,y),..."
ostringstream output;
for (size_t i = 0; i < d.points.size(); ++i)
{
    // separator goes between items, not before the first one
    if (i != 0)
        output<<",";
    output<<"("<<d.points[i].first<<","<<d.points[i].second<<")";
}

// adding points to xml
xml_node points_node=data_set_node.append_child( node_element );
points_node.name("points");
xml_node points_value_node=points_node.append_child( node_pcdata );
points_value_node.value( output.str() );

This is already long, and it does not even cover parsing the data back.

XmlBind solution

XmlBind provides some handy and very customizable wrappers for writing data to XML:

xml_bind xb;
xb.add_child_elem("data_set");
{
    // going deeper into the xml tree 
    // when si is destroyed, we go back
    scoped_into si(xb);
    // adding a child element + value
    xb.add_child_elem("name",d.name);
    // adding a child element + rendering values to string
    xb.add_child_elem_p(
         "points",
         d.points.begin(),
         d.points.end(),
         sequence_to_string_p << pair_to_string_p );
}

As you can see, the XmlBind solution is much shorter and more intuitive. There are two reasons for that:

  • EDOM simplifies the creation of XML files,
  • IoBind is used to transform the points into a string.
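To picture what a composed policy such as sequence_to_string_p << pair_to_string_p has to do, here is a minimal sketch in plain standard C++. It is not IoBind code: pair_to_string, sequence_to_string and example_points are hypothetical names chosen for this illustration, and the "(x,y)" format is taken from the XML output shown earlier.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of IoBind): renders a pair as "(first,second)".
template <typename A, typename B>
std::string pair_to_string(const std::pair<A, B>& p)
{
    std::ostringstream out;
    out << "(" << p.first << "," << p.second << ")";
    return out.str();
}

// Hypothetical helper: renders [begin, end) as a comma-separated list and
// delegates the formatting of each item to item_to_string.
template <typename Iterator, typename ItemToString>
std::string sequence_to_string(Iterator begin, Iterator end,
                               ItemToString item_to_string)
{
    std::string result;
    for (Iterator it = begin; it != end; ++it)
    {
        if (it != begin)
            result += ",";
        result += item_to_string(*it);
    }
    return result;
}

// Usage: two points are rendered as "(0,1),(2,3)".
void example_points()
{
    std::vector<std::pair<float, float> > points;
    points.push_back(std::make_pair(0.f, 1.f));
    points.push_back(std::make_pair(2.f, 3.f));

    const std::string text = sequence_to_string(
        points.begin(), points.end(), &pair_to_string<float, float>);
    assert(text == "(0,1),(2,3)");
}
```

The sequence formatter knows nothing about pairs and the pair formatter knows nothing about sequences, which is the kind of separation that lets policies be combined freely.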

XmlBind also gives you the tools to read the data back from the created XML document:

xml_bind xb(xml_string);

if(!xb.find_child_elem("data_set"))
    return false;

{ 
    scoped_into si(xb);
    xb.find_get_child_data("name",d.name);
    xb.find_get_child_data_p(
       "points", 
       d.points,
       sequence_from_string_p 
         << (
             pair_from_string_p
               << from_string<float>()
               >> from_string<float>()
            )
       );
}

This concludes this small example. The rest of the article explains how XmlBind works and how it can work for you.
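For readers who want to see what reading the points back involves, the following sketch parses the "(x,y),(x,y),..." text by hand with standard streams. It does not use IoBind; points_from_string and example_parse are hypothetical names, and the strictness rules (no trailing comma, output untouched on failure) are choices made for this illustration only.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of IoBind): parses "(a,b),(c,d),..." into
// points. Returns false on malformed input, in which case points is untouched.
bool points_from_string(const std::string& text,
                        std::vector<std::pair<float, float> >& points)
{
    std::vector<std::pair<float, float> > parsed;
    std::istringstream in(text);
    char c = 0;

    if (in >> c) // an empty input is treated as a valid, empty sequence
    {
        for (;;)
        {
            float first = 0.f, second = 0.f;
            char comma = 0, close = 0;
            if (c != '(' || !(in >> first >> comma >> second >> close)
                || comma != ',' || close != ')')
                return false;
            parsed.push_back(std::make_pair(first, second));

            if (!(in >> c))
                break;        // end of input after a complete pair
            if (c != ',' || !(in >> c))
                return false; // bad separator or trailing comma
        }
    }
    points.swap(parsed);
    return true;
}

// Usage: a well-formed string yields two points.
void example_parse()
{
    std::vector<std::pair<float, float> > points;
    assert(points_from_string("(0,1),(2,3)", points));
    assert(points.size() == 2);
    assert(points[0].first == 0.f && points[1].second == 3.f);
}
```

Like the XmlBind getters, the function reports failure through its return value rather than by throwing.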

XmlBind overview

XmlBind comes as a single class derived from the PugXML xml_parser:

class xml_bind  : public xml_parser

xml_bind contains a "cursor" that stores the current node and the current child node. This cursor is used to:

  • Add new nodes after the current node,
  • Get the data of the current node,
  • Traverse the XML tree.

xml_bind provides several helper methods to append and retrieve data.

Tree traversal

Going up and down in the tree

As stated above, xml_bind contains a cursor that can be used to explore the XML tree. This is the EDOM idea.

To go one level down into the tree, use into_elem(); to come back up, use out_of_elem(). As in the example, you can also use the helper class scoped_into, which takes care of that for you. Here is a piece of XML and the code to explore it:

<node>
   <child>
      <subchild/>
   </child>
   <child2/>
</node>

xml_bind xb(xml_string);
xb.find_child_elem("node");
{
    scoped_into si(xb);
    xb.find_child_elem("child");
    {
        scoped_into si(xb);
        xb.find_child_elem("subchild");
    }
    xb.find_child_elem("child2");
}

As you see, you can intuitively traverse your XML tree using the cursor.
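The scoped_into helper is an instance of the usual C++ scope-guard idiom. The sketch below shows the idea on a toy cursor; toy_cursor and toy_scoped_into are hypothetical classes written for this illustration, and unlike the real scoped_into the toy guard takes the element name as an argument because there is no real tree behind it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for the xml_bind cursor: it only records the names of the
// elements entered so far. The real class moves through PugXML nodes.
class toy_cursor
{
public:
    void into_elem(const std::string& name) { m_path.push_back(name); }

    void out_of_elem()
    {
        assert(!m_path.empty()); // cannot go above the root
        m_path.pop_back();
    }

    std::size_t depth() const { return m_path.size(); }

private:
    std::vector<std::string> m_path;
};

// Guard in the spirit of scoped_into: goes one level down on construction
// and back up on destruction, even if the scope is left early.
class toy_scoped_into
{
public:
    toy_scoped_into(toy_cursor& cursor, const std::string& name)
        : m_cursor(cursor)
    {
        m_cursor.into_elem(name);
    }

    ~toy_scoped_into() { m_cursor.out_of_elem(); }

private:
    toy_scoped_into(const toy_scoped_into&);            // non-copyable
    toy_scoped_into& operator=(const toy_scoped_into&); // non-copyable

    toy_cursor& m_cursor;
};

// Usage: the depth follows the nesting of the braces.
void example_scoped_into()
{
    toy_cursor cursor;
    {
        toy_scoped_into node(cursor, "node");
        {
            toy_scoped_into child(cursor, "child");
            assert(cursor.depth() == 2);
        }
        assert(cursor.depth() == 1);
    }
    assert(cursor.depth() == 0);
}
```

Tying the way back up to a destructor means an early return inside the block cannot leave the cursor at the wrong depth.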

Iterating elements

You can iterate the elements or child elements using a while(next) semantic similar to .NET enumerators:

// resetting the child cursor
xb.reset_child();
// next_child returns false once the end is reached
while( xb.next_child())
{
    // getting the current node
    xml_node n( xb.get_current_child() );
}

You can do the same for elements, and the iteration can be restricted to nodes with a given name:

// iterating on node data
while( xb.next_child("data"))
...
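The while(next) pattern can be illustrated without any XML at all. The sketch below is a toy enumerator over a list of names; toy_children is a hypothetical class, and only the method names reset_child, next_child and get_current_child are borrowed from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy enumerator over child names (not the real xml_bind implementation).
class toy_children
{
public:
    explicit toy_children(const std::vector<std::string>& names)
        : m_names(names), m_next(0), m_current(0) {}

    // Puts the cursor back before the first child.
    void reset_child() { m_next = 0; }

    // Moves to the next child; returns false once the end is reached.
    bool next_child()
    {
        if (m_next >= m_names.size())
            return false;
        m_current = m_next++;
        return true;
    }

    // Moves to the next child with the given name, skipping the others.
    bool next_child(const std::string& name)
    {
        while (next_child())
            if (m_names[m_current] == name)
                return true;
        return false;
    }

    // Only meaningful after next_child() has returned true.
    const std::string& get_current_child() const { return m_names[m_current]; }

private:
    std::vector<std::string> m_names;
    std::size_t m_next;
    std::size_t m_current;
};

// Usage: three children in total, two of them named "data".
void example_iteration()
{
    std::vector<std::string> names;
    names.push_back("data");
    names.push_back("meta");
    names.push_back("data");
    toy_children children(names);

    std::size_t all = 0;
    while (children.next_child())
        ++all;
    assert(all == 3);

    children.reset_child();
    std::size_t data_only = 0;
    while (children.next_child("data"))
        ++data_only;
    assert(data_only == 2);
}
```

Note that the named overload only returns true when a matching child was actually found, so the loop ends cleanly when the remaining children do not match.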

Skipping nodes

When traversing the tree, you might want to skip some node types, such as comments or DTD declarations. To do so, specify which node types should be skipped:

xb.skip_pi(); // skip processing instructions
xb.skip_dtd(); // skip DTD declarations
xb.skip_comment(false); // do not skip comments
xb.skip( node_cdata, true ); // skip cdata nodes

By default, no types are skipped.
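One simple way to keep track of such per-type settings is a bit mask with one bit per node type. The sketch below assumes that representation; it is not taken from the XmlBind sources, and toy_node_type and toy_skip_filter are hypothetical names (PugXML defines its own node type enumeration).

```cpp
#include <cassert>

// Hypothetical node kinds used only by this sketch.
enum toy_node_type
{
    toy_element = 0,
    toy_pcdata,
    toy_cdata,
    toy_comment,
    toy_pi,
    toy_dtd
};

// Remembers which node types a traversal should skip.
class toy_skip_filter
{
public:
    toy_skip_filter() : m_mask(0) {} // by default, nothing is skipped

    // Turns skipping on or off for one node type.
    void skip(toy_node_type type, bool enabled = true)
    {
        const unsigned bit = 1u << type;
        if (enabled)
            m_mask |= bit;
        else
            m_mask &= ~bit;
    }

    bool is_skipped(toy_node_type type) const
    {
        return (m_mask & (1u << type)) != 0;
    }

private:
    unsigned m_mask;
};

// Usage: skip comments and CDATA, then stop skipping comments again.
void example_skip_filter()
{
    toy_skip_filter filter;
    assert(!filter.is_skipped(toy_comment));

    filter.skip(toy_comment);
    filter.skip(toy_cdata, true);
    assert(filter.is_skipped(toy_comment) && filter.is_skipped(toy_cdata));

    filter.skip(toy_comment, false);
    assert(!filter.is_skipped(toy_comment) && filter.is_skipped(toy_cdata));
}
```

A traversal loop would then call is_skipped on each node it meets and move on when the answer is true.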

Saving, restoring the cursor state

You can save and restore the cursor state:

xml_bind::state st( xb.get_state() );
// doing stuff on xb...
xb.set_state(st);
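If the state is saved and restored around a block of work, the pair of calls can be wrapped in a small guard object so that the restore cannot be forgotten. This is only a sketch on a toy cursor: toy_cursor and toy_state_guard are hypothetical, and the real xml_bind::state presumably holds node handles rather than indices.

```cpp
#include <cassert>
#include <cstddef>

// Toy cursor (not the real xml_bind): its position is two indices that fit
// in a small, copyable state struct.
class toy_cursor
{
public:
    struct state
    {
        std::size_t elem;
        std::size_t child;
    };

    toy_cursor() { m_state.elem = 0; m_state.child = 0; }

    void move_to(std::size_t elem, std::size_t child)
    {
        m_state.elem = elem;
        m_state.child = child;
    }

    state get_state() const { return m_state; }
    void set_state(const state& s) { m_state = s; }

private:
    state m_state;
};

// Hypothetical guard: remembers the state on construction and puts it back
// on destruction, so a helper can move the cursor freely.
class toy_state_guard
{
public:
    explicit toy_state_guard(toy_cursor& cursor)
        : m_cursor(cursor), m_saved(cursor.get_state()) {}

    ~toy_state_guard() { m_cursor.set_state(m_saved); }

private:
    toy_state_guard(const toy_state_guard&);            // non-copyable
    toy_state_guard& operator=(const toy_state_guard&); // non-copyable

    toy_cursor& m_cursor;
    toy_cursor::state m_saved;
};

// Usage: the cursor is back where it started once the guard goes out of scope.
void example_state_guard()
{
    toy_cursor cursor;
    cursor.move_to(1, 2);
    {
        toy_state_guard guard(cursor);
        cursor.move_to(5, 0);
        assert(cursor.get_state().elem == 5);
    }
    assert(cursor.get_state().elem == 1 && cursor.get_state().child == 2);
}
```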

Appending data

Methods

Data is added at the current cursor position with the append_child_elem methods. These methods accept either a single value or a range of iterators:

  • bool append_child_elem(LPCTSTR name_)

    Adds an empty node, named name_. Note that all methods accept LPCTSTR or STL string.

  • template<typename T>
    bool append_child_elem(
             LPCTSTR name_, 
             T const& value_, 
             bool as_cdata_ = false)

    Adds a node named name_ and with value value_. If as_cdata_ is true, the value is added in a CDATA node. Note that T must support << with ostream.

  • template<typename T, typename Policy>
    bool append_child_elem_p(
             LPCTSTR name_,
             T const& value_,
             Policy const& policy_,
             bool as_cdata_ = false)

    Adds a node named name_ whose value is value_ converted to a string by policy_ (policies are explained below).

  • template<typename ContainerIterator>
    bool append_child_elem(
        LPCTSTR name_, 
        ContainerIterator begin_,
        ContainerIterator end_,
        bool as_cdata_ = false)

    Adds a node named name_ whose data is the range described by begin_ and end_, converted to a string.

  • template<typename ContainerIterator, typename Policy>
    bool append_child_elem_p(
        LPCTSTR name_, 
        ContainerIterator begin_,
        ContainerIterator end_,
        Policy const& policy_,
        bool as_cdata_ = false)

    Adds a node named name_ whose data is the range described by begin_ and end_, converted to a string using policy_.

A few remarks on these methods:

  • CDATA: For each method, you can choose whether to add the data in a CDATA section, using the as_cdata_ parameter (the default is not to). If the data is not added in a CDATA section, it is escaped (< to &lt;, etc.); otherwise it is added unchanged.
  • Policies: The conversion of the data to a string is done using policies. You can provide your own policy to suit your needs. See IoBind ([3]) for more details.
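The escaping mentioned in the CDATA remark replaces the XML reserved characters with their predefined entities. The sketch below shows one straightforward way to do it in standard C++; escape_xml is a hypothetical helper written for this illustration, not the routine used by IoBind.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: replaces the five XML reserved characters with the
// predefined entities, leaving every other character unchanged.
std::string escape_xml(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        switch (text[i])
        {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += text[i];  break;
        }
    }
    return out;
}

// Usage: markup characters in a value no longer break the document.
void example_escape()
{
    assert(escape_xml("a<b & c") == "a&lt;b &amp; c");
    assert(escape_xml("plain text") == "plain text");
}
```

Data stored in a CDATA section does not go through this step, which is why it is written unchanged.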

Examples

  • Add a string

    xml.add_child_elem("name","string");
  • Add a double

    double d;
    xml.add_child_elem("name",d);
  • Add a string in base64

    xml.add_child_elem_p("name","string", to_base64_p );
  • Add a string as CDATA

    xml.add_child_elem("name","string", true);
  • Add a container of ints

    vector<int> v;
    xml.add_child_elem_p("name",v.begin(),v.end(), sequence_to_string_p);
  • Add a map

    map<int, string> m;
    xml.add_child_elem_p("name",m.begin(),m.end(), 
        sequence_to_string_p << pair_to_string_p);

Reading back data

Methods

Data can be retrieved using the get_child_data methods that take and transform the data from the current node:

  • template<typename T>
    bool get_child_data(T& value_)

    Reads the data of the current child node, transforms it to T and stores it into value_.

  • template<typename T, typename Policy>
    bool get_child_data_p(T& value_, Policy const& policy_)

    Reads the data of the current child node, transforms it to T using policy_ and stores it into value_.

Note that all these methods return true if successful, false otherwise. If you are looking for a specific node, you can use the find_get_child_data methods:

  • template<typename T>
    bool find_get_child_data(LPCTSTR name_,T& value_)

    Reads the data of the first node named name_, transforms it to T and stores it into value_.

  • template<typename T, typename Policy>
    bool find_get_child_data_p(LPCTSTR name_, T& value_, 
                                           Policy const& policy_)

    Reads the data of the first node named name_, transforms it to T using policy_ and stores it into value_.
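The "return true if successful" convention can be sketched with a small conversion helper built on operator>>. This is an illustration only: value_from_string is a hypothetical name, and the rule that trailing characters make the conversion fail is an assumption of this sketch, not documented XmlBind behaviour.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: converts text to T with operator>>. Returns false and
// leaves value untouched if the text is not a T or has trailing characters.
// Because it relies on operator>>, a std::string target stops at whitespace.
template <typename T>
bool value_from_string(const std::string& text, T& value)
{
    std::istringstream in(text);
    T parsed = T();
    if (!(in >> parsed))
        return false;
    in >> std::ws; // tolerate trailing whitespace
    if (!in.eof())
        return false;
    value = parsed;
    return true;
}

// Usage: the caller checks the result instead of catching an exception.
void example_conversion()
{
    float f = 0.f;
    assert(value_from_string("3.5", f) && f == 3.5f);

    int i = 7;
    assert(!value_from_string("abc", i));
    assert(i == 7); // unchanged on failure
}
```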

Examples

  • Get value as string

    string s;
    xml.get_child_data(s);
  • Get value as float

    float f;
    xml.get_child_data(f);
  • Get value as pair<int,float>

    pair<int,float> p;
    xml.get_child_data_p(
        p, 
        pair_from_string_p 
            << from<int>()
            >> from<float>()
        );
  • Get value as vector<float>

    vector<float> v;
    xml.get_child_data_p(
        v,
        sequence_from_string_p 
           << from<float>()
        );
  • Find an element and retrieve its value

    float f;
    xml.find_get_child_data("name",f);

Compiling XmlBind

You need Boost and IoBind to use XmlBind.

IoBind must be compiled. The declaration of xml_bind is in iobind/xml/xml_bind.hpp. Note that it is in the iobind::xml namespace.

Links

History

  • 2-10-2003, big update:
    • integrated PugXml in IoBind, split the .h in .cpp, .h
    • added skipping,
    • added STL string methods,
    • fixed tree traversal problems
  • 05-07-2003, initial release.

Reference

  1. PugXML - A Small, Pugnacious XML Parser, by Kristen Wegner
  2. XML class for processing and building simple XML documents, by Ben Bryant
  3. IoBind, a serializer code factory, by Jonathan de Halleux
  4. CMarkupArchive, an extension to CMarkup, by Jonathan de Halleux

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Jonathan de Halleux
Engineer
United States
Jonathan de Halleux is a Civil Engineer in Applied Mathematics. He finished his PhD in 2004 in the rainy country of Belgium. After 2 years in the Common Language Runtime (i.e. .NET), he is now working at Microsoft Research on Pex (http://research.microsoft.com/pex).


Comments and Discussions

 
Question: Has any one know how to read attributes with xmlbind? (MagicMD, member, 3 Apr '07 - 20:32)
Has any one know how to read attributes with xmlbind?
 
There are methods to write attributes but there are none for reading them
 
Has any one got ideas for this, or has already dones this?
 
Thanks
Magic

General: more examples (funvill, member, 12 Feb '04 - 20:42)
Hello
I was wondering if you could provide more examples on how to parse xml files
It would be nice to have a simple example of how to get the value of parm1 out of child id=1
 

<node>
<child id="1">
<parm1>AAA</parm1>
<parm2>BBB</parm2>
</child>
<child id="2">
<parm1>CCC</parm1>
<parm2>DDD</parm2>
</child>
</node>

 
Thank you for your time
 

 
---
www.funvill.com
Question: ML A bug? (hihint, member, 11 Nov '03 - 20:45)
Hi, I got the latest version of XmlBind from iobind.sourceforge.net and xml_bind::next_child() does not seem to work properly. So I made the following modification and it works for me. Am I right or am I still missing something?
 
the code was
 
bool xml_bind::next_child(LPCTSTR name_)
{
    m_current_child_elem.moveto_next_sibling(name_);
    return skip_child_elems();
};
 
I changed to
 
bool xml_bind::next_child(LPCTSTR name_)
{
    if ( m_current_child_elem.moveto_next_sibling(name_) ) // by Hai
        return skip_child_elems();
    return false;
};
 
The problem is that m_current_child_elem.moveto_next_sibling(...) may already have failed, but the following skip_child_elems() does not take this failure into account.
 
hai
Answer: Re: ML A bug? (Anonymous, suss, 11 Nov '03 - 20:53)
In fact, it is a bug. Good job :)
General: Re: ML A bug? (MagicMD, member, 30 Apr '07 - 16:53)
There are a few with the same problem and need the same sort of fix
 
next_child() and next_child(LPCTSTR name_)
first_child(LPCTSTR name_)
 
next_elem may be wrong as well
General: You are having a big responsability on my programming style... (John A. Johnson, member, 7 Oct '03 - 4:12)
... now I'm "streamming", "binding" and "boosting" everything! ;-P
 
Thank you for sharing
Great article!
General: Re: You are having a big responsability on my programming style... (Jonathan de Halleux, member, 7 Oct '03 - 6:07)
cool feedback...
 
Don't hesitate to give your opinions about xml_bind. It is always a work in progress :)
 
Jonathan de Halleux.

www.pelikhan.com

General: Reading XML from file (Simon Riise, suss, 13 Jun '03 - 3:29)
I am having some difficulties parsing an XML doc from a file.
 
Following your example I tried to just load in an XML file instead of building it up in c++. I used the parse_file method and the file seems to load ok, as a cerr on the object displays the xml... But when I try to traverse the XML I get a runtime reference error - it seems like I am missing a step in order to have the file properly parsed/loaded?
 
Regards,
Simon
General: Re: Reading XML from file -> must set the cursor (Jonathan de Halleux, member, 13 Jun '03 - 4:41)
Humm, yes you load the file but the cursor is not set.
 
Add the following method:
void reset_elem()
{
   m_current_elem=document();
}
void reset_cursors()
{
   reset_elem();
   reset_child();
}
 
and call reset_cursors() after parse_file. Keep me updated if it works..
 
Jonathan de Halleux.

General: Re: Reading XML from file -> must set the cursor (Simon Riise, suss, 13 Jun '03 - 6:13)
Hmm... I tried, but still the same error.
 
I found out, that if I do not "step into" the first tag with .into_elem() before searching (as you do in your example), then it does not give an error when searching with .find_get_child_data(...) - but it is not able to find my tags.
If I do use .into_elem() before searching, then I get the memory reference error (looks like a null pointer).
 
I will not be back until Monday - have a nice weekend!
 
Thanks,
Simon
General: Re: Reading XML from file -> must set the cursor (Jonathan de Halleux, member, 2 Oct '03 - 23:55)
Did you try the latest version ?
 
Jonathan de Halleux.

General: Re: Reading XML from file -> must set the cursor (matom, member, 3 Nov '03 - 12:43)
I have also been having a similar problem and tried the solution you suggested to no avail. Is there any more information on how to solve this problem. It appears to be an un-initialised variable of some sort.
 
(I have the 1.2 release from sourceforge)
General: Re: Reading XML from file -> must set the cursor (Jonathan de Halleux, member, 3 Nov '03 - 23:50)
Could you post the code that fails.
 
Jonathan de Halleux.

www.pelikhan.com

General: Re: Reading XML from file -> must set the cursor (matom, member, 7 Nov '03 - 6:39)
My apologies. In my then sleep deprived state I had missed that your suggested fix redefined reset_elem() and I had used the original. Once I realised that I got things to work. As pennance I will send you the work I did to reduce the size of the pugxml.hpp file to assist in speeding up compilations.
General: Excellent Idea, doubt in methods (vgrigor, member, 9 Jun '03 - 21:33)
What if your parser not reliable,- like MS?
Have no all needed ablities?
 
It is better to write parser- independent code -as a layer or
library on top of the parser, that can be substituted by
user's desire.
they'll be free for choice of parser,
and if your is better they 'll choose your.
 
Give a choise to the user.
 


General: Re: Excellent Idea, doubt in methods (Jonathan de Halleux, member, 9 Jun '03 - 21:46)
vgrigor wrote:
It is better to write parser- independent code -as a layer or
library on top of the parser, that can be substituted by
user's desire.

 
It is! You can substitute any parser by passing a custom policy to the method : (get_child_data_p). If this is not enough, then in fact it is useless.
 

 
Jonathan de Halleux.

General: Re: Excellent Idea, doubt in methods (vgrigor, member, 9 Jun '03 - 22:04)
It is so important aspect,
that samples of this is crucial,
 
Please (it is better) provide a sample with using Ms parser to faster learning curve,
than inventing simplest things again by users, (parsing in depth your library code insead of work)-
you better know how to -please help adequately in important aspect.
 

Installation rules was supplied?
 
Your Work is good-
it is better to accomplish good for real use.

General: Re: Excellent Idea, doubt in methods (Jonathan de Halleux, member, 9 Jun '03 - 22:10)
Sorry I had not understood the first thread: you meant replacing PugXML by MS parser... I'll have a look at that and update the article.
vgrigor wrote:
Please (it is better) provide a sample with using Ms parser to faster learning curve,
than inventing simplest things again by users, (parsing in depth your library code insead of work)-
you better know how to -please help adequately in important aspect.

 
?
 
Jonathan de Halleux.

General: Re: Excellent Idea, doubt in methods (SGarratt, member, 3 Oct '03 - 9:12)
we stream gigs of xml through msxml and it never hiccups.
So what is your basis of saying it is not reliable ?

 
SGarratt
General: Re: Excellent Idea, doubt in methods (Anonymous, suss, 3 Oct '03 - 9:51)
I am not saying that MSXML is unreliable... I'm just saying it is tedious to write code that uses it.
General: Re: Excellent Idea, doubt in methods (SGarratt, member, 3 Oct '03 - 12:07)
I'm responding to "vgrigor" who said:
>> What if your parser not reliable,- like MS?
"anonymous" said:
I am not saying that MSXML is unreliable..so:

if (("anonoymous" == "vgrigor") && bIgnoreSuddenImprovmentInEnglish)
cout << "yes - you directly contradicted yourself";
else
cout << "wrong message response error";
 
forgive me... its Friday and I've had a long week
 
SGarratt
General: Re: Excellent Idea, doubt in methods (Jonathan de Halleux, member, 6 Oct '03 - 20:05)
cout<<(anonymous == vgrigor ? "true": "false");
 
-- output
false
 
but
 
cout<<(anonymous == Jonathan de Halleux ? "true": "false");
 
-- output
true

 
Jonathan de Halleux.

www.pelikhan.com

General: File not found (Majid Shahabfar, member, 8 Jun '03 - 22:52)
Cannot open include file: 'boost/spirit.hpp'

General: Re: File not found (Anonymous, suss, 9 Jun '03 - 0:07)
You need to install Boost 1.30.
 
See http://www.boost.org

Last Updated 3 Oct 2003
Article Copyright 2003 by Jonathan de Halleux