XmlBind: putting PugXML on steroids!






Jun 8, 2003
A mutant XML parser using IoBind, EDOM and PugXML.
Introduction
XmlBind is an XML I/O helper class. It is based on three ingredients:
- PugXML, a great XML DOM parser available on CP (see [1]),
- EDOM, presented by firstobject.com, a simplified approach for creating and exploring XML files (see [2]),
- IoBind, a library for serializing objects to/from strings (see [3]).
The goal of XmlBind is to make it easier to add and retrieve information in XML:
- Use PugXML, a powerful, fast and complete XML class,
- Avoid lengthy DOM operations: add a node and its value in one function call,
- Easy traversal of the DOM tree,
- Do not worry about conversion to/from string: let IoBind take care of it for you:
  - STL container serialization,
  - base64 conversion,
  - escaping of XML reserved characters,
  - etc.
- Intelligent skipping of nodes.
Quick example
Suppose that you have a small structure that you want to be serialized in XML:
struct data_set
{
string name;
vector< pair<float,float> > points;
};
The XML output file would look like this:
<data_set>
<name>the name</name>
<points>(0,1),(2,3),...</points>
</data_set>
Classic solution
If you use DOM and do not have any helper to transform points into a string, this can be a lengthy job:
data_set d;
xml_node node;
// creating data set node
xml_node data_set_node=node.append_child( node_element );
data_set_node.name("data_set");
// adding name
xml_node name_node=data_set_node.append_child( node_element );
name_node.name("name");
xml_node name_value_node=name_node.append_child( node_pcdata );
name_value_node.value( d.name );
// converting points to a string of the form "(x,y),(x,y),..."
ostringstream output;
for (size_t i = 0; i< d.points.size(); ++i)
{
    // separate the pairs with a comma, except before the first one
    if (i != 0)
        output<<",";
    output<<"("<<d.points[i].first<<","<<d.points[i].second<<")";
}
// adding points to xml
xml_node points_node=data_set_node.append_child( node_element );
points_node.name("points");
xml_node points_value_node=points_node.append_child( node_pcdata );
points_value_node.value( output.str() );
This already seems long enough and I'm not speaking about parsing back the data.
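To give an idea of what the round trip involves, here is a minimal standard-library sketch of both directions for the points field. It is not PugXML, EDOM or IoBind code: points_to_string and points_from_string are hypothetical names, and the (x,y) format is assumed from the sample XML above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<float, float> > point_list;

// Hypothetical helper: renders the points as "(a,b),(c,d),...".
std::string points_to_string(const point_list& points)
{
    std::ostringstream output;
    for (std::size_t i = 0; i < points.size(); ++i)
    {
        if (i != 0)
            output << ",";
        output << "(" << points[i].first << "," << points[i].second << ")";
    }
    return output.str();
}

// Hypothetical helper: parses "(a,b),(c,d),..." back into a list of points.
// Returns false on malformed input; `points` is only modified on success.
bool points_from_string(const std::string& text, point_list& points)
{
    point_list parsed;
    if (!text.empty())
    {
        std::istringstream input(text);
        for (;;)
        {
            char open = 0, comma = 0, close = 0;
            float first = 0.0f, second = 0.0f;
            if (!(input >> open >> first >> comma >> second >> close))
                return false;
            if (open != '(' || comma != ',' || close != ')')
                return false;
            parsed.push_back(std::make_pair(first, second));

            char separator = 0;
            if (!(input >> separator))
                break;              // end of input: we are done
            if (separator != ',')
                return false;       // anything else between pairs is an error
        }
    }
    points.swap(parsed);
    return true;
}
```

Even this small sketch needs explicit error handling for every delimiter, which is the kind of work the rest of the article delegates to IoBind.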
XmlBind solution
XmlBind provides some handy and very customizable wrappers for writing data to XML:
xml_bind xb;
xb.add_child_elem("data_set");
{
    // going deeper into the xml tree
    // when si is destroyed, we go back
    scoped_into si(xb);
    // adding a child element + value
    xb.add_child_elem("name",d.name);
    // adding a child element + rendering values to string
    xb.add_child_elem_p(
        "points",
        d.points.begin(),
        d.points.end(),
        sequence_to_string_p << pair_to_string_p );
}
As you can see, the XmlBind solution is much shorter and more intuitive. There are two reasons for that:
- EDOM simplifies the creation of XML files,
- IoBind is used to transform points into a string.
Note also that XmlBind gives you the tools to read back the data from the created XML document:
// xml_string holds the text of the XML document
xml_bind xb(xml_string);
if(!xb.find_child_elem("data_set"))
    return false;
{
    scoped_into si(xb);
    xb.find_get_child_data("name",d.name);
    xb.find_get_child_data_p(
        "points",
        d.points,
        sequence_from_string_p
        << (
            pair_from_string_p
            << from_string<float>()
            >> from_string<float>()
        )
    );
}
This concludes this small example. The following sections explain how XmlBind works and how it can work for you.
XmlBind overview
XmlBind comes as a single wrapper around the PugXML xml_parser:
class xml_bind : public xml_parser
xml_bind contains a "cursor" that stores the current node and the current child node. This cursor is used to:
- Add new nodes after the current node,
- Get the data of the current node,
- Traverse the XML tree.
xml_bind provides several helper methods to append and retrieve data.
Tree traversal
Going up and down in the tree
As stated above, xml_bind contains a cursor that can be used to explore the XML tree. This is the EDOM idea.
To go down into the tree, use into_elem(), and to go back up, use out_of_elem(). As in the example, you can also use the helper class scoped_into that takes care of that for you. Here is a piece of XML and the code to explore it:
<node>
<child>
<subchild/>
</child>
<child2/>
</node>
xml_bind xb(xml_string);
xb.find_child_elem("node");
{
    // entering <node>
    scoped_into si(xb);
    xb.find_child_elem("child");
    {
        // entering <child>
        scoped_into si(xb);
        xb.find_child_elem("subchild");
    }
    // back in <node>
    xb.find_child_elem("child2");
}
As you see, you can intuitively traverse your XML tree using the cursor.
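The scope-guard idea behind scoped_into can be sketched independently of XmlBind. The classes below are toy stand-ins with hypothetical names (toy_cursor, toy_scoped_into); the real xml_bind cursor tracks nodes, whereas this one only tracks depth, which is enough to show how the destructor pairs each into_elem() with an out_of_elem().

```cpp
#include <cassert>

// Toy stand-in for the cursor (not the real xml_bind class):
// it only remembers how deep we are in the tree.
class toy_cursor
{
public:
    toy_cursor() : depth_(0) {}
    void into_elem()   { ++depth_; }
    void out_of_elem() { assert(depth_ > 0); --depth_; }
    int  depth() const { return depth_; }
private:
    int depth_;
};

// Sketch of a guard in the spirit of scoped_into: goes down on
// construction and back up on destruction, even on early return.
class toy_scoped_into
{
public:
    explicit toy_scoped_into(toy_cursor& cursor) : cursor_(cursor)
    {
        cursor_.into_elem();
    }
    ~toy_scoped_into()
    {
        cursor_.out_of_elem();
    }
private:
    toy_cursor& cursor_;
    // non-copyable: copying would call out_of_elem() twice
    toy_scoped_into(const toy_scoped_into&);
    toy_scoped_into& operator=(const toy_scoped_into&);
};
```

Because the guard undoes its own move, nested blocks in the source code mirror the nesting of the XML elements, and a forgotten out_of_elem() call becomes impossible.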
Iterating elements
You can iterate over the elements or child elements using a while(next) idiom similar to .NET enumerators:
// resetting the child cursor
xb.reset_child();
// next_child returns false when the end is reached
while( xb.next_child())
{
    // getting the current node
    xml_node n( xb.get_current_child() );
}
Of course, you can do the same for elements, and the iteration can be restricted to nodes with a given name:
// iterating on node data
while( xb.next_child("data"))
...
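The while(next) idiom can be sketched with a toy class. toy_children is a hypothetical stand-in that walks a flat list of child names instead of real XML nodes; it assumes, as the snippets above suggest, that the cursor starts before the first child and that next_child returns false once the end is passed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in (not the real xml_bind): iterates over child names.
class toy_children
{
public:
    explicit toy_children(const std::vector<std::string>& names)
        : names_(names), pos_(0), started_(false) {}

    // Puts the cursor back before the first child.
    void reset_child() { pos_ = 0; started_ = false; }

    // Moves to the next child; returns false when the end is reached.
    bool next_child()
    {
        if (!started_)
            started_ = true;            // first call lands on child 0
        else if (pos_ < names_.size())
            ++pos_;
        return pos_ < names_.size();
    }

    // Moves to the next child with the given name, skipping the others.
    bool next_child(const std::string& name)
    {
        while (next_child())
            if (names_[pos_] == name)
                return true;
        return false;
    }

    // Only valid after a next_child call that returned true.
    const std::string& current_child() const { return names_[pos_]; }

private:
    std::vector<std::string> names_;
    std::size_t pos_;
    bool started_;
};
```

With children named data, meta, data, a plain while( c.next_child() ) loop visits three children, while while( c.next_child("data") ) visits only the two data children.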
Skipping nodes
When traversing the tree, you might want to skip some node types such as comments or DTD declarations. To do so, specify which node types should be skipped:
xb.skip_pi(); // skip pi declarations
xb.skip_dtd(); // skip dtd declarations
xb.skip_comment(false); // do not skip comments
xb.skip( node_cdata, true ); // skip cdata nodes
By default, no types are skipped.
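One simple way to picture such a filter is a table of per-type flags consulted during traversal. The sketch below is an assumption about the general technique, not the XmlBind implementation; toy_node_type and toy_skip_filter are hypothetical names, and the enumeration is not the PugXML one.

```cpp
#include <cstddef>
#include <vector>

// Toy node types for illustration only (not the PugXML enumeration).
enum toy_node_type
{
    toy_element,
    toy_pcdata,
    toy_cdata,
    toy_comment,
    toy_pi,
    toy_dtd,
    toy_type_count   // number of types, keep last
};

// One flag per node type; nothing is skipped by default.
class toy_skip_filter
{
public:
    toy_skip_filter()
    {
        for (int i = 0; i < toy_type_count; ++i)
            skip_[i] = false;
    }

    void skip(toy_node_type type, bool enable = true) { skip_[type] = enable; }
    bool is_skipped(toy_node_type type) const { return skip_[type]; }

    // Counts the nodes a traversal would actually stop on.
    std::size_t count_visible(const std::vector<toy_node_type>& nodes) const
    {
        std::size_t visible = 0;
        for (std::size_t i = 0; i < nodes.size(); ++i)
            if (!is_skipped(nodes[i]))
                ++visible;
        return visible;
    }

private:
    bool skip_[toy_type_count];
};
```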
Saving, restoring the cursor state
You can save and restore the cursor state:
xml_bind::state st( xb.get_state() );
// doing stuff on xb...
xb.set_state(st);
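Saving and restoring a cursor is essentially a snapshot of its position. As a rough sketch under that assumption (the real xml_bind::state may hold different data), the hypothetical toy_path_cursor below stores its position as the list of child indices from the root and copies that list in and out.

```cpp
#include <cstddef>
#include <vector>

// Toy cursor (not xml_bind): the position is the list of child
// indices followed from the root down to the current node.
class toy_path_cursor
{
public:
    typedef std::vector<std::size_t> state;   // snapshot of a position

    void into_child(std::size_t index) { path_.push_back(index); }
    void out_of_child()
    {
        if (!path_.empty())
            path_.pop_back();
    }
    std::size_t depth() const { return path_.size(); }

    // Returns a copy, so later moves do not affect the saved state.
    state get_state() const { return path_; }
    void set_state(const state& saved) { path_ = saved; }

private:
    state path_;
};
```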
Appending data
Methods
Adding data at the current cursor position is done using the append_child_elem methods. These methods apply to a single value or to a range of iterators:
- bool append_child_elem(LPCTSTR name_)
  Adds an empty node named name_. Note that all methods accept LPCTSTR or STL string.
- template<typename T> bool append_child_elem( LPCTSTR name_, T const& value_, bool as_cdata_ = false)
  Adds a node named name_ with value value_. If as_cdata_ is true, the value is added in a CDATA node. Note that T must support << with ostream.
- template<typename T, typename Policy> bool append_child_elem_p( LPCTSTR name_, T const& value_, Policy const& policy_, bool as_cdata_ = false)
  Adds a node named name_ with value_, converted to string by policy_ (policies are explained below), as its value.
- template<typename ContainerIterator> bool append_child_elem( LPCTSTR name_, ContainerIterator begin_, ContainerIterator end_, bool as_cdata_ = false)
  Adds a node named name_ and uses the range of data described by begin_ and end_, converted to string, as its data.
- template<typename ContainerIterator, typename Policy> bool append_child_elem_p( LPCTSTR name_, ContainerIterator begin_, ContainerIterator end_, Policy const& policy_, bool as_cdata_ = false)
  Adds a node named name_ and uses the range of data described by begin_ and end_, converted to string using policy_, as its data.
A few remarks on these methods:
- CDATA: For each method, you can choose whether to add the data in a CDATA section, using the as_cdata_ parameter (the default is not to). If the data is not added in a CDATA section, it is escaped (< becomes &lt;, etc.); otherwise it is added unchanged.
- Policies: The conversion of the data to string is done using policies. You can provide your own policy to suit your needs. See IoBind ([3]) for more details.
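The difference between the two modes can be illustrated with two small standard-library helpers. These are hypothetical functions (escape_xml, wrap_cdata), not the routines used by PugXML or IoBind: the first replaces the five XML reserved characters with entities, the second wraps the text unchanged in a CDATA section, splitting any "]]>" sequence because it cannot appear inside one.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: escapes the five XML reserved characters.
std::string escape_xml(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        switch (text[i])
        {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += text[i];  break;
        }
    }
    return out;
}

// Hypothetical helper: wraps the text in a CDATA section without escaping.
// A literal "]]>" would end the section early, so it is split in two:
// "]]" closes the current section and ">" starts the next one.
std::string wrap_cdata(const std::string& text)
{
    std::string out = "<![CDATA[";
    std::size_t start = 0;
    std::size_t pos;
    while ((pos = text.find("]]>", start)) != std::string::npos)
    {
        out.append(text, start, pos - start + 2);  // up to and including "]]"
        out += "]]><![CDATA[";                     // close, then reopen
        start = pos + 2;                           // resume at the ">"
    }
    out.append(text, start, std::string::npos);
    out += "]]>";
    return out;
}
```

CDATA keeps markup-heavy values readable in the file, while escaping keeps the value a plain text node; both forms yield the same text once parsed.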
Examples
- Add a string
  xml.add_child_elem("name","string");
- Add a double
  double d; xml.add_child_elem("name",d);
- Add a string in base64
  xml.add_child_elem_p("name","string", to_base64_p );
- Add a string as CDATA
  xml.add_child_elem("name","string", true);
- Add a container of ints
  vector<int> v; xml.add_child_elem_p("name",v.begin(),v.end(), sequence_to_string_p);
- Add a map
  map<int, string> m; xml.add_child_elem_p("name",m.begin(),m.end(), sequence_to_string_p << pair_to_string_p);
Reading back data
Methods
Data can be retrieved using the get_child_data methods, which take the data from the current node and transform it:
- template<typename T> bool get_child_data(T& value_)
  Reads the data of the current child node, transforms it to T and stores it into value_.
- template<typename T, typename Policy> bool get_child_data_p(T& value_, Policy const& policy_)
  Reads the data of the current child node, transforms it to T using policy_ and stores it into value_.
Note that all these methods return true if successful, false otherwise. If you are looking for a specific node, you can use find_get_child_data:
- template<typename T> bool find_get_child_data(LPCTSTR name_,T& value_)
  Reads the data of the first node named name_, transforms it to T and stores it into value_.
- template<typename T, typename Policy> bool find_get_child_data_p(LPCTSTR name_, T& value_, Policy const& policy_)
  Reads the data of the first node named name_, transforms it to T using policy_ and stores it into value_.
Examples
- Get value as string
  string s; xml.get_child_data(s);
- Get value as float
  float f; xml.get_child_data(f);
- Get value as pair<int,float>
  pair<int,float> p; xml.get_child_data_p( p, pair_from_string_p << from<int>() >> from<float>() );
- Get value as vector<float>
  vector<float> v; xml.get_child_data_p( v, sequence_from_string_p << from<float>() );
- Find an element and retrieve its value
  float f; xml.find_get_child_data("name",f);
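The "return true if successful" contract of these methods can be illustrated with a small stream-based conversion. This is a sketch under the assumption that T supports >> with istream; value_from_string is a hypothetical helper, not an IoBind function.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: converts text to T using operator>>.
// Returns false, leaving `value` untouched, if the text is not
// entirely consumed by the conversion (trailing whitespace is allowed).
template<typename T>
bool value_from_string(const std::string& text, T& value)
{
    std::istringstream input(text);
    T parsed = T();
    if (!(input >> parsed))
        return false;           // nothing convertible at the start

    char extra = 0;
    if (input >> extra)
        return false;           // leftover characters, e.g. "12abc"

    value = parsed;
    return true;
}
```

Checking the return value lets the caller keep a default when a node is missing or malformed, for example: float f = 1.0f; if (!value_from_string(text, f)) { /* f is still 1.0f */ }.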
Compiling XmlBind
You will need Boost and IoBind to use XmlBind:
- Boost can be found at http://www.boost.org/
- IoBind can be found at http://iobind.sourceforge.net/
- PugXML is included in IoBind
You need to compile IoBind. The declaration of xml_bind is in iobind/xml/xml_bind.hpp. Note that it is in the iobind::xml namespace.
Links
History
- 2-10-2003, big update:
  - integrated PugXML in IoBind, split the .h into .cpp and .h,
  - added skipping,
  - added STL string methods,
  - fixed tree traversal problems.
- 05-07-2003, initial release.