Introduction
XmlBind is an XML I/O helper class. It is based on three ingredients:
- PugXML, a great XML DOM parser available on CP (see [1]),
- EDOM, presented by firstobject.com, a simplified approach for creating and exploring XML files (see [2]),
- IoBind, a library for serializing objects to/from strings (see [3]).
The goal of XmlBind is to make it easier to add and retrieve information in XML:
- Use PugXML, a powerful, fast and complete XML class,
- Avoid lengthy DOM operations: add a node and its value in one function call,
- Easy traversal of the DOM tree,
- Do not worry about conversion to/from string: let IoBind take care of it for you:
  - STL container serialization,
  - base64 conversion,
  - XML reserved characters escaping,
  - etc.
- Intelligent skipping of nodes.
Quick example
Suppose that you have a small structure that you want to be serialized in XML:
struct data_set
{
string name;
vector< pair<float,float> > points;
};
The XML output file would look like this:
<data_set>
<name>the name</name>
<points>(0,1),(2,3),...</points>
</data_set>
Classic solution
If you use DOM and do not have any helper to transform points into a string, this can be a lengthy job:
data_set d;
xml_node node;
xml_node data_set_node=node.append_child( node_element );
data_set_node.name("data_set");
xml_node name_node=data_set_node.append_child( node_element );
name_node.name("name");
xml_node name_value_node=name_node.append_child( node_pcdata );
name_value_node.value( d.name );
ostringstream output;
// write the first point without a leading comma, then the remaining ones
if (!d.points.empty())
    output<<"("<<d.points[0].first<<","<<d.points[0].second<<")";
for (size_t i = 1; i< d.points.size(); ++i)
    output<<",("<<d.points[i].first<<","<<d.points[i].second<<")";
xml_node points_node=data_set_node.append_child( node_element );
points_node.name("points");
xml_node points_value_node=points_node.append_child( node_pcdata );
points_value_node.value( output.str() );
This is already long, and it does not even cover parsing the data back.
XmlBind solution
XmlBind provides some handy and very customizable wrappers for writing data to XML:
xml_bind xb;
xb.add_child_elem("data_set");
{
scoped_into si(xb);
xb.add_child_elem("name",d.name);
xb.add_child_elem_p(
"points",
d.points.begin(),
d.points.end(),
sequence_to_string_p << pair_to_string_p );
}
As you can see, the XmlBind solution is much shorter and more intuitive. There are two reasons for that:
- EDOM simplifies the creation of XML files,
- IoBind has been used to transform points into a string.
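To make the second point concrete, here is a minimal standalone sketch of the kind of conversion that the policy expression in the example performs: each pair is written as (first,second) and the items are joined with commas. The names pair_to_string, sequence_to_string and points_to_string are hypothetical helpers written for this illustration with the standard library only. They are not IoBind's API, and IoBind's actual output format may differ.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of IoBind): formats a pair as "(first,second)".
template <typename A, typename B>
std::string pair_to_string(const std::pair<A, B>& p)
{
    std::ostringstream out;
    out << "(" << p.first << "," << p.second << ")";
    return out.str();
}

// Hypothetical helper (not part of IoBind): converts each item of [begin, end)
// with item_to_string and joins the results with commas.
template <typename Iterator, typename ItemToString>
std::string sequence_to_string(Iterator begin, Iterator end,
                               ItemToString item_to_string)
{
    std::ostringstream out;
    for (Iterator it = begin; it != end; ++it)
    {
        if (it != begin)
            out << ",";
        out << item_to_string(*it);
    }
    return out.str();
}

// Convenience wrapper for the data_set example: a vector of float pairs.
inline std::string points_to_string(
    const std::vector<std::pair<float, float> >& points)
{
    return sequence_to_string(points.begin(), points.end(),
                              pair_to_string<float, float>);
}
```

The per-item conversion is passed in as a parameter, which is the same idea as composing a sequence policy with a pair policy: the outer function only knows how to join items, not how to format them.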
Note also that XmlBind gives you the tools to read the data back from the created XML document:
xml_bind xb(xml_string);
if(!xb.find_child_elem("data_set"))
return false;
{
scoped_into si(xb);
xb.find_get_child_data("name",d.name);
xb.find_get_child_data_p(
"points",
d.points,
sequence_from_string_p
<< (
pair_from_string_p
<< from_string<float>()
>> from_string<float>()
)
);
}
This concludes this small example. The sections below explain how XmlBind works and how you can use it.
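For comparison with the reading code above, here is a minimal sketch of what parsing the points string back by hand involves, assuming the "(a,b),(c,d)" format shown in the quick example. The function points_from_string is a hypothetical helper for this illustration. It is not IoBind's implementation of the from-string policies.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of IoBind): parses "(a,b),(c,d),..." into points.
// Returns false on malformed input; points is left untouched in that case.
inline bool points_from_string(const std::string& text,
                               std::vector<std::pair<float, float> >& points)
{
    std::vector<std::pair<float, float> > parsed;

    // An empty (or all-whitespace) string is treated as an empty sequence.
    if (text.find_first_not_of(" \t\r\n") == std::string::npos)
    {
        points.swap(parsed);
        return true;
    }

    std::istringstream in(text);
    for (;;)
    {
        // Read one "(first,second)" item.
        char open = 0, comma = 0, close = 0;
        float first = 0, second = 0;
        if (!(in >> open >> first >> comma >> second >> close))
            return false;
        if (open != '(' || comma != ',' || close != ')')
            return false;
        parsed.push_back(std::make_pair(first, second));

        // Either the input ends here or a ',' introduces the next item.
        char separator = 0;
        if (!(in >> separator))
            break;
        if (separator != ',')
            return false;
    }
    points.swap(parsed);
    return true;
}
```

Like the XmlBind reading methods, the sketch reports success through a bool return value, so the caller can stop as soon as one field fails to parse.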
XmlBind overview
XmlBind comes as a single wrapper around PugXML xml_parser:
class xml_bind : public xml_parser
xml_bind contains a "cursor" that stores the current node and the current child node. This cursor is used to:
- Add new nodes after the current node,
- Get the data of the current node,
- Traverse the XML tree.
xml_bind provides several helper methods to append and retrieve data.
Tree traversal
Going up and down in the tree
As stated above, xml_bind contains a cursor that can be used to explore the XML tree. This is the EDOM idea.
To go down into the tree, use into_elem(), and to come back up, use out_of_elem(). As in the example, you can also use the helper class scoped_into, which takes care of that for you. Here is a piece of XML and the code to explore it:
<node>
<child>
<subchild/>
</child>
<child2/>
</node>
xml_bind xb(xml_string);
xb.find_child_elem("node");
{
scoped_into si(xb);
xb.find_child_elem("child");
{
scoped_into si(xb);
xb.find_child_elem("subchild");
}
xb.find_child_elem("child2");
}
As you see, you can intuitively traverse your XML tree using the cursor.
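The scoped_into helper follows the usual C++ scope-guard idiom. The sketch below shows that idiom on a toy cursor that only tracks its depth. Both toy_cursor and toy_scoped_into are stand-ins invented for this illustration; they are not the XmlBind classes, whose internals may differ.

```cpp
// Toy stand-in for the xml_bind cursor (an assumption for illustration only):
// it just counts how deep the cursor currently is in the tree.
class toy_cursor
{
public:
    toy_cursor() : m_depth(0) {}
    void into_elem() { ++m_depth; }
    void out_of_elem() { if (m_depth > 0) --m_depth; }
    int depth() const { return m_depth; }
private:
    int m_depth;
};

// Sketch of a scope guard in the spirit of scoped_into: go into the element on
// construction, come back out on destruction, even if the block exits early
// or throws.
class toy_scoped_into
{
public:
    explicit toy_scoped_into(toy_cursor& cursor) : m_cursor(cursor)
    {
        m_cursor.into_elem();
    }
    ~toy_scoped_into() { m_cursor.out_of_elem(); }
private:
    toy_scoped_into(const toy_scoped_into&);            // non-copyable
    toy_scoped_into& operator=(const toy_scoped_into&); // non-copyable
    toy_cursor& m_cursor;
};
```

The benefit over calling into_elem() and out_of_elem() by hand is that every early return inside the braces still leaves the cursor at the right level.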
Iterating elements
You can iterate over the elements or child elements using a while(next) idiom similar to .NET enumerators:
xb.reset_child();
while( xb.next_child())
{
xml_node n( xb.get_current_child() );
}
You can do the same for elements, and the iteration can be restricted to nodes with a given name:
while( xb.next_child("data"))
...
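To illustrate the while(next) idiom on its own, here is a small sketch of an enumerator over a list of child names, with an overload that only stops on a given name. The class toy_children and its members are assumptions made for this example. They imitate the calling pattern shown above, not the actual xml_bind implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for the child cursor (an assumption, not xml_bind itself).
class toy_children
{
public:
    explicit toy_children(const std::vector<std::string>& names)
        : m_names(names), m_started(false), m_index(0) {}

    // Puts the cursor back before the first child.
    void reset_child() { m_started = false; m_index = 0; }

    // Moves to the next child; returns false when there is none left.
    bool next_child()
    {
        if (!m_started)
            m_started = true;
        else if (m_index < m_names.size())
            ++m_index;
        return m_index < m_names.size();
    }

    // Moves to the next child called name; returns false when there is none left.
    bool next_child(const std::string& name)
    {
        while (next_child())
            if (m_names[m_index] == name)
                return true;
        return false;
    }

    // Only meaningful after next_child() returned true.
    const std::string& current_child() const { return m_names[m_index]; }

private:
    std::vector<std::string> m_names;
    bool m_started;
    std::size_t m_index;
};
```

Starting "before" the first child is what lets a plain while loop visit every child, including the first one, without a special case.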
Skipping nodes
When traversing the tree, you might want to skip some node types, such as comments or DTD declarations. To do so, specify which node types should be skipped:
xb.skip_pi();
xb.skip_dtd();
xb.skip_comment(false);
xb.skip( node_cdata, true );
By default, no types are skipped.
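One simple way to keep track of such per-type settings is a bit mask with one bit per node type. The sketch below is an assumption about how this could be done, not a description of XmlBind's internals; the enum toy_node_type and the class toy_skip_filter are invented for the example.

```cpp
// Node kinds for this sketch. The enum is an assumption made for the example,
// not PugXML's definition.
enum toy_node_type
{
    toy_element = 0,
    toy_pcdata,
    toy_cdata,
    toy_comment,
    toy_pi,
    toy_dtd
};

class toy_skip_filter
{
public:
    // By default, nothing is skipped.
    toy_skip_filter() : m_mask(0) {}

    // Turns skipping on or off for one node type.
    void skip(toy_node_type type, bool enabled = true)
    {
        const unsigned bit = 1u << type;
        if (enabled)
            m_mask |= bit;
        else
            m_mask &= ~bit;
    }

    // True if nodes of this type should be ignored during traversal.
    bool is_skipped(toy_node_type type) const
    {
        return (m_mask & (1u << type)) != 0;
    }

private:
    unsigned m_mask;
};
```

A traversal loop would then call is_skipped() on each node it meets and move on to the next one when the answer is true.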
Saving, restoring the cursor state
You can save and restore the cursor state:
xml_bind::state st( xb.get_state() );
xb.set_state(st);
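A typical use for saving the state is to try a lookup that moves the cursor and to put the cursor back if the lookup fails. The sketch below shows that pattern on a toy cursor. The class toy_state_cursor, its state struct and the function try_lookup are hypothetical names for this illustration only; the real xml_bind::state may hold different data.

```cpp
// Toy cursor whose whole position fits in a small copyable struct
// (an assumption for illustration, not the real xml_bind::state).
class toy_state_cursor
{
public:
    struct state
    {
        int node;
        int child;
    };

    toy_state_cursor()
    {
        m_state.node = 0;
        m_state.child = -1;
    }
    state get_state() const { return m_state; }
    void set_state(const state& s) { m_state = s; }
    void move_to(int node, int child)
    {
        m_state.node = node;
        m_state.child = child;
    }
private:
    state m_state;
};

// Runs a lookup that may move the cursor; if it fails, the cursor is restored
// to where it was before the call.
template <typename Lookup>
bool try_lookup(toy_state_cursor& cursor, Lookup lookup)
{
    const toy_state_cursor::state saved = cursor.get_state();
    if (lookup(cursor))
        return true;
    cursor.set_state(saved);
    return false;
}
```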
Appending data
Methods
Adding data at the current cursor position is done using the append_child_elem methods. These methods apply to a single value or to a range of iterators:
- bool append_child_elem(LPCTSTR name_)
  Adds an empty node named name_. Note that all methods accept LPCTSTR or STL string.
- template<typename T>
  bool append_child_elem(
      LPCTSTR name_,
      T const& value_,
      bool as_cdata_ = false)
  Adds a node named name_ with value value_. If as_cdata_ is true, the value is added in a CDATA node. Note that T must support << with ostream.
- template<typename T, typename Policy>
  bool append_child_elem_p(
      LPCTSTR name_,
      T const& value_,
      Policy const& policy_,
      bool as_cdata_ = false)
  Adds a node named name_ whose value is value_ converted to a string by policy_ (policies are explained below).
- template<typename ContainerIterator>
  bool append_child_elem(
      LPCTSTR name_,
      ContainerIterator begin_,
      ContainerIterator end_,
      bool as_cdata_ = false)
  Adds a node named name_ and uses the range described by begin_ and end_, converted to a string, as the data.
- template<typename ContainerIterator, typename Policy>
  bool append_child_elem_p(
      LPCTSTR name_,
      ContainerIterator begin_,
      ContainerIterator end_,
      Policy const& policy_,
      bool as_cdata_ = false)
  Adds a node named name_ and uses the range described by begin_ and end_, converted to a string using policy_, as the data.
A few remarks on these methods:
- CDATA: For each method, you can choose whether to add the data in a CDATA section using the as_cdata_ parameter (the default is not to). If the data is not added in a CDATA section, it is escaped (< to &lt;, etc.); otherwise it is added unchanged.
- Policies: The conversion of the data to a string is done using policies. You can provide your own policy to suit your needs. See IoBind ([3]) for more details.
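The difference between the escaped and the CDATA form can be sketched with two small functions. These are hypothetical helpers written for this article's illustration (escape_xml and element_text are not XmlBind or IoBind functions), and the exact set of characters that the library escapes is an assumption here.

```cpp
#include <string>

// Hypothetical helper: replaces the XML reserved characters with entities.
inline std::string escape_xml(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i)
    {
        switch (text[i])
        {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += text[i];  break;
        }
    }
    return out;
}

// Hypothetical helper: mimics the as_cdata_ switch described above.
// Note: this sketch does not handle a value that itself contains "]]>",
// which cannot appear inside a single CDATA section.
inline std::string element_text(const std::string& value, bool as_cdata)
{
    if (as_cdata)
        return "<![CDATA[" + value + "]]>";
    return escape_xml(value);
}
```

CDATA keeps the text readable in the file when it contains many reserved characters, while escaping works for any value.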
Examples
- Add a string
  xml.add_child_elem("name","string");
- Add a double
  double d;
  xml.add_child_elem("name",d);
- Add a string in base64
  xml.add_child_elem_p("name","string", to_base64_p );
- Add a string as CDATA
  xml.add_child_elem("name","string", true);
- Add a container of ints
  vector<int> v;
  xml.add_child_elem_p("name",v.begin(),v.end(), sequence_to_string_p);
- Add a map
  map<int, string> m;
  xml.add_child_elem_p("name",m.begin(),m.end(),
      sequence_to_string_p << pair_to_string_p);
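For readers who have not met base64 before, here is a minimal sketch of the encoding that a base64 policy would apply before the text is written to the node. The function to_base64 is a hypothetical standalone helper using the standard alphabet with '=' padding; it is not IoBind's to_base64_p, whose details (line breaks, alphabet) I have not reproduced here.

```cpp
#include <string>

// Hypothetical helper (not IoBind's to_base64_p): standard base64 alphabet
// with '=' padding and no line breaks.
inline std::string to_base64(const std::string& input)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    std::string::size_type i = 0;

    // Each full group of 3 bytes becomes 4 characters.
    for (; i + 2 < input.size(); i += 3)
    {
        const unsigned n = (static_cast<unsigned char>(input[i]) << 16)
                         | (static_cast<unsigned char>(input[i + 1]) << 8)
                         |  static_cast<unsigned char>(input[i + 2]);
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += alphabet[(n >> 6) & 63];
        out += alphabet[n & 63];
    }

    // 1 or 2 leftover bytes are encoded and padded with '='.
    const std::string::size_type rest = input.size() - i;
    if (rest == 1)
    {
        const unsigned n = static_cast<unsigned char>(input[i]) << 16;
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += "==";
    }
    else if (rest == 2)
    {
        const unsigned n = (static_cast<unsigned char>(input[i]) << 16)
                         | (static_cast<unsigned char>(input[i + 1]) << 8);
        out += alphabet[(n >> 18) & 63];
        out += alphabet[(n >> 12) & 63];
        out += alphabet[(n >> 6) & 63];
        out += "=";
    }
    return out;
}
```

Since the output only uses letters, digits, '+', '/' and '=', it never needs XML escaping, which is what makes base64 convenient for binary data in XML.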
Reading back data
Methods
Data can be retrieved using the get_child_data methods, which read and convert the data from the current node:
- template<typename T>
  bool get_child_data(T& value_)
  Reads the data of the current child node, transforms it to T and stores it into value_.
- template<typename T, typename Policy>
  bool get_child_data_p(T& value_, Policy const& policy_)
  Reads the data of the current child node, transforms it to T using policy_ and stores it into value_.
Note that all these methods return true if successful, false otherwise. If you are looking for a specific node, you can use the find_get_child_data methods:
- template<typename T>
  bool find_get_child_data(LPCTSTR name_,T& value_)
  Reads the data of the first node named name_, transforms it to T and stores it into value_.
- template<typename T, typename Policy>
  bool find_get_child_data_p(LPCTSTR name_, T& value_,
      Policy const& policy_)
  Reads the data of the first node named name_, transforms it to T using policy_ and stores it into value_.
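The "return true if successful" contract of these methods can be illustrated with a small stream-based conversion. The function value_from_string below is a hypothetical helper for this illustration, built on operator>>; it is not how IoBind implements its default conversion, and it is only suitable for single-token types such as numbers (a std::string value containing spaces would need to be copied directly instead).

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: parses text into value using operator>>.
// Returns false (and leaves value untouched) if text is not exactly one T.
template <typename T>
bool value_from_string(const std::string& text, T& value)
{
    std::istringstream in(text);
    T parsed = T();
    if (!(in >> parsed))
        return false;

    // Reject trailing garbage such as "12abc"; trailing whitespace is accepted.
    in >> std::ws;
    if (!in.eof())
        return false;

    value = parsed;
    return true;
}
```

Leaving value_ untouched on failure means the caller can pre-load a default and simply ignore a false return when the node is optional.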
Examples
- Get value as string
  string s;
  xml.get_child_data(s);
- Get value as float
  float f;
  xml.get_child_data(f);
- Get value as pair<int,float>
  pair<int,float> p;
  xml.get_child_data_p(
      p,
      pair_from_string_p
          << from<int>()
          >> from<float>()
      );
- Get value as vector<float>
  vector<float> v;
  xml.get_child_data_p(
      v,
      sequence_from_string_p
          << from<float>()
      );
- Find an element and retrieve its value
  float f;
  xml.find_get_child_data("name",f);
Compiling XmlBind
You will need Boost and IoBind to use XmlBind. You need to compile IoBind. The declaration of xml_bind is in iobind/xml/xml_bind.hpp. Note that it is in the iobind::xml namespace.
History
- 2-10-2003, big update:
  - integrated PugXML in IoBind, split the .h into .cpp and .h,
  - added skipping,
  - added STL string methods,
  - fixed tree traversal problems.
- 05-07-2003, initial release.
Reference
- [1] PugXML - A Small, Pugnacious XML Parser, by Kristen Wegner
- [2] XML class for processing and building simple XML documents, by Ben Bryant
- [3] IoBind, a serializer code factory, by Jonathan de Halleux
- [4] CMarkupArchive, an extension to CMarkup, by Jonathan de Halleux
Jonathan de Halleux is a Civil Engineer in Applied Mathematics. He finished his PhD in 2004 in the rainy country of Belgium. After 2 years in the Common Language Runtime (i.e. .NET), he is now working at Microsoft Research on Pex (http://research.microsoft.com/pex).