Using MSXML to Read XML Documents

7 Jun 2003, Public Domain
How to read XML documents using MSXML, in a modern C++/template manner

Introduction

Everyone needs to parse XML nowadays. I found it hard to find good example source code in C++ -- most of the code I found was either written in an old-fashioned style without templates, or was aimed at C# or Visual Basic. Hence, this article provides an example.

Parsing is done using MSXML, and I use ATL "smart pointers" to avoid the need to manually release everything. Note that MSXML is Unicode, through and through. It's a big waste of effort trying to use it with multi-byte/ASCII.

The accompanying source code has project files for embedded Visual C++ (.vcw .vcp), Visual C++ .NET (.sln .vcproj) and Borland C++Builder5 (.bpr .bpf). There is no project file for Visual C++ 6, since that didn't ship with recent-enough MSXML headers.

PocketPC considerations: I use XML to store my configuration files. They have grown to about 80k each, and on the PocketPC, each one takes 2 seconds to parse. Therefore, I actually parse each file into a more efficient memory-block structure, and write this memory block to disk. That way, I only need to re-parse a file if it has changed.
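
The caching idea above can be sketched in portable C++. This is a minimal sketch under stated assumptions, not the code used in the accompanying project: the `LayoutEntry` record and the `saveCache`/`loadCache` helpers are hypothetical names, and it assumes the parsed data can be flattened into fixed-size, trivially copyable records. Deciding when the cache is stale (for example by comparing file timestamps) is left out.

```cpp
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical flattened record, one per parsed element.
// Fixed-size and trivially copyable, so it can be written as raw bytes.
struct LayoutEntry {
    std::int32_t pos;
    std::int32_t bold;   // 0 or 1
};

// Write the record count followed by the raw records.
bool saveCache(const std::string& path, const std::vector<LayoutEntry>& entries) {
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return false;
    std::uint32_t count = static_cast<std::uint32_t>(entries.size());
    out.write(reinterpret_cast<const char*>(&count), sizeof(count));
    if (count > 0) {
        out.write(reinterpret_cast<const char*>(&entries[0]),
                  static_cast<std::streamsize>(count * sizeof(LayoutEntry)));
    }
    out.flush();
    return out.good();
}

// Read the block back. Returns false if the file is missing, truncated,
// or claims an implausible number of records.
bool loadCache(const std::string& path, std::vector<LayoutEntry>& entries) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;
    std::uint32_t count = 0;
    in.read(reinterpret_cast<char*>(&count), sizeof(count));
    if (!in || count > 1000000u) return false;   // arbitrary sanity limit
    std::vector<LayoutEntry> tmp(count);
    if (count > 0) {
        in.read(reinterpret_cast<char*>(&tmp[0]),
                static_cast<std::streamsize>(count * sizeof(LayoutEntry)));
        if (!in) return false;
    }
    entries.swap(tmp);
    return true;
}
```

A caller would try `loadCache` first and fall back to the XML parse followed by `saveCache` when it returns false. A raw dump like this is only readable by a program built with the same record layout and byte order, which is usually acceptable for a cache that can always be regenerated from the XML.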

Preliminaries

Setup depends on which development environment you're using:

  • Visual Studio .NET -- fine as it is
  • Borland C++ Builder -- under Project > Options > Directories, add $(BCB)\include\atl
  • eMbedded Visual C++ (EVC) -- download the free STL port made by Giuseppe Govi, and put it in a subdirectory "stl_eVC" of your project
C++
#include <windows.h>
#include <msxml.h>
#include <objsafe.h>
#include <objbase.h>
#include <atlbase.h>
#pragma warning( push )
#pragma warning( disable: 4018 4786)
#include <string>
#pragma warning( pop )
using namespace std;

(The warning-disabler is just for EVC, which generates spurious warnings otherwise.)

Also, call CoInitializeEx(NULL,COINIT_MULTITHREADED); before using MSXML (normally at the start of WinMain), and CoUninitialize(); when you have finished (normally at the end of WinMain).

Actually, CoInitialize(NULL) is easier when compiling for desktop Win32, since it works on Windows 95 and hence doesn't require you to define _WIN32_WINNT. But it's not available on PocketPC.

XML Parsing

This is how to load the XML document. It uses the magic of ATL's safe pointers, to avoid the need to Release() everything afterwards. (For simplicity, error-checking has been omitted.)

C++
CComPtr<IXMLDOMDocument> iXMLDoc;
iXMLDoc.CoCreateInstance(__uuidof(DOMDocument));
     
#ifdef UNDER_CE
// Following is a bugfix for PocketPC.
iXMLDoc->put_async(VARIANT_FALSE);
CComQIPtr<IObjectSafety,&IID_IObjectSafety> isafe(iXMLDoc);
if (isafe)
{ DWORD dwSupported, dwEnabled; 
  isafe->GetInterfaceSafetyOptions(IID_IXMLDOMDocument,
                                   &dwSupported,&dwEnabled);
  isafe->SetInterfaceSafetyOptions(IID_IXMLDOMDocument,
                                   dwSupported,0);
}
#endif

// Load the file. 
VARIANT_BOOL bSuccess=VARIANT_FALSE;
// Can load it from a url/filename...
iXMLDoc->load(CComVariant(url),&bSuccess);
// or from a BSTR...
//iXMLDoc->loadXML(CComBSTR(s),&bSuccess);

// Get a pointer to the root
CComPtr<IXMLDOMElement> iRootElm;
iXMLDoc->get_documentElement(&iRootElm);

// Thanks to the magic of CComPtr, we never need call
// Release() -- that gets done automatically.

As for accessing the elements and iterating over them, I wrote a tiny helper class TElem. Here's the example XML document that I'll demonstrate it with:

XML
<?xml version="1.0" encoding="utf-16"?>
<root desc="Simple Prog">
  <text>Hello World</text>
  <layouts>
    <lay pos="15" bold="true"/>
    <layoff pos="12"/>
    <layin pos="17"/>
  </layouts>
</root>

And this is how to use TElem:

C++
TElem eroot(iRootElm);
wstring desc = eroot.attr(L"desc");
// returns "Simple Prog"

TElem etext = eroot.subnode(L"text");
wstring s = etext.val();
// returns "Hello World"
s = eroot.subval(L"text");
// This is a shorter way to achieve the same thing

TElem elays = eroot.subnode(L"layouts");
for (TElem e=elays.begin(); e!=elays.end(); e++)
{ int pos = e.attrInt(L"pos",-1);
  bool bold = e.attrBool(L"bold",false);
  // we suggest defaults, in case the attribute is missing
  wstring id = e.name();
  // returns "lay" or "layoff" or "layin"
}

Again, there's no need to release TElem - that's done automatically. The full list of methods in TElem:

C++
// TElem -- a simple class to wrap up IXMLDOMElement
// and to iterate its children.

wstring TElem::name() const;
// in <item>stuff</item> it returns "item"

wstring TElem::val() const;
// in <item>stuff</item> it returns "stuff"

wstring TElem::attr(const wstring name) const;
// in <item name="hello">stuff</item> it returns "hello"
// Typed variants: int x=e.attrInt(L"a",2);
//                 bool b=e.attrBool(L"a",true);
// We supply defaults in case the attribute was absent.

TElem TElem::subnode(const wstring name) const;
// in <item><a>hello</a><name>there</name></item>
// it returns the TElem <name>there</name>

wstring TElem::subval(const wstring name) const;
// in <item><a>hello</a><name>there</name></item>
// it returns "there"

for (TElem c=e.begin(); c!=e.end(); c++) {...}
// iterates over the subnodes
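
The default-value behaviour of attrInt and attrBool listed above can be tried out without MSXML. The following is a minimal sketch: `parseIntOr` and `parseBoolOr` are hypothetical stand-alone helpers that apply the same rules to a plain wstring, assuming the attribute text has already been fetched (an absent attribute arrives as an empty string).

```cpp
#include <cwchar>
#include <string>

// Same rule as attrBool: only these four spellings are recognised,
// anything else (including an empty string) yields the default.
bool parseBoolOr(const std::wstring& text, bool def) {
    if (text == L"true" || text == L"TRUE") return true;
    if (text == L"false" || text == L"FALSE") return false;
    return def;
}

// Same rule as attrInt: swscanf with %i, falling back to the default
// when no number could be read.
int parseIntOr(const std::wstring& text, int def) {
    int value = 0;
    // %i accepts decimal, 0x-prefixed hexadecimal and 0-prefixed octal.
    return std::swscanf(text.c_str(), L"%i", &value) == 1 ? value : def;
}
```

Two consequences are worth knowing. Mixed-case spellings such as "True" are not recognised and fall back to the default. And because %i guesses the base from the prefix, an attribute written with a leading zero, such as "012", is read as octal.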

Source Code for TElem

Note in this source code the use of CComPtr and CComQIPtr and CComBSTR. These are lovely "safe-pointers" provided by the ATL, and mean that we needn't bother with Release().

I'm a bit of a miser, and so included iterator functionality in TElem, rather than writing a separate TElemIterator class.

C++
struct TElem
{ CComPtr<IXMLDOMElement> elem;
  CComPtr<IXMLDOMNodeList> nlist; int pos; long clen;

  TElem() :
        elem(0), nlist(0), pos(-1), clen(0) {}
  TElem(int _clen) :
        elem(0),nlist(0),pos(-1),clen(_clen) {}
  TElem(CComPtr<IXMLDOMElement> _elem) :
        elem(_elem), nlist(0), pos(-1), clen(0) {get();}
  TElem(CComPtr<IXMLDOMNodeList> _nlist) :
        elem(0), nlist(_nlist), pos(0), clen(0) {get();}

  void get()
  { if (pos!=-1)
    { elem=0;
      CComPtr<IXMLDOMNode> inode;
      nlist->get_item(pos,&inode);
      if (inode==0) return;
      DOMNodeType type; inode->get_nodeType(&type);
      if (type!=NODE_ELEMENT) return;
      CComQIPtr<IXMLDOMElement> e(inode);
      elem=e;
    }
    clen=0; if (elem!=0)
    { CComPtr<IXMLDOMNodeList> iNodeList;
      elem->get_childNodes(&iNodeList);
      iNodeList->get_length(&clen);  
    }
  }
  //
  wstring name() const
  { if (!elem) return L"";
    CComBSTR bn; elem->get_tagName(&bn);
    return wstring(bn);
  }
  wstring attr(const wstring name) const
  { if (!elem) return L"";
    CComBSTR bname(name.c_str());
    CComVariant val(VT_EMPTY);
    elem->getAttribute(bname,&val);
    if (val.vt==VT_BSTR) return val.bstrVal;
    return L"";
  }
  bool attrBool(const wstring name,bool def) const
  { wstring a = attr(name);
    if (a==L"true" || a==L"TRUE") return true;
    else if (a==L"false" || a==L"FALSE") return false;
    else return def;
  }
  int attrInt(const wstring name, int def) const
  { wstring a = attr(name);
    int i, res=swscanf(a.c_str(),L"%i",&i);
    if (res==1) return i; else return def;
  }
  wstring val() const
  { if (!elem) return L"";
    CComVariant val(VT_EMPTY);
    elem->get_nodeTypedValue(&val);
    if (val.vt==VT_BSTR) return val.bstrVal;
    return L"";
  }
  TElem subnode(const wstring name) const
  { if (!elem) return TElem();
    for (TElem c=begin(); c!=end(); c++)
    { if (c.name()==name) return c;
    }
    return TElem();
  }
  wstring subval(const wstring name) const
  { if (!elem) return L"";
    TElem c=subnode(name);
    return c.val();
  }
  TElem begin() const
  { if (!elem) return TElem();
    CComPtr<IXMLDOMNodeList> iNodeList;
    elem->get_childNodes(&iNodeList);
    return TElem(iNodeList);
  }
  TElem end() const
  { return TElem(clen);
  }
  TElem operator++(int)
  { if (pos!=-1) {pos++; get();}
    return *this;
  }
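  bool operator!=(const TElem &e) const
  { // Asymmetric on purpose: compares our position with the
    // child count stored in e, where e is built by end().
    return pos!=e.clen;
  }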
};
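
The way TElem doubles as its own iterator can be seen in isolation with a small model. This is a minimal sketch, not part of the accompanying source: `Cursor` and `collectNames` are hypothetical names, and a vector of names stands in for the IXMLDOMNodeList. One constructor builds a cursor at the first child, the other builds a sentinel that only remembers the child count, just as begin() and end() do above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for TElem's iterator role.
struct Cursor {
    const std::vector<std::wstring>* list; // plays the role of nlist
    long pos;                              // current index, -1 = not a cursor
    long clen;                             // only meaningful on the sentinel

    // Cursor positioned at the first child (like TElem::begin()).
    explicit Cursor(const std::vector<std::wstring>& l)
        : list(&l), pos(0), clen(0) {}
    // Sentinel that only stores the number of children (like TElem::end()).
    explicit Cursor(long count)
        : list(0), pos(-1), clen(count) {}

    std::wstring name() const {
        if (!list || pos < 0 || pos >= static_cast<long>(list->size())) return L"";
        return (*list)[pos];
    }
    Cursor& operator++() { if (pos != -1) ++pos; return *this; }
    // Left-hand position is compared with the right-hand stored length.
    bool operator!=(const Cursor& sentinel) const { return pos != sentinel.clen; }
};

// Walk from the first child up to the sentinel, collecting names.
std::vector<std::wstring> collectNames(const std::vector<std::wstring>& children) {
    std::vector<std::wstring> out;
    Cursor end(static_cast<long>(children.size()));
    for (Cursor c(children); c != end; ++c) out.push_back(c.name());
    return out;
}
```

The comparison only works with the cursor on the left and the sentinel on the right, so the loop has to be written as `c != end` rather than `end != c`. That restriction is the price of folding the iterator into the element class instead of writing a separate iterator type.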

License

This article, along with any associated source code and files, is licensed under a Public Domain dedication.


Written By
Technical Lead
United States
Lucian studied theoretical computer science in Cambridge and Bologna, and then moved into the computer industry. Since 2004 he's been paid to do what he loves -- designing and implementing programming languages! The articles he writes on CodeProject are entirely his own personal hobby work, and do not represent the position or guidance of the company he works for. (He's on the VB/C# language team at Microsoft).
