
How to create a simple XML file using MSXML in C++

Javed Akhtar Ansari, 26 Oct 2009
This article demonstrates the use of MSXML APIs using C++.

Introduction

This article demonstrates the use of MSXML APIs using C++ and creates a simple XML file.

Background

I wanted to write an XML file for my project using MSXML in C++. After Googling, I found some helpful links spread across various articles, which gave me a starting point. I decided to write an article that helps a beginner use MSXML and covers the basics of XML, as there is not much available on this topic for C++.

XML Basics

XML is a widely adopted markup format for structured data, standardized by the World Wide Web Consortium (W3C). It gives easy access to data in a structured way. More about XML can be found on Wikipedia.

MSXML APIs

The Microsoft XML (MSXML) APIs are independent of the development environment. Many examples of using MSXML from C# and Visual Basic can be found on the web, including on MSDN, but little is available for C++ developers.

I have created a demo project which will create an XML file having the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<Parent Depth="0">
    <Child1 Depth="1">This is a child of Parent</Child1>
    <Child2 Depth="1">
        <Child3 Depth="2">
            <Child4 Depth="3">This is a child of Child3</Child4>
        </Child3>
    </Child2>
</Parent>

In MSXML, we create a root (parent) element and then insert child elements under it. A node can be of several types, for example an element node or an attribute node; the full list is documented on MSDN. You can create a node by passing the appropriate type flag, such as NODE_ELEMENT, to the createNode() function. Another way is to use the createElement() function.
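To make the root, child, and attribute relationship concrete without depending on MSXML, here is a minimal sketch in standard C++. The Element struct and the serialize() and buildDemoTree() functions are hypothetical names introduced only for this illustration; they are not MSXML types or calls, and the sketch does not escape special characters such as < or &.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a DOM element node (not an MSXML type).
struct Element {
    std::string name;
    std::vector<std::pair<std::string, std::string>> attributes; // attribute nodes
    std::string text;                                             // text content
    std::vector<Element> children;                                // child element nodes
};

// Serialize an element and its subtree on a single line, without indentation.
// Special characters are not escaped in this sketch.
std::string serialize(const Element& e) {
    std::string out = "<" + e.name;
    for (const auto& attr : e.attributes)
        out += " " + attr.first + "=\"" + attr.second + "\"";
    out += ">" + e.text;
    for (const auto& child : e.children)
        out += serialize(child);
    out += "</" + e.name + ">";
    return out;
}

// Build the demo structure top-down: create the root, then append children.
Element buildDemoTree() {
    Element parent{"Parent", {{"Depth", "0"}}, "", {}};
    parent.children.push_back(
        Element{"Child1", {{"Depth", "1"}}, "This is a child of Parent", {}});
    parent.children.push_back(Element{"Child2", {{"Depth", "1"}}, "", {}});

    // Child2 is the last child of Parent; nothing more is appended to Parent,
    // so this reference stays valid.
    Element& child2 = parent.children.back();
    child2.children.push_back(Element{"Child3", {{"Depth", "2"}}, "", {}});

    Element& child3 = child2.children.back();
    child3.children.push_back(
        Element{"Child4", {{"Depth", "3"}}, "This is a child of Child3", {}});
    return parent;
}
```

Calling serialize(buildDemoTree()) yields the whole document on one line, which is roughly what an unformatted save produces before any indentation is applied.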

With the background covered, let us get to the real work. The demo is a dialog-based application created with Microsoft Visual Studio 2005.

Now, let us see the code and understand what is happening.

Using the Code

The first thing you need to do is add these two lines in the project stdafx.h file:

#import "MSXML4.dll" rename_namespace("MSXML")
#include <msxml2.h>

I am using msxml4.dll. The newer msxml6.dll is also available, and the same code was tested on MSXML6 as well.

The rename_namespace attribute renames the generated namespace to MSXML; otherwise, by default, the namespace would be MSXML2. Note that I have not used the raw_interfaces_only attribute, because I want the compiler to generate the C++ smart pointer wrapper interfaces.

Now you need to call the AfxOleInit() function. The best place is the InitInstance() of the application class.

if (!AfxOleInit())
{
    AfxMessageBox(_T("Failed to initialize OLE library"));
    return FALSE;
}

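AfxOleInit() initializes the OLE libraries, which also initializes COM for the calling thread, and MFC performs the matching cleanup when the application exits. This takes the place of the explicit ::CoInitialize() and ::CoUninitialize() calls that are otherwise necessary before any COM object can be used.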

Now have a look at the code inside OnBnClickedCreatexml(). Most of the code is self-explanatory, but I will provide details wherever necessary.

void CXMLDemoDlg::OnBnClickedCreatexml()
{
    //Create the XML
    MSXML::IXMLDOMDocument2Ptr pXMLDoc;    
    HRESULT hr = pXMLDoc.CreateInstance(__uuidof(DOMDocument40));
    if(FAILED(hr))
    {
        AfxMessageBox(_T("Failed to create the XML class instance"));
        return;
    }
    if(pXMLDoc->loadXML(_T("<Parent></Parent>")) == VARIANT_FALSE)
    {
        ShowError(pXMLDoc);
        return;
    }

    //Get the root element just created    
    MSXML::IXMLDOMElementPtr pXMLRootElem = pXMLDoc->GetdocumentElement();
    
    //Add an attribute
    pXMLRootElem->setAttribute(_T("Depth"),_variant_t(_T("0")));

    MSXML::IXMLDOMProcessingInstructionPtr pXMLProcessingNode =    
      pXMLDoc->createProcessingInstruction("xml", " version='1.0' encoding='UTF-8'");

    _variant_t vtObject;
    vtObject.vt = VT_DISPATCH;
    vtObject.pdispVal = pXMLRootElem;
    vtObject.pdispVal->AddRef();

    pXMLDoc->insertBefore(pXMLProcessingNode,vtObject);

    //Create the child elements and set the attributes    
    MSXML::IXMLDOMElementPtr pXMLChild1 = 
      pXMLDoc->createElement(_T("Child1")); //Create first child element
    pXMLChild1->setAttribute(_T("Depth"),_T("1"));
    pXMLChild1->Puttext(_T("This is a child of Parent"));    //Set the element value
    pXMLChild1 = pXMLRootElem->appendChild(pXMLChild1);

    MSXML::IXMLDOMElementPtr pXMLChild2 = pXMLDoc->createElement(_T("Child2"));
    pXMLChild2->setAttribute(_T("Depth"), _T("1"));
    pXMLChild2 = pXMLRootElem->appendChild(pXMLChild2);    //Child2 is a sibling of Child1

    MSXML::IXMLDOMElementPtr pXMLChild3 = pXMLDoc->createElement(_T("Child3"));
    pXMLChild3->setAttribute(_T("Depth"), _T("2"));
    pXMLChild3 = pXMLChild2->appendChild(pXMLChild3);    //Child3 is a direct child of Child2

    
    MSXML::IXMLDOMElementPtr pXMLChild4 = pXMLDoc->createElement(_T("Child4"));
    pXMLChild4->setAttribute(_T("Depth"), _T("3"));
    pXMLChild4->Puttext(_T("This is a child of Child3"));
    pXMLChild4 = pXMLChild3->appendChild(pXMLChild4);    //Child4 is a direct child of Child3

    
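    // Format the XML. This requires a style sheet
    MSXML::IXMLDOMDocument2Ptr loadXML;
    hr = loadXML.CreateInstance(__uuidof(DOMDocument40));
    if(FAILED(hr))
    {
        //loadXML is not a valid object here, so ShowError() cannot query it
        AfxMessageBox(_T("Failed to create the style sheet document"));
        return;
    }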
    
    //We need to load the style sheet which will be used to indent the XML properly.
    if(loadXML->load(_variant_t(_T("StyleSheet.xsl"))) == VARIANT_FALSE)
    {
        ShowError(loadXML);
        return;
    }

    //Create the final document which will be indented properly
    MSXML::IXMLDOMDocument2Ptr pXMLFormattedDoc;
    hr = pXMLFormattedDoc.CreateInstance(__uuidof(DOMDocument40));
    if(FAILED(hr))
    {
        AfxMessageBox(_T("Failed to create the formatted XML document"));
        return;
    }

    CComPtr<IDispatch> pDispatch;
    hr = pXMLFormattedDoc->QueryInterface(IID_IDispatch, (void**)&pDispatch);
    if(FAILED(hr))
    {
        return;
    }

    _variant_t    vtOutObject;
    vtOutObject.vt = VT_DISPATCH;
    vtOutObject.pdispVal = pDispatch;
    vtOutObject.pdispVal->AddRef();

    //Apply the transformation to format the final document    
    hr = pXMLDoc->transformNodeToObject(loadXML,vtOutObject);

    //By default it is writing the encoding = UTF-16. Let us change the encoding to UTF-8

    // <?xml version="1.0" encoding="UTF-8"?>
    MSXML::IXMLDOMNodePtr pXMLFirstChild = pXMLFormattedDoc->GetfirstChild();
    // A map of the attribute names (version, encoding) to their values (1.0, UTF-8)
    MSXML::IXMLDOMNamedNodeMapPtr pXMLAttributeMap =  pXMLFirstChild->Getattributes();
    MSXML::IXMLDOMNodePtr pXMLEncodNode = pXMLAttributeMap->getNamedItem(_T("encoding"));    
    pXMLEncodNode->PutnodeValue(_T("UTF-8"));    //encoding = UTF-8

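    UpdateData();    //Get the location
    if(sLocation.IsEmpty())    //User forgot to set the location?
        sLocation = _T("Javed.xml");
    //Pass the path as a _variant_t; AllocSysString() would leak the BSTR it allocates
    hr = pXMLFormattedDoc->save(_variant_t(static_cast<LPCTSTR>(sLocation)));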
    if(FAILED(hr))
    {
        ShowError(pXMLFormattedDoc);
        return;
    }
    sLocation += _T(" created");
    AfxMessageBox(sLocation);
}

There are various steps to create a complete XML document.

MSXML::IXMLDOMDocument2Ptr pXMLDoc;    
HRESULT hr = pXMLDoc.CreateInstance(__uuidof(DOMDocument40));
if(FAILED(hr))
{
    AfxMessageBox(_T("Failed to create the XML class instance"));
    return;
}

The above code instantiates the MSXML document object. Note that we have not called CoInitialize(NULL) here to initialize the COM libraries, because that is already taken care of by the AfxOleInit() call made earlier.

if(pXMLDoc->loadXML(_T("<Parent></Parent>")) == VARIANT_FALSE)
{
    ShowError(pXMLDoc);
    return;
}

The point worth mentioning here is that you have to create a starting node. The root element (Parent in this case) is the natural place to start, and the above code does exactly that by loading it with loadXML().

//Get the root element just created    
MSXML::IXMLDOMElementPtr pXMLRootElem = pXMLDoc->GetdocumentElement();

//Add an attribute
pXMLRootElem->setAttribute(_T("Depth"),_variant_t(_T("0")));

MSXML::IXMLDOMProcessingInstructionPtr pXMLProcessingNode = 
  pXMLDoc->createProcessingInstruction("xml", " version='1.0' encoding='UTF-8'");

_variant_t vtObject;
vtObject.vt = VT_DISPATCH;
vtObject.pdispVal = pXMLRootElem;
vtObject.pdispVal->AddRef();

pXMLDoc->insertBefore(pXMLProcessingNode,vtObject);

Now we want to insert <?xml version="1.0" encoding="UTF-8"?> at the very start of the XML file, and the above code does exactly that. IXMLDOMProcessingInstruction is the interface that represents a processing instruction; the XML declaration, with its version number and encoding details, is created through it. Here, we have created the processing instruction node and inserted it just before the root element.

//Create the child elements and set the attributes

//Create first child element
MSXML::IXMLDOMElementPtr pXMLChild1 = pXMLDoc->createElement(_T("Child1"));
pXMLChild1->setAttribute(_T("Depth"),_T("1"));
pXMLChild1->Puttext(_T("This is a child of Parent"));    //Set the element value
pXMLChild1 = pXMLRootElem->appendChild(pXMLChild1);

MSXML::IXMLDOMElementPtr pXMLChild2 = pXMLDoc->createElement(_T("Child2"));
pXMLChild2->setAttribute(_T("Depth"), _T("1"));
pXMLChild2 = pXMLRootElem->appendChild(pXMLChild2);
//Child2 is a sibling of Child1

MSXML::IXMLDOMElementPtr pXMLChild3 = pXMLDoc->createElement(_T("Child3"));
pXMLChild3->setAttribute(_T("Depth"), _T("2"));
pXMLChild3 = pXMLChild2->appendChild(pXMLChild3);
//Child3 is a direct child of Child2


MSXML::IXMLDOMElementPtr pXMLChild4 = pXMLDoc->createElement(_T("Child4"));
pXMLChild4->setAttribute(_T("Depth"), _T("3"));
pXMLChild4->Puttext(_T("This is a child of Child3"));
pXMLChild4 = pXMLChild3->appendChild(pXMLChild4);
//Child4 is a direct child of Child3

Now it's time to build the complete XML, including all the child elements. Take care to append each child to the correct parent, as shown above.

At this point, your XML content is done. You may want to save this into a file, so just call save().

pXMLDoc->save(_variant_t(static_cast<LPCTSTR>(sLocation)));

But wait: if you save this and then open it in Notepad or any other plain text editor, it will look like this:

<?xml version="1.0" encoding="UTF-8"?>
<Parent Depth="0"><Child1 Depth="1">This is a child of Parent</Child1>
 <Child2 Depth="1"><Child3 Depth="2">
 <Child4 Depth="3">This is a child of Child3</Child4></Child3></Child2></Parent>

Line breaks were added to the above snippet to prevent scrolling.

Yes, it is not indented as it should be: all the elements end up on a single line. Obviously, you will not like this, so let us format it properly. For this, you need to apply a transformation with a template style sheet. A style sheet is itself an XML document, but it contains XSLT instructions that operate on your XML to produce the desired output. I am using a style sheet named StyleSheet.xsl, which is kept in the project directory.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
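The style sheet above copies every node unchanged and leaves the indentation to the serializer (indent="yes"). As a rough, library-independent illustration of what that indentation step amounts to, here is a sketch in standard C++. The Element struct and serializeIndented() are hypothetical names used only for this example, not MSXML or XSLT APIs, and the sketch assumes each element holds either text or child elements (as in the demo document), with four spaces per level.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a DOM element node (not an MSXML type).
struct Element {
    std::string name;
    std::vector<std::pair<std::string, std::string>> attributes;
    std::string text;
    std::vector<Element> children;
};

// Write one element per line, indenting children by four spaces per level.
// Assumes an element has either text or children, never both.
std::string serializeIndented(const Element& e, int depth = 0) {
    const std::string pad(depth * 4, ' ');
    std::string out = pad + "<" + e.name;
    for (const auto& attr : e.attributes)
        out += " " + attr.first + "=\"" + attr.second + "\"";
    out += ">";
    if (e.children.empty()) {
        // Leaf element: keep the text and the closing tag on the same line.
        out += e.text + "</" + e.name + ">\n";
        return out;
    }
    out += "\n";
    for (const auto& child : e.children)
        out += serializeIndented(child, depth + 1);
    out += pad + "</" + e.name + ">\n";
    return out;
}
```

The real formatting in this article is still done by MSXML through the transformation; the sketch only shows the nesting rule that produces the indented layout.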

The code:

//We need to load the style sheet which will be used to indent the XML properly.
if(loadXML->load(_variant_t(_T("StyleSheet.xsl"))) == VARIANT_FALSE)
{
    ShowError(loadXML);
    return;
}

//Create the final document which will be indented properly
MSXML::IXMLDOMDocument2Ptr pXMLFormattedDoc;
hr = pXMLFormattedDoc.CreateInstance(__uuidof(DOMDocument40));

CComPtr<IDispatch> pDispatch;
hr = pXMLFormattedDoc->QueryInterface(IID_IDispatch, (void**)&pDispatch);
if(FAILED(hr))
{
    return;
}

_variant_t    vtOutObject;
vtOutObject.vt = VT_DISPATCH;
vtOutObject.pdispVal = pDispatch;
vtOutObject.pdispVal->AddRef();

//Apply the transformation to format the final document    
hr = pXMLDoc->transformNodeToObject(loadXML,vtOutObject);

Here we are loading the style sheet with the function load() and then doing a transformation with the original XML to get a formatted new XML document object.

//By default it is writing the encoding = UTF-16. Let us change the encoding to UTF-8

// <?xml version="1.0" encoding="UTF-8"?>
MSXML::IXMLDOMNodePtr pXMLFirstChild = pXMLFormattedDoc->GetfirstChild();
// A map of the attribute names (version, encoding) to their values (1.0, UTF-8)
MSXML::IXMLDOMNamedNodeMapPtr pXMLAttributeMap =  pXMLFirstChild->Getattributes();
MSXML::IXMLDOMNodePtr pXMLEncodNode = pXMLAttributeMap->getNamedItem(_T("encoding"));    
pXMLEncodNode->PutnodeValue(_T("UTF-8")); //encoding = UTF-8

Although I specified UTF-8 encoding when creating the XML, the formatted document produced by the transformation declares UTF-16. So I have changed the encoding back to UTF-8; it is just a matter of replacing one attribute value. This also shows how to read and modify an attribute through a named node map.
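Seen as plain text, the change amounts to swapping the value of the encoding pseudo-attribute in the XML declaration. The following sketch shows that replacement on a string in standard C++. The function name withDeclaredEncoding() is hypothetical and not part of MSXML. On a plain string this only relabels the declaration and does not convert any bytes.

```cpp
#include <cstddef>
#include <string>

// Return a copy of an XML declaration with the value of its encoding
// pseudo-attribute replaced. The input is returned unchanged if no quoted
// encoding value is found.
std::string withDeclaredEncoding(const std::string& declaration,
                                 const std::string& encoding) {
    const std::string key = "encoding=";
    const std::size_t keyPos = declaration.find(key);
    if (keyPos == std::string::npos)
        return declaration;

    const std::size_t quotePos = keyPos + key.size();
    if (quotePos >= declaration.size())
        return declaration;

    // The value may be delimited by double or single quotes.
    const char quote = declaration[quotePos];
    if (quote != '"' && quote != '\'')
        return declaration;

    const std::size_t endPos = declaration.find(quote, quotePos + 1);
    if (endPos == std::string::npos)
        return declaration;

    return declaration.substr(0, quotePos + 1) + encoding +
           declaration.substr(endPos);
}
```

For example, passing the UTF-16 declaration and "UTF-8" returns the declaration shown at the top of the target file.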

hr = pXMLFormattedDoc->save(_variant_t(static_cast<LPCTSTR>(sLocation)));

Finally, just save the XML to a file, and you are done.

Points of Interest

One related point: if you create a large XML file, say 10 KB in size, using UTF-16 when it contains only ASCII characters, it is a waste of space. The same data written as UTF-8 will be approximately 5 KB. This is because UTF-8 uses only one byte for each ASCII character, whereas UTF-16 uses two bytes. More about UTF is available here.
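The size difference can be checked with a small sketch in standard C++ that counts how many bytes a sequence of Unicode code points needs in each encoding. The function names utf8Bytes() and utf16Bytes() are hypothetical helpers written for this illustration; a byte order mark, if any, is not counted.

```cpp
#include <cstddef>
#include <string>

// Number of bytes needed to encode the given code points as UTF-8.
std::size_t utf8Bytes(const std::u32string& text) {
    std::size_t n = 0;
    for (char32_t c : text)
        n += c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
    return n;
}

// Number of bytes needed to encode the given code points as UTF-16
// (two bytes per code unit, four for characters outside the BMP).
std::size_t utf16Bytes(const std::u32string& text) {
    std::size_t n = 0;
    for (char32_t c : text)
        n += c < 0x10000 ? 2 : 4;
    return n;
}
```

For ASCII-only markup such as U"<Parent/>", utf8Bytes() returns 9 and utf16Bytes() returns 18. The advantage narrows for non-ASCII text: characters from U+0800 to U+FFFF take three bytes in UTF-8 but only two in UTF-16.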

History

  • 27 October 2009, First created.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Javed Akhtar Ansari
Software Developer (Senior)
India
Javed is a software developer (Lead). He has been working on desktop software using C++/C# since 2005.

Last Updated 27 Oct 2009
Article Copyright 2009 by Javed Akhtar Ansari