
C++ Wrappers for the Expat XML Parser

By Tim Smith, 17 Feb 2002
 

Introduction

The Expat XML Parser is a widely used event-based XML parser.  One of its nicer features is that it has an API that can be used from C programs.  Even though many programmers use Expat in a C++ environment, the C-based API makes it easy to export the API from a DLL.

However, Expat being a C based API doesn't mean we have to live without our C++ classes.  Luckily, Expat was designed with the ability to be augmented with classes.

(Definition: Event Based XML Parser - An XML parser which invokes methods (a.k.a. events) when XML constructs are parsed.  This differs from the DOM (Document Object Model) style parsers that parse the XML and then present the application with XML data in its logical hierarchical format.)

Design Rationale

The primary considerations when designing the Expat wrapper classes were completeness, simplicity, and extensibility.  For completeness, almost all Expat API routines have been wrapped in the classes.  This includes even routines such as XML_ExpatVersionInfo.  For simplicity, the wrapper classes only wrap the Expat API and provide no other features.  For extensibility, the wrapper classes make it easy to derive new classes that provide enhanced functionality.

Basics

The Expat wrappers consist of two classes: a template-based class (CExpatImpl <class _T>) and a virtual-function-based class (CExpat).  Each class has features that lend themselves to specific solutions.

The following table illustrates the relationship between the API and the two classes, with each layer built on the one below it.

CExpat
CExpatImpl <class _T>
Expat C API

The template class CExpatImpl <class _T> provides the base layer of translation between C++ and the Expat C API.  The benefit of the template design is that if the application only needs a few of the Expat event routines, then the code for the unused event routines is not compiled into the final executable.  Admittedly, the amount of space wasted is minimal, but why waste it?
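
As a rough illustration of how a template layer can forward a C-style callback to a derived class without virtual functions, consider the sketch below.  MiniImpl, CElementCounter, and SimulateCLibrary are hypothetical stand-ins, not part of the wrapper or of Expat; the real classes register their static handlers with Expat rather than being called by hand.

```cpp
#include <cstdio>

// Hypothetical stand-in for the wrapper's template layer.
template <class T>
class MiniImpl
{
public:
	// Signature a C library can call: plain function plus user data
	static void StartElementHandler (void *pUserData, const char *pszName)
	{
		T *pThis = static_cast <T *> (pUserData);
		pThis ->OnStartElement (pszName); // resolved at compile time
	}

	// Default handlers do nothing; a derived class hides the ones it needs
	void OnStartElement (const char *) {}
	void OnEndElement (const char *) {}
};

class CElementCounter : public MiniImpl <CElementCounter>
{
public:
	int m_nElements = 0;

	void OnStartElement (const char *pszName)
	{
		std::printf ("start %s\n", pszName);
		++m_nElements;
	}
};

// Plays the role of the C library: it only knows a function pointer
// and an opaque user-data pointer.
inline void SimulateCLibrary (void (*pfn) (void *, const char *), void *pUserData)
{
	pfn (pUserData, "root");
	pfn (pUserData, "child");
}
```

Passing &MiniImpl <CElementCounter>::StartElementHandler and the address of a CElementCounter to SimulateCLibrary invokes CElementCounter::OnStartElement twice.  Because MiniImpl is a template, a member such as OnEndElement is only instantiated if something refers to it.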

The CExpat class is derived from the CExpatImpl <class _T> template class.  However, excluding the default constructor, the only methods contained within this class are the event methods, all declared as virtual functions.  CExpat is intended for situations where virtual functions are preferable to templates.

Within reason, the two classes are interchangeable.  If you have a class that is derived from CExpat, it could be easily modified to use CExpatImpl <class _T> or vice versa without having to modify any other source.  See the "Implementation Notes" for more information about some implementation pitfalls with regard to more complex derived classes.

For the rest of this document, only the CExpatImpl <class _T> class will be discussed.  As stated previously, the two wrapper classes are almost 100 percent interchangeable.  Documenting both would be redundant.

Getting Started

The first step in using CExpatImpl <class _T> is deriving a new class that will provide the application-specific implementation.  Deriving a class is required.  As with Expat itself, a parser without any application handlers would only verify that the XML is well formed.

As a starting point, let us define an XML parser that will display when an element begins, ends, and the data contained within the element.

#include <stdio.h>
#include "ExpatImpl.h"

class CMyXML : public CExpatImpl <CMyXML> 
{
public:

	// Constructor 
	
	CMyXML () 
	{
	}
	
	// Invoked by CExpatImpl after the parser is created
	
	void OnPostCreate ()
	{
		// Enable all the event routines we want
		EnableStartElementHandler ();
		EnableEndElementHandler ();
		// Note: EnableElementHandler will do both start and end
		EnableCharacterDataHandler ();
	}
	
	// Start element handler

	void OnStartElement (const XML_Char *pszName, const XML_Char **papszAttrs)
	{
		printf ("We got a start element %s\n", pszName);
		return;
	}

	// End element handler

	void OnEndElement (const XML_Char *pszName)
	{
		printf ("We got an end element %s\n", pszName);
		return;
	}

	// Character data handler

	void OnCharacterData (const XML_Char *pszData, int nLength)
	{
		// note, pszData is NOT null terminated
		printf ("We got %d bytes of data\n", nLength);
		return;
	}
};

The CMyXML::OnPostCreate method will be invoked by CExpatImpl <class _T> after the Expat parser has been created.  This provides an easy method of enabling event routines.  The CMyXML::OnStartElement, CMyXML::OnEndElement, and CMyXML::OnCharacterData methods will be invoked by Expat while the XML text is being parsed.  These routines will not be invoked unless they are enabled.  The code inside CMyXML::OnPostCreate enables the three event routines.
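
Since pszData is not NUL terminated, and Expat may deliver the text of a single element across several OnCharacterData calls, a handler that wants the text itself usually has to accumulate it by length.  The following is a minimal sketch of such a handler body, assuming a non-UNICODE build where XML_Char is char; CTextCollector is a hypothetical class that is not derived from the wrapper so that it can stand alone.

```cpp
#include <string>

// Hypothetical handler body: collects text that arrives in pieces.
// Mirrors the shape of OnCharacterData (pszData is NOT NUL terminated).
class CTextCollector
{
public:
	void OnCharacterData (const char *pszData, int nLength)
	{
		// Append exactly nLength characters; never rely on a terminator
		m_strText .append (pszData, static_cast <std::string::size_type> (nLength));
	}

	const std::string &GetText () const { return m_strText; }

private:
	std::string m_strText;
};
```

In a real parser the same append call would go inside the OnCharacterData method of the class derived from CExpatImpl <class _T>, with the string typically cleared in OnStartElement.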

Creating a Parser

Now that we have a derived class, we can use it to create an Expat parser.  Creating the parser is very easy.  First create an instance of the parser class, then invoke the Create method. 

The Create method has two arguments: the document encoding and the character used to separate a namespace from a name.  The encoding is the default encoding that will be used while parsing the XML document unless an encoding is specified in the XML document itself.  The namespace separator is used to separate the namespace from the name in calls such as OnStartElement.

For example, if the XML document contained the name SOAP_ENC:Envelope, the SOAP_ENC prefix was defined as being "http://schemas.xmlsoap.org/soap/envelope/", and "#" was specified to Create, then OnStartElement would be invoked with the string "http://schemas.xmlsoap.org/soap/envelope/#Envelope".
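
An event routine that needs the namespace and the local name separately can split the combined string on the separator.  The helper below is a sketch only: SplitQualifiedName is a hypothetical name, and it assumes a separator such as '#' that cannot appear in an XML name.

```cpp
#include <string>
#include <utility>

// Splits "namespace<sep>local-name" as delivered to the event routines.
// Returns an empty namespace when the separator is absent.
inline std::pair <std::string, std::string>
SplitQualifiedName (const std::string &strName, char chSeparator)
{
	// A namespace URI may itself contain the separator, the local name
	// may not, so search from the end.
	std::string::size_type nPos = strName .rfind (chSeparator);
	if (nPos == std::string::npos)
		return std::make_pair (std::string (), strName);
	return std::make_pair (strName .substr (0, nPos), strName .substr (nPos + 1));
}
```

Applied to the string from the example above with '#' as the separator, the first member of the pair holds the namespace and the second holds "Envelope".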

bool ParseSomeXML (LPCTSTR pszXMLText)
{
	CMyXML sParser;
	if (!sParser .Create ())
		return false;
	
	// do something useful
	return true;
}

Parsing a Simple Text String

Next, we actually need to send the XML document to the parser.  There are two different methods of sending the document to the XML parser, directly or by internal buffers.  The easier of the two is sending the data directly to the parser.  However, it is also just a bit slower.

To send a simple string to the parser, the application invokes the Parse (LPCTSTR pszBuffer, int nLength = -1, bool fIsFinal = true) method.  The first argument is a pointer to the string of data to be parsed.  A routine has been defined for both ANSI and UNICODE strings.  The second parameter is the length of the string in characters (char or wchar_t depending on ANSI or UNICODE).  If nLength is less than zero, then pszBuffer must point to a NUL terminated string and the length will be determined from the string.  If nLength is greater than or equal to zero, then the string need not be NUL terminated and the length shouldn't include the NUL character if it exists.

The third parameter lets the XML parser know when there is no more data.  If the whole XML document can be contained within one simple string, then fIsFinal can be set to true on the first call.  Otherwise, fIsFinal should remain false while there is more data to be parsed.  After all data has been read in, Parse can be invoked with nLength set to zero and fIsFinal set to true.
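
When the document arrives in pieces, the calling pattern is to pass fIsFinal as false for every piece except the last.  The sketch below shows only that pattern: CRecordingParser is a hypothetical stand-in that records its calls instead of parsing, and FeedInChunks is an illustrative helper, neither of which exists in the wrapper.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in that only records calls; the real Parse method
// hands each chunk to Expat.
struct CRecordingParser
{
	std::string m_strSeen;
	std::vector <bool> m_afFinal;

	bool Parse (const char *pszBuffer, int nLength, bool fIsFinal)
	{
		m_strSeen .append (pszBuffer, static_cast <std::size_t> (nLength));
		m_afFinal .push_back (fIsFinal);
		return true;
	}
};

// Feeds strXML in pieces of at most nChunk characters and marks only
// the last piece as final.
template <class TParser>
bool FeedInChunks (TParser &sParser, const std::string &strXML, std::size_t nChunk)
{
	if (nChunk == 0)
		return false;
	std::size_t nOffset = 0;
	for (;;)
	{
		std::size_t nLength = std::min (nChunk, strXML .size () - nOffset);
		bool fIsFinal = nOffset + nLength == strXML .size ();
		if (!sParser .Parse (strXML .data () + nOffset, static_cast <int> (nLength), fIsFinal))
			return false;
		if (fIsFinal)
			return true;
		nOffset += nLength;
	}
}
```

Because FeedInChunks is a template, any class with a compatible Parse method could be substituted for the stand-in.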

bool ParseSomeXML (LPCTSTR pszXMLText)
{
	CMyXML sParser;
	sParser .Create ();
	
	// Send this simple string to the parser
	
	return sParser .Parse (pszXMLText);
}

Parsing Using Internal Buffers

To reduce the number of extra memory copies, buffers internal to the Expat parser can be used instead of passing data into the parser just to have the Expat parser copy the data to internal buffers.  Using internal buffers takes three steps: requesting a buffer, reading data into the buffer, and submitting the data to the parser.

bool ParseSomeXML (LPCSTR pszFileName)
{

	// Create the parser 
	
	CMyXML sParser;
	if (!sParser .Create ())
		return false;
	
	// Open the file
	
	FILE *fp = fopen (pszFileName, "r");
	if (fp == NULL)
		return false;
	
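	// Loop until the final block has been submitted
	
	bool fSuccess = true;
	bool fIsFinal = false;
	while (!fIsFinal && fSuccess)
	{
		LPSTR pszBuffer = (LPSTR) sParser .GetBuffer (256); // REQUEST
		if (pszBuffer == NULL)
			fSuccess = false;
		else
		{
			int nLength = (int) fread (pszBuffer, 1, 256, fp); // READ
			// A short read means end of file (or a read error), so this
			// is the last block and the parser must be told so
			fIsFinal = nLength < 256;
			fSuccess = !ferror (fp) && sParser .ParseBuffer (nLength, fIsFinal); // PARSE
		}	
	}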

	// Close the file
	
	fclose (fp);
	return fSuccess;
}

As you can see, this method is more complicated than the other, but when you modify the example in the previous section to read a file, the differences in complexity are minimal.

Working With Event Routines

Event routines provide the actual information about what has been parsed to the application.  The method names inside the CExpatImpl <class _T> class have been selected to make it easy to know which routine applies to what Expat event.

In Expat:

Set the event handler routine: XML_Set[Event Name]Handler
Name of the event handler: Application specific

In CExpatImpl <class _T>:

Enable the event handler routine: Enable[Event Name]Handler
Name of the event handler: On[Event Name]
Name of the internal event handler: [Event Name]Handler

So, if you wish to receive StartElement events, you define a method called OnStartElement with the proper arguments and invoke EnableStartElementHandler with a true for the only argument.  The event routine can be later disabled by invoking EnableStartElementHandler again with false as the only argument.

The specifics of each of the event routines are beyond the scope of this document.  For more information about the events and the Expat parser itself, see http://www.xml.com/pub/a/1999/09/expat/index.html.  Almost everything described in that document has a counterpart of the same name in CExpatImpl <class _T>.

Implementation Notes

As stated earlier, there are some pitfalls applications will have to be aware of when creating complex derived class hierarchies.  Let us consider the example of an XML parser consisting of two classes, CMyXMLBase and CMyXML.  CMyXML is derived from CMyXMLBase, and CMyXMLBase is derived from one of the Expat wrapper classes.

Consider the case where the classes are derived from the CExpatImpl <class _T> template class.

class CMyXMLBase : public CExpatImpl <CMyXMLBase> 
{
public:

	CMyXMLBase () 
	{
	}
	
	void OnStartElement (const XML_Char *pszName, const XML_Char **papszAttrs) 
	{
		// do useful stuff here... 
		return;
	}
};

class CMyXML : public CMyXMLBase
{ 
public:

	CMyXML ()
	{
	}
	
	void OnStartElement (const XML_Char *pszName, const XML_Char **papszAttrs) 
	{
		// do derived useful stuff here...
		return;
	}
};

In this case, the programmer expects CMyXML::OnStartElement to be invoked by the Expat parser.  However, due to the design of the CExpatImpl <class _T> class, only the methods of the class specified in the template argument list, here CMyXMLBase, will be invoked.  This is by design.

There are three different ways to fix this problem.  The first method would be to declare OnStartElement as being virtual in CMyXMLBase.  The second would be to derive CMyXMLBase from CExpat instead of CExpatImpl <class _T>.  The third method requires changing CMyXMLBase from a normal class to a template.  This change provides CExpatImpl <class _T> with the name of the class from which to locate the event routines.

template <class _T>
class CMyXMLBase : public CExpatImpl <_T> 
{
public:

	CMyXMLBase () 
	{
	}
	
	void OnStartElement (const XML_Char *pszName, const XML_Char **papszAttrs) 
	{
		// do useful stuff here... 
		return;
	}
};

class CMyXML : public CMyXMLBase <CMyXML>
{ 
public:

	CMyXML ()
	{
	}
	
	void OnStartElement (const XML_Char *pszName, const XML_Char **papszAttrs) 
	{
		// do derived useful stuff here...
		return;
	}
};
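
The pitfall and the first fix (declaring the event routine virtual in the base class) can be reproduced without Expat.  The sketch below uses a hypothetical MiniDispatch template as a stand-in for CExpatImpl <class _T>; all class names here are illustrative, and the event routines return a string only so that the result is easy to observe.

```cpp
#include <string>

// Hypothetical stand-in for the template wrapper: it dispatches to
// whatever class is named in the template argument list.
template <class T>
class MiniDispatch
{
public:
	std::string Fire ()
	{
		return static_cast <T *> (this) ->OnStartElement ();
	}
};

// Pitfall: the template argument is CPlainBase, so the dispatcher
// never sees CPlainDerived::OnStartElement.
class CPlainBase : public MiniDispatch <CPlainBase>
{
public:
	std::string OnStartElement () { return "base"; }
};

class CPlainDerived : public CPlainBase
{
public:
	std::string OnStartElement () { return "derived"; }
};

// First fix: make the event routine virtual in the base class.
class CVirtualBase : public MiniDispatch <CVirtualBase>
{
public:
	virtual ~CVirtualBase () {}
	virtual std::string OnStartElement () { return "base"; }
};

class CVirtualDerived : public CVirtualBase
{
public:
	virtual std::string OnStartElement () { return "derived"; }
};
```

Calling Fire on a CPlainDerived object reaches the base class routine, while calling it on a CVirtualDerived object reaches the derived one, at the cost of one virtual call per event.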

About the Author

Tim has been a professional programmer for way too long.  He currently works at a company he co-founded that specializes in data acquisition software for industrial automation. 

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Tim Smith
Web Developer
Canada
Currently I am working in the tools department at BioWare Corp.


Comments and Discussions

 
Question: license for this file please? (chokmah, 28 Jul '09 - 14:53)
Hello
 
Can you send me a copy of the license for this file? Please.
Is it the same MIT license as the Expat parser?
 

Thanks.
Chokmah Chung
General: need copy of license for this file (Michelle Kolkey, 3 Oct '08 - 7:05)
Tim Smith, can you send me a copy of the license for this file please?
 
thanks,
Michelle Kolkey
Question: Re: need copy of license for this file (Stefan Schunck, 2 Dec '08 - 0:05)
Hi,
 
should everybody ask for a license individually, or could the license be posted for everybody?
 
thanks,
Stefan
General: Re: Professional (josemaocu22 Aug '08 - 4:25)
Very professional!
 
Thanks a lot.
Kind Regards.
General: Re: Professional (Jorge Bay Gondra, 19 Sep '08 - 0:43)
Thanks a lot!!!
5! Congrats!
Regards,
Jorge
 
General: Handling Special Characters in input Xml [modified] (sachin_chakote, 3 Jul '08 - 23:38)
Can Expat handle special characters in input XML?
E.g.,
 
if

Sachin & sachin

 
is sample input file then the chardata handler does not give "sachin & sachin " as element data.
 
even though i replace with "sachin & sachin "

Sachin & sachin

 
it does not give me correct value for chardatahandler ..
 
how does libexpat handle special char ?
 
modified on Friday, July 4, 2008 6:02 AM

Question: how does expat xml parser parse a xml field whose value is placed in double quotes. (Jeevan Reddy, 24 Apr '08 - 4:10)
Hi All,
 
I have a query regarding the way expat xml parses the xml attribute value. How does expat xml parser parse the attribute whose value is placed in double quotes.
 
for example :
 
<phone1>
<reg reg.1.displayName=""JReddy"" reg.1.address="2264" reg.1.label="2264"
</phone1>
 
In the above example the field reg.1.displayName has the value "JReddy"(the value is stored in double quotes).
 
In the above case how does the value after being parsed by expat parser look like? I have observed that the field after parsing appears like
";JReddy";
 
Could somebody please kindly let me know why the expat parser parses in that manner? Is it correct? Why does it put semicolon after the double quotes? Please forgive me if the question is very basic but do reply to the query.
General: Portable version of ExpatImpl (VolkerB, 8 Jun '07 - 23:44)
Expat impl works fine in Visual Studio environments. With some minor changes it will be portable and can be used also with gcc in Unix/Linux environments:
Line 442 : XML_Expat_Version v = XML_ExpatVersionInfo (); // upper case E and V
Line 587 : static void XMLCALL StartElementHandler (void *pUserData,
until
Line 716 : replace __cdecl by XMLCALL
Line 742 : Add a virtual destructor virtual ~CExpat (){} // avoids warning from gcc
Line 860 : Add a new line at the end of the file
 
Now enjoy ExpatImpl in other OS or with other compilers!
Volker.
General: Re: Portable version of ExpatImpl [modified] (jkgoya, 16 Jul '11 - 10:20)
Thanks, that cleared most everything up. As a newbie I'd like to point out that it's also necessary to include
 
#include <cstring>
 
EDIT: also need to configure Eclipse to use the expat linker. That took me forever to figure out.
 
http://www.rose-hulman.edu/class/csse/resources/Eclipse/eclipse-c-configuration.htm

modified on Saturday, July 16, 2011 7:58 PM

General: 10x (atanasf18 Oct '06 - 13:48)
Expat and this wrapper worked great, and were very easy to implement. Thanks.
General: link error : LNK2019 (Yang-Seok Yoon, 4 Aug '06 - 15:51)
Thank you for offering expat.
I'm developing the XML Parser using expat.
But, I don't understand an error about unresolved external symbol.
The error message is as follows:
 
1>XmlParser.obj : error LNK2019: unresolved external symbol "public: void __thiscall CXmlParser::OnEndElement(char const *)" (?OnEndElement@CXmlParser@@QAEXPBD@Z) referenced in function "protected: static void __cdecl CExpatImpl<class CXmlParser>::EndElementHandler(void *,char const *)" (?EndElementHandler@?$CExpatImpl@VCXmlParser@@@@KAXPAXPBD@Z)
1>.\Debug/XMLParser.exe : fatal error LNK1120: 1 unresolved externals

 
CXmlParser was inherited by CExpatImpl.
I added libexpat.dll as library and ExpatImpl.h to MSDEV.
If I try to solve this problem, what file (that includes method) should I add?
 
The source code that be based expat is following:

#include "ExpatImpl.h"
 
class CXmlParser : public CExpatImpl <CXmlParser>
{
public:
CXmlParser();
virtual ~CXmlParser();
void OnPostCreate();
void OnCharacterData(const XML_Char *pszData, int nLength);
void OnStartElement(const XML_Char* pszName, const XML_Char** papszAttrs);
void OnEndElement(const XML_Char *pszName);
}
 
CXmlParser::~CXmlParser()
{
// Disable all the event routines we want
EnableStartElementHandler (0);
// Disable the end element handler
EnableEndElementHandler (0);
// Note: DisableElementHandler will do start
EnableCharacterDataHandler (0);

Destroy();
}
 
void CXmlParser::OnPostCreate ()
{
// Enable all the event routines we want
EnableStartElementHandler (1);
//Enable the end element handler
EnableEndElementHandler (1);
// Note: EnableElementHandler will do start
EnableCharacterDataHandler (1);
}
 
void CXmlParser::OnStartElement(const XML_Char* pszName, const XML_Char** papszAttrs)
{...}
void CXmlParser::OnEndElement(const XML_Char *pszName)
{...}
void CXmlParser::OnCharacterData(const XML_Char *pszData, int nLength)
{...}

General: License (Priyank Bolia, 18 Sep '05 - 2:12)
you use BSD license, but there is no license written in code, or with code. Only your copyright information, how a professional developers proceed with verbal license to use.
 
http://www.priyank.in/
General: Linker errors too (tombox, 25 Jun '05 - 1:46)
Hello,
I tried to get ExpatImpl working in a library project. Since I received lots of linker errors I took the example from the CodeProject site and compiled it with MS VC++ .NET. Before that I build a static version of the expat library using the MS project files shipped with the expat files taken from sourceforge.
I called the compiler with the following command line:
cl /MT /Iexpat-1.95.8\lib expattest.cpp /link libexpatMT.lib
 
But I still get linker errors (see below). What is wrong? Is it a problem with the calling convention? Threading? C vs. C++? Where does this __imp__ prefix come from?
Any help would be very much appreciated.
Regards
Thomas
 

C:\Expat_Test>cl /MT /Iexpat-1.95.8\lib expattest.cpp /link libexpatMT.lib
 
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3077 for 80x86
Copyright (C) Microsoft Corporation 1984-2002. All rights reserved.
 
expattest.cpp
Microsoft (R) Incremental Linker Version 7.10.3077
Copyright (C) Microsoft Corporation.  All rights reserved.
 
/out:expattest.exe
libexpatMT.lib
expattest.obj
expattest.obj : error LNK2019: unresolved external symbol __imp__XML_SetUserData referenced in function "public: bool __thiscall CExpatImpl<class CMyXML>::Create(char const *,char const *)" (?Create@?$CExpatImpl@VCMyXML@@@@QAE_NPBD0@Z)
expattest.obj : error LNK2019: unresolved external symbol __imp__XML_ParserCreate_MM referenced in function "public: bool __thiscall CExpatImpl<class CMyXML>::Create(char const *,char const *)" (?Create@?$CExpatImpl@VCMyXML@@@@QAE_NPBD0@Z)
expattest.obj : error LNK2019: unresolved external symbol __imp__XML_ParseBuffer referenced in function "public: bool __thiscall CExpatImpl<class CMyXML>::ParseBuffer(int,bool)" (?ParseBuffer@?$CExpatImpl@VCMyXML@@@@QAE_NH_N@Z)
expattest.obj : error LNK2019: unresolved external symbol __imp__XML_GetBuffer referenced in function "public: void * __thiscall CExpatImpl<class CMyXML>::GetBuffer(int)" (?GetBuffer@?$CExpatImpl@VCMyXML@@@@QAEPAXH@Z)
expattest.obj : error LNK2019: unresolved external symbol __imp__XML_SetStartElementHandler referenced in function "public: void __thiscall CExpatImpl<class CMyXML>::EnableStartElementHandler(bool)" (?EnableStartElementHandler@?$CExpatImpl@VCMyXML@@@@QAEX_N@Z)
expattest.obj : error LNK2019: unresolved external symbol __imp__XML_SetEndElementHandler referenced in function "public: void __thiscall CExpatImpl<class CMyXML>::EnableEndElementHandler(bool)" (?EnableEndElementHandler@?$CExpatImpl@VCMyXML@@@@QAEX_N@Z)
expattest.obj : error LNK2019: unresolved external symbol __imp__XML_SetCharacterDataHandler referenced in function "public: void __thiscall CExpatImpl<class CMyXML>::EnableCharacterDataHandler(bool)" (?EnableCharacterDataHandler@?$CExpatImpl@VCMyXML@@@@QAEX_N@Z)
expattest.obj : error LNK2019: unresolved external symbol __imp__XML_ParserFree referenced in function "public: void __thiscall CExpatImpl<class CMyXML>::Destroy(void)" (?Destroy@?$CExpatImpl@VCMyXML@@@@QAEXXZ)
expattest.exe : fatal error LNK1120: 8 unresolved externals
 

expattest.cpp is:
 

#include <stdio.h>
//#define EXPATCALL __cdecl
//#define XMLCALL __cdecl
 
#include "ExpatImpl.h"
 
class CMyXML : public CExpatImpl <CMyXML>
{
public:
 
// Constructor
 
CMyXML ()
{
}
 
// Invoked by CExpatImpl after the parser is created
 
void OnPostCreate ()
{
// Enable all the event routines we want
EnableStartElementHandler ();
EnableEndElementHandler ();
// Note: EnableElementHandler will do both start and end
EnableCharacterDataHandler ();
}
 
// Start element handler
 
void OnStartElement (const XML_Char *pszName, const XML_Char **papszAttrs)
{
printf ("We got a start element %s\n", pszName);
return;
}
 
// End element handler
 
void OnEndElement (const XML_Char *pszName)
{
printf ("We got an end element %s\n", pszName);
return;
}
 
// Character data handler
 
void OnCharacterData (const XML_Char *pszData, int nLength)
{
// note, pszData is NOT null terminated
printf ("We got %d bytes of data\n", nLength);
return;
}
};
 

 

bool ParseSomeXML (const char* pszFileName)
{
 
// Create the parser
 
CMyXML sParser;
if (!sParser .Create ())
return false;
 
// Open the file
 
FILE *fp = fopen (pszFileName, "r");
if (fp == NULL)
return false;
 
// Loop while there is data
 
bool fSuccess = true;
while (!feof (fp) && fSuccess)
{
char* pszBuffer = (char*) sParser .GetBuffer (256); // REQUEST
if (pszBuffer == NULL)
fSuccess = false;
else
{
int nLength = fread (pszBuffer, 1, 256, fp); // READ
fSuccess = sParser .ParseBuffer (nLength, nLength == 0); // PARSE
}
}
 
// Close the file
 
fclose (fp);
return fSuccess;
}
 

int main()
{
ParseSomeXML("parseforest.xml");
}
 

 

General: Re: Linker errors too (Tim Smith, 27 Jun '05 - 3:40)
Did you define XML_STATIC as discussed in the Expat documentation?
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
General: Re: Linker errors too (RoadRashKing, 15 Dec '06 - 4:31)
Actually,
 
When I do a #define XML_STATIC in StdAfx I get link errors, If I put it elsewhere (like in my parser file), everything goes smoothly but it still doesn't link statically.
 
For some reason it looks like it is linking against libexpatw.lib when I have libexpatwMT.lib added to my libs in VS
 
1>libexpatw.lib(LIBEXPATW.dll) : error LNK2005: _XML_ParserFree already defined in libexpatwMT.lib(xmlparse.obj)
1>libexpatw.lib(LIBEXPATW.dll) : error LNK2005: _XML_SetStartElementHandler already defined in libexpatwMT.lib(xmlparse.obj)
1>libexpatw.lib(LIBEXPATW.dll) : error LNK2005: _XML_SetEndElementHandler already defined in libexpatwMT.lib(xmlparse.obj)
1>libexpatw.lib(LIBEXPATW.dll) : error LNK2005: _XML_SetCharacterDataHandler already defined in libexpatwMT.lib(xmlparse.obj)
1>libexpatw.lib(LIBEXPATW.dll) : error LNK2005: _XML_GetSpecifiedAttributeCount already defined in libexpatwMT.lib(xmlparse.obj)
 

General: linker errors (alschmid, 10 Jun '05 - 0:18)
hey tim
 
maybe I'm just too much of a newbie, maybe I'm stupid...
I'm trying to use CExpatImpl... I get these linker errors when trying to build (using embedded visual c++ 4):
 
_____________________________________________________________________________________
Linking...
LexNet.obj : error LNK2019: unresolved external symbol __imp__XML_SetUserData referenced in function "public: bool __thiscall CExpatImpl<class RechtsgebieteLeser>::Create(unsigned short const *,unsigned short const *)" (?Create@?$CExpatImpl@VRechtsgebieteLeser@@@@QAE_NPBG0@Z)
LexNet.obj : error LNK2019: unresolved external symbol __imp__XML_ParserCreate_MM referenced in function "public: bool __thiscall CExpatImpl<class RechtsgebieteLeser>::Create(unsigned short const *,unsigned short const *)" (?Create@?$CExpatImpl@VRechtsgebieteLeser@@@@QAE_NPBG0@Z)
LexNet.obj : error LNK2019: unresolved external symbol __imp__XML_SetStartElementHandler referenced in function "public: void __thiscall CExpatImpl<class RechtsgebieteLeser>::EnableStartElementHandler(bool)" (?EnableStartElementHandler@?$CExpatImpl@VRechtsgebieteLeser@@@@QAEX_N@Z)
LexNet.obj : error LNK2019: unresolved external symbol __imp__XML_SetEndElementHandler referenced in function "public: void __thiscall CExpatImpl<class RechtsgebieteLeser>::EnableEndElementHandler(bool)" (?EnableEndElementHandler@?$CExpatImpl@VRechtsgebieteLeser@@@@QAEX_N@Z)
LexNet.obj : error LNK2019: unresolved external symbol __imp__XML_SetCharacterDataHandler referenced in function "public: void __thiscall CExpatImpl<class RechtsgebieteLeser>::EnableCharacterDataHandler(bool)" (?EnableCharacterDataHandler@?$CExpatImpl@VRechtsgebieteLeser@@@@QAEX_N@Z)
LexNet.obj : error LNK2019: unresolved external symbol __imp__XML_ParserFree referenced in function "public: void __thiscall CExpatImpl<class RechtsgebieteLeser>::Destroy(void)" (?Destroy@?$CExpatImpl@VRechtsgebieteLeser@@@@QAEXXZ)
emulatorDbg/LexNet.exe : fatal error LNK1120: 6 unresolved externals
Error executing link.exe.
 
_____________________________________________________________________________________
 
What am I doing wrong? What can I do to fix this? What is the problem?
Thanks a lot!
al
General: Re: linker errors (Tim Smith, 10 Jun '05 - 3:51)
The class is an expat wrapper. You need to include Expat in your solution.
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
General: Re: linker errors (alschmid, 10 Jun '05 - 23:46)
tim, thank you for your quick answer.
do I understand you correctly: do you mean #include "expat.h"? I already do that and it still doesn't work -- I still get the linker errors...
 
thanks!
al
General: Licence Question (kdd29@msstate.edu, 14 Dec '04 - 11:10)
I know you've addressed the licence issue in the message board already, but I just want to be sure it's safe before I do anything.
 
I would like to use ExpatImpl in a public-domain speech recognition project, but if I do I'd like to make an installation package for it and make it available on our website alongside our primary installation package. See this URL for information about our group:
 
http://www.isip.msstate.edu/projects/speech/index.html
 
We would use Expat and the ExpatImpl wrapper to parse XML format files,
W3C Speech Recognition Grammars specifically, and miscellaneous other XML format items in the future. We could make a configurable installation package for ExpatImpl and make it available for download so that users could install our speech recognition suite and configure ExpatImpl for use by it.
 
We would greatly appreciate your contribution if you allow us to use ExpatImpl for this purpose.
 
Thanks,
 

Kyle Duncan
Undergraduate Research Assistant
Intelligent Electronic Systems
Center for Advanced Vehicular Systems
Mississippi State University
General: Extracting Values (CliffWoodger, 10 Nov '04 - 23:45)
Dear Sir,
Am very pleased with your Expat C++ wrapper. However, I was wondering how to go about extracting the value of a specific element in an XML string. For example if I have XML such as:
{zt}
{jobqueuepath}\ZT\JobQueue{/jobqueuepath}
{jobreturnpath}\ZT\JobReturn{/jobreturnpath}
{preprocessdelayms}250{/preprocessdelayms}
{/zt}
(NOTE: I've replaced the less-than and greater-than characters with curly brackets to avoid any html formatting problems in this thread)
So if, for example, I want to get the value of the /zt/preprocessdelay element, how do I go about it. Ideally, I would like a function such as:
LPSTR GetValue(LPCSTR sPath)
Any help would be greatly appreciated. Thank you.
General: Re: Extracting Values (Tim Smith, 11 Nov '04 - 4:55)
See the first parser example. It implements the three basic element events required to do what you need.
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
General: Why templates (Werner BEROUX, 9 Nov '04 - 10:42)
Maybe it's dumb, maybe it's because I didn't look further: why do you use templates?
 
For example I see you use pThis->OnPostCreate() to call the derived OnPostCreate function. But why didn't you just make a
virtual OnPostCreate() = 0;
in your header?
 
Okay you put a class that works without having to use templates. By the way I didn't saw pure virtual in it. Anyway I don't know the use of making the template base implementation.
 
Hello World!
General: Re: Why templates (Tim Smith, 9 Nov '04 - 13:53)
If you make the methods pure virtual then all derived classes must implement these methods in some form or another. This would basically mean that every derived class must do all the work thus making the class mostly pointless.
 
As far as why templates. With templates the amount of generated code "can" be much smaller and the end result faster. For example, if you have a base class implementation of a virtual routine and the derived class implementation, both would end up in the final executable. Using the template method, only the ones that are actually called end up in the image.
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
General: Re: Why templates (Werner BEROUX, 10 Nov '04 - 9:12)
Faster and smaller? Sounds interesting even if it's a small change. Strange that it makes a difference even for empty functions, like: virtual void OnCharsDatas(const char *szDatas) {}.
 
Thanks for the answer.
 
Hello World!
General: Re: Why templates (Johnny Casey, 18 Mar '05 - 10:51)
I don't know if "I'm going to patent thought [sic]." makes any sense, but if you meant you were going to patent using C++ in this way, consider that the Microsoft ATL/WTL libraries (discussed on this site) use this construct heavily (the latter is now on sf.net and the former comes with Visual C++ or can be found in the Platform SDK). So I hope you applied for that patent about 6 or more years ago...
General: Re: Why templates (robiwano, 4 May '09 - 8:46)
I hope you understand that that is his "signature"... ;)
General: Problems with OnCharacterData Methode (outlast, 30 Sep '04 - 10:18)
Hi, I have a problem with the OnCharacterData method. It gives me something strange back:
 
+ pszData 0x003a61db " c:\data_text
c:\data_sound


32
800
600


 
I simply want the string between the start and end tag.
General: Re: Problems with OnCharacterData Methode (Tim Smith, 30 Sep '04 - 15:06)
That looks fine to me. Are you not using the length?
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
General: Re: Problems with OnCharacterData Methode (outlast, 30 Sep '04 - 22:29)
hmm.. following doesn't work   int your
     void OnCharacterData (const XML_Char *pszData, int nLength)
 
methode.
 
            content.top().insert(content.top().length(), pszData, nLength);
 
defintions for the variables:
     std::stack<std::string> content;
 
it crashs, do you have any suggestions?
 

General: Re: Problems with OnCharacterData Methode (Tim Smith, 1 Oct '04 - 3:31)
You will have to debug your program. Nobody else has problems with that method. It is just a simple wrapper around the EXPAT callback.
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
Re: Problems with OnCharacterData method | outlast | 1 Oct '04 - 3:52
Yes, I know! :) I'm going to have a debugging session this weekend. Thank you anyway for your answers!
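Two details from this thread are easy to get wrong: the buffer passed to OnCharacterData is not null terminated, so only nLength characters may be read, and calling top() on an empty std::stack is undefined behavior. Whether either one caused the crash reported above is not established in the thread. The sketch below assumes that one string is pushed per start element and popped per end element; TextCollector and its members are hypothetical names, and plain char stands in for XML_Char so the code compiles without Expat.

```cpp
#include <stack>
#include <string>
#include <vector>

// Hypothetical handler set; in real use these would be the OnStartElement,
// OnEndElement and OnCharacterData members of a CExpatImpl-derived class.
class TextCollector
{
public:
    void OnStartElement(const char * /*pszName*/)
    {
        m_content.push(std::string()); // one buffer per open element
    }

    void OnCharacterData(const char *pszData, int nLength)
    {
        // Character data outside any element: nothing to append to.
        if (m_content.empty() || pszData == 0 || nLength <= 0)
            return;
        // Append exactly nLength characters; pszData is NOT null terminated.
        m_content.top().append(pszData,
                               static_cast<std::string::size_type>(nLength));
    }

    void OnEndElement(const char * /*pszName*/)
    {
        if (m_content.empty())
            return;
        m_finished.push_back(m_content.top()); // text of the closed element
        m_content.pop();
    }

    std::stack<std::string> m_content;   // text of elements still open
    std::vector<std::string> m_finished; // text of elements already closed
};
```

Expat may deliver the text of a single element in several callbacks, which is why the sketch appends to the current buffer instead of assigning to it.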
elements are not printed | prashant.battu@prudential.co.uk | 22 Apr '04 - 23:00
As I am using an old C++ compiler, I had to use Roguewave's RWBoolean instead of bool and the usual cast operator (brackets) instead of static_cast. But when I run the example, it does not print the elements even though the Parse() method returns true.
 
Can anybody help?
Below is the code copied from this website... where there are lines to print the elements.
 
void OnPostCreate ()
{
    // Enable all the event routines we want
    EnableStartElementHandler ();
    EnableEndElementHandler ();
    // Note: EnableElementHandler will do both start and end
    EnableCharacterDataHandler ();
}
 
// Start element handler
 
void OnStartElement (const XML_Char *pszName, const XML_Char **papszAttrs)
{
    cout<<"We got a start element " << pszName;
    return;
}
 
// End element handler
 
void OnEndElement (const XML_Char *pszName)
{
    printf ("We got an end element %s\n", pszName);
    return;
}
 
// Character data handler
 
void OnCharacterData (const XML_Char *pszData, int nLength)
{
    // note, pszData is NOT null terminated
    printf ("We got %d bytes of data\n", nLength);
    return;
}

 
Prash
Conflict between CExpatImpl and other code | kdd29@msstate.edu | 12 Apr '04 - 9:12
I have derived a class from CExpatImpl,
and I need to store a pointer in a Ulong
(unsigned long) class object as a data
member of my derived class.
 
When I try to do this, I can compile
my program, but during linking I encounter
many "undefined reference to: (Ulong related
methods)", when the methods are indeed defined.
 
The problem ONLY occurs when trying to instantiate
a Ulong object within the CExpatImpl derived
class. Not when instantiating the Ulong object
somewhere else in the same file and scope as where
a CExpatImpl derived object exists, and not when all
the header files are included.
 
Do you have any idea why I can use Ulong and
CExpatImpl in the same file, but can't use
Ulong inside of CExpatImpl? Or why trying to do
so makes the linker unable to find select bits of
code in the other libraries I'm linking with?
 

 
Thanks,
- Kyle Duncan
Re: Conflict between CExpatImpl and other code | Tim Smith | 12 Apr '04 - 10:02
Without an example, I have no idea what you are trying to do.
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
parsing multiple line xml structures one line at a time | kdd29@msstate.edu | 3 Mar '04 - 6:28
The reason I had asked about passing the parser object
to a function is this:
 
I need to read XML documents line by line and feed them
to the parser. This means Expat needs to be informed that
each parse (until the last) is not the last parse of the
document, so that when XML_Parse encounters end tags whose
corresponding start tags were parsed in a previous call,
the parse does not fail.
 
I know XML_Parse must be passed a boolean flag to indicate
whether each call is the final call, and I made some changes
to allow me to pass said flag, but despite that, all calls to
Parse after the first one fail. Basically I had modified your
ParseSomeXML example by putting the call to Parse in a loop and
trying to call it multiple times. Why is it that only one call
to the Parse method may be made?
 
I need to find a way to parse something like
 
<superitem>
<item>1</item>
</superitem>
 
one line at a time; it thinks end tag </superitem> has no matching start
tag, and the call to Parse returns false.
 
Or, perhaps you have another way of parsing documents in chunks? If so
would you post a small example?
 
I'm just having trouble parsing anything less than a complete XML structure
(from start tag to end tag), which means I have to load the entire structure
in memory to Parse it, which eliminates some of the memory efficiency expat
is meant to provide.
 
Thanks,
Kyle Duncan
Re: parsing multiple line xml structures one line at a time | Tim Smith | 3 Mar '04 - 8:24
The second "ParseSomeXML" example shows how to parse an XML file where you have control over the I/O. There is no need to modify the Parse routine. The ParseBuffer routine is intended for this use.
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
Re: parsing multiple line xml structures one line at a time | kdd29@msstate.edu | 3 Mar '04 - 10:29
I got it working a few minutes after I sent in the question, and felt like an idiot.
 
Thanks,
Kyle
Re: parsing multiple line xml structures one line at a time | Gunasekaran Dharman | 29 Sep '04 - 0:20
Hi,
 
I am also facing the same problem. Kindly let me know how you solved it.
 
Thanks
Guna
Cannot pass derived parser to a function | kdd29@msstate.edu | 27 Feb '04 - 6:13
I derived a class XMLParser from CExpatImpl.
 
Passing an XMLParser object to a function and then
calling Parse inside that function causes a segmentation
fault.
 
Passing the XMLParser object by reference prevents this fault, but the parse
fails instead.
 
Why must the parser be declared in the same scope which calls Parse?
Pointers to this object work correctly within the same scope. Passing a
pointer to the XMLParser object to a function and dereferencing that pointer
to call Parse causes the parse to fail.
 
Any help will be much appreciated.
 
Thanks,
Kyle Duncan
Re: Cannot pass derived parser to a function | Tim Smith | 27 Feb '04 - 14:59
You can't make a copy of the object due to having to invoke XML_SetUserData to point EXPAT to the class instance. This could have been avoided by moving that call to the parse routine. As is, I should have declared the class with private copy constructors to prevent it from being passed by value.
 
As to why the parse is failing after you pass the instance by pointer, I have no idea. All I can say is check your code. The description of the problem is vague.
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
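The reply above explains that a copied wrapper leaves the parser's user data pointing at the original instance. The sketch below illustrates that hazard and the suggested remedy with a fake C-style handle instead of Expat; FakeParser, Wrapper, DispatchEvent, and ParseWith are hypothetical names. It uses C++11 "= delete"; with an older compiler, the same effect comes from declaring the copy constructor and assignment operator private without defining them.

```cpp
#include <type_traits>

// Hypothetical stand-in for a C parser handle that stores a user-data pointer.
struct FakeParser
{
    void *pUserData;
};

class Wrapper
{
public:
    Wrapper() : m_nEvents(0)
    {
        m_parser.pUserData = this; // the C side remembers THIS instance
    }

    // A copy would duplicate pUserData and leave it pointing at the
    // original object, so copies are rejected at compile time.
    Wrapper(const Wrapper &) = delete;
    Wrapper &operator=(const Wrapper &) = delete;

    // Simulates the static callback recovering the instance from user data.
    static void DispatchEvent(FakeParser *pParser)
    {
        static_cast<Wrapper *>(pParser->pUserData)->m_nEvents++;
    }

    FakeParser m_parser;
    int m_nEvents;
};

// Functions take the wrapper by reference (or pointer), never by value.
inline void ParseWith(Wrapper &wrapper)
{
    Wrapper::DispatchEvent(&wrapper.m_parser);
}

static_assert(!std::is_copy_constructible<Wrapper>::value,
              "Wrapper must not be copyable");
```

With copying disabled, an attempt to pass the wrapper by value fails to compile instead of crashing at run time.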
trouble compiling example | kdd29@msstate.edu | 4 Feb '04 - 8:57
I cannot successfully compile anything
that includes ExpatImpl.h. I receive the
following errors:
ExpatImpl.h:588: variable or field `__cdecl' declared void
ExpatImpl.h:588: parse error before `(' token
ExpatImpl.h:592: syntax error before `->' token
... and so on.
 
I am compiling using gcc 3.2.1 on a Solaris
machine. Do you know why the first "error"
occurs?
 
Thanks,

 
Kyle Duncan
Re: trouble compiling example | Tim Smith | 4 Feb '04 - 14:23
__cdecl is a Microsoft-ism that tells the compiler that the routine uses the C calling convention. Depending on your compiler, "static" might be all that is needed if the C++ and C calling conventions are the same. For your specific case (gcc 3.2.1 on Solaris), I don't know the answer to that question.
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
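One common way to keep a compiler-specific keyword such as __cdecl from breaking other compilers is to hide it behind a macro that expands to nothing elsewhere. The sketch below shows the idea only; WRAPPER_CDECL, EndElementFn, and Counter are hypothetical names and are not taken from ExpatImpl.h. Whether this is sufficient for gcc 3.2.1 on Solaris is not confirmed in the thread.

```cpp
// WRAPPER_CDECL is a hypothetical macro name, not part of ExpatImpl.h or Expat.
#if defined(_MSC_VER)
// Microsoft compilers: request the C calling convention explicitly.
#define WRAPPER_CDECL __cdecl
#else
// Other compilers: the macro expands to nothing.
#define WRAPPER_CDECL
#endif

// Hypothetical callback type, shaped like a C-style event handler.
typedef void (WRAPPER_CDECL *EndElementFn)(void *pvUserData, const char *pszName);

struct Counter
{
    int nEndElements;

    // Static member used as the C callback; the instance arrives as user data.
    static void WRAPPER_CDECL EndElementHandler(void *pvUserData,
                                                const char * /*pszName*/)
    {
        static_cast<Counter *>(pvUserData)->nEndElements++;
    }
};
```

Code posted later on this page uses the XMLCALL macro on a handler in the same position, which serves the same purpose.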
trouble setting up (newbie) | kdd29@msstate.edu | 30 Jan '04 - 11:09
I downloaded ExpatImpl.h, and noticed it
requires "expat.h" as an include file.
What is/where is expat.h? I downloaded
expat separately but found no such
header file.
 
While your explanation of how to use your
code seems good, I don't know how to get the
code running in order to try.
 
Any help will be much appreciated.
 
-Kyle Duncan
Re: trouble setting up (newbie) | Tim Smith | 30 Jan '04 - 14:27
Look in the lib subfolder in expat from sourceforge.
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
ExternalEntityRefHandler | Caffeine | 17 Sep '03 - 8:21
I believe that there are a couple of bugs in some (probably rarely used) handlers.
 
1) The ExternalEntityRefHandler, unlike most other handlers, doesn't get the userData as its first argument; it should take the expat parser itself. Unfortunately, there is not an elegant fix, as no arbitrary data is passed to this handler at all.
 
2) The UnknownEncodingHandler also doesn't get the userData as its first argument. However, the call to XML_SetUnknownEncodingHandler takes an additional argument, the encodingHandlerData, which could be used to pass in the wrapper class "this" pointer in place of the userData.
 
Best Regards,
C
Re: ExternalEntityRefHandler | Werner BEROUX | 21 Nov '04 - 6:59
I fixed the UnknownEncodingHandler but disabled ExternalEntityRefHandler. I also find it pretty weird that ExternalEntityRefHandler provides an XML_Parser as argument.
 
Here is what it looks like. Note that you may have to make some changes, since I made a wrapper without templates.
 
// @cmember Enable/Disable unknown encoding handler
void EnableUnknownEncodingHandler(bool fEnable = true)
{
    assert(m_p != NULL);
    XML_SetUnknownEncodingHandler(m_p,
        fEnable ? UnknownEncodingHandler : NULL,
        (void *) this);
}
 
...
 
// @cmember Unknown encoding wrapper
static int XMLCALL UnknownEncodingHandler(void *pEncodingHandlerData,
    const XML_Char *pszName, XML_Encoding *pInfo)
{
    return ((Expat*)pEncodingHandlerData)->OnUnknownEncoding(pszName, pInfo) ? 1 : 0;
}

 
Hello World!
New lines? | SukhUK | 16 Jul '03 - 9:11
Hi,
 
I've been using this implementation of Expat and it's excellent. However, I've run into a problem. I've got new lines in some of my XML elements, and for some reason these new lines don't seem to get retrieved. It's as if the XML parser (or me :)) ignores them. The code I use to combine the string is this:
 

void CScriptParser::OnCharacterData(const XML_Char *wcData, int iLength)
{
    if (wcData == NULL || iLength < 1)
        return;
 
    stElementDetails *stStackElement = v_stNodeStack.back();
 
    for (int iCounter = 0; iCounter < iLength; iCounter++)
        stStackElement->wsElementValue += wcData[iCounter];
 
    return;
}

 
I would appreciate it if someone could tell me what I have to do to be able to get the new lines.
 
Many thanks,
Re: New lines? | Tim Smith | 17 Jul '03 - 4:41
Have you checked to make sure EXPAT is sending you the newlines via OnCharacterData? I don't remember ever having a problem with this.
 
Tim Smith
 
I'm going to patent thought. I have yet to see any prior art.
Re: New lines? | SukhUK | 17 Jul '03 - 5:54
Firstly, I'd like to thank you for your speedy reply :) Now, on to the problem... I've been doing a bit of investigative work, and the file I'm sending uses CR LF for line separators (I'm working in Windows). I'm definitely sending the parser the file in "rb" mode, so the parser should receive the CR LFs as they are in the file. But in OnCharacterData I only receive LFs (as in Unix), and the only thing I can attribute this to is the Expat parser itself. Is there any way to get CR LFs back from Expat, or will I need to add my own CRs before the LFs? (The latter is a bit of a botched job, but should work.)
 
To make it work, I did the following:
 
void CScriptParser::OnCharacterData(const XML_Char *wcData, int iLength)
{
	if (wcData == NULL || iLength < 1)
		return;
 
	stElementDetails *stStackElement = v_stNodeStack.back();
 
	for (int iCounter = 0; iCounter < iLength; iCounter++)
	{
		// Ensures new lines are passed in CR LF form and not just LF
		if (wcData[iCounter] == '\n')
			stStackElement->wsElementValue += '\r';
 
		stStackElement->wsElementValue += wcData[iCounter];
	}
	return;
}

Last Updated 18 Feb 2002
Article Copyright 2002 by Tim Smith