Why XMLite?
In a previous project I needed a simple XML parser. I was working with a Jabber server and, being short on time, used a Jabber client library called Jabbercom, a Win32 COM-based DLL module. The Jabber protocol is based on XML, but that library's XML parser was not complete enough for my project.
First, it could not handle Korean text (and possibly other languages), and it had no escape-character processing and no entity encoding/decoding support. I had to replace the XML parser engine, but I could not use MSXML or Expat: they either required a heavy installation or were hard to use. So I decided to write XMLite. It is not a full-featured XML parser, but it is simple and small, so I hope it helps someone.
Using XMLite
XMLite has two main data structures, XNode and XAttr. XNode represents an XML element node and XAttr represents an XML attribute. An XNode holds its child XNodes and its own attribute list (XAttrs). The source code is simple, so it should be easy to read and use.
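The relationship between nodes and attributes can be pictured with a small standalone sketch. This is not XMLite's actual declaration: SketchNode, SketchAttr, add_child and attr_value are hypothetical names, and std::string stands in for CString so the sketch compiles without MFC.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for XAttr and XNode; not the library's real types.
struct SketchAttr {
    std::string name;
    std::string value;
};

struct SketchNode {
    std::string name;                                   // tag name
    std::string value;                                  // text content
    std::vector<SketchAttr> attrs;                      // this node's own attributes
    std::vector<std::unique_ptr<SketchNode>> children;  // owned child nodes

    // Append a child with the given tag name and return a pointer to it.
    SketchNode* add_child(const std::string& child_name) {
        children.push_back(std::unique_ptr<SketchNode>(new SketchNode));
        children.back()->name = child_name;
        return children.back().get();
    }

    // Return the value of the named attribute, or an empty string if absent.
    std::string attr_value(const std::string& attr_name) const {
        for (const SketchAttr& a : attrs)
            if (a.name == attr_name) return a.value;
        return std::string();
    }
};
```

Each node owns its children, so destroying the root releases the whole tree.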
-
XML parsing
XMLite can parse the plain text of a single XML element, as shown below, and you can check for parsing errors. If XNode::Load returns NULL, the XML text contains an error. For more detailed error information, see the Error Handling section.
CString sxml;
sxml = _T("\
<TAddress desc='book of bro'>\
<TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
<TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
<TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
<TInformation count='3'/>\
</TAddress>");
XNode xml;
if( xml.Load( sxml ) )
AfxMessageBox(xml.GetXML());
else
AfxMessageBox(_T("error"));
The result is shown in the picture above.
-
Traversing parsed XML
CString sxml;
sxml = _T("\
<TAddressBook description=\"book of bro\">\
<TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
<TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
<TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
<TInformation count='3'/>\
</TAddressBook>");
XNode xml;
xml.Load( sxml );
int i;
XNodes childs;
LPXNode child;
for( i = 0 ; i < xml.GetChildCount(); i++)
{
child = xml.GetChild(i);
AfxMessageBox( child->GetXML() );
}
childs = xml.GetChilds();
for( i = 0 ; i < childs.size(); i++)
AfxMessageBox( childs[i]->GetXML() );
childs = xml.GetChilds(_T("TPerson") );
for( i = 0 ; i < childs.size(); i++)
{
AfxMessageBox( childs[i]->GetXML() );
}
AfxMessageBox( xml.GetChildAttrValue( _T("TInformation"), _T("count") ) );
int count = XStr2Int( xml.GetChildAttrValue( _T("TInformation"),
_T("count") ));
ASSERT( count == 3 );
-
DOM Modify
CString sxml;
sxml = _T("\
<TAddressBook description=\"book of bro\">\
<TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
<TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
<TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
<TInformation count='3'/>\
</TAddressBook>");
XNode xml;
xml.Load( sxml );
LPXNode child_bro = xml.GetChild(0);
xml.RemoveChild( child_bro );
AfxMessageBox(xml.GetXML());
Result: there is no bro node.
<TAddressBook description='book of bro' >
<TPerson type='friend' >
<Name>Baik,Ji Hoon</Name>
<Nick>bjh</Nick>
</TPerson>
<TPerson type='friend' >
<Name>Bak,Gun Joo</Name>
<Nick>dichter</Nick>
</TPerson>
<TInformation count='3' />
</TAddressBook>
-
Error Handling
XMLite has XML error handling, but it is not complete.
CString serror_xml;
serror_xml = _T("<XML>\
<NoCloseTag type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick>\
</XML>");
XNode xml;
PARSEINFO pi;
xml.Load( serror_xml, &pi );
if( pi.erorr_occur )
{
AfxMessageBox( pi.error_string );
AfxMessageBox( xml.GetXML() );
}
else
ASSERT(FALSE);
The result is:
'<NoCloseTag> ... </XML>' is not wel-formed.
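The kind of check behind this message can be sketched with a stack of open tag names. This is a simplified illustration, not XMLite's parser: tags_balanced is a hypothetical helper, and it assumes there are no comments, CDATA sections, processing instructions, or '>' characters inside attribute values.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return true if every opening tag has a matching closing tag in the right order.
// On failure, bad_tag receives the name of the tag that was left open
// (or the stray closing tag when nothing is open).
bool tags_balanced(const std::string& xml, std::string& bad_tag) {
    std::vector<std::string> open;  // names of currently open elements
    std::size_t pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        const std::size_t end = xml.find('>', pos);
        if (end == std::string::npos) { bad_tag = xml.substr(pos); return false; }
        const bool closing = (xml[pos + 1] == '/');
        const bool self_closing = (end > pos + 1 && xml[end - 1] == '/');
        const std::size_t name_start = pos + (closing ? 2 : 1);
        // The name ends at whitespace, '/' or '>'; '>' at 'end' guarantees a hit.
        const std::size_t name_end = xml.find_first_of(" \t\r\n/>", name_start);
        const std::string name = xml.substr(name_start, name_end - name_start);
        if (closing) {
            if (open.empty() || open.back() != name) {
                bad_tag = open.empty() ? name : open.back();
                return false;
            }
            open.pop_back();
        } else if (!self_closing) {
            open.push_back(name);
        }
        pos = end + 1;
    }
    if (!open.empty()) { bad_tag = open.back(); return false; }
    return true;
}
```

For the sample above, the closing XML tag arrives while NoCloseTag is still on top of the stack, so NoCloseTag is reported.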
-
Entity and Escape Char Test
XMLite has escape processing. This updated version does not escape in the general case, but you can still use an escape character when parsing values.
pi.escape_value = '\\';
In the case above, the escape character is:
'\'
as in C/C++. XMLite also has entity processing. The entity table is shown below:
| Special character | Special meaning | Entity encoding |
| < | Begins a tag. | &lt; |
| > | Ends a tag. | &gt; |
| " | Quotation mark. | &quot; |
| ' | Apostrophe. | &apos; |
| & | Ampersand. | &amp; |
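Entity processing of this kind amounts to a lookup in a small table. The following is a minimal standalone sketch using the five predefined XML entities; EntityPair, encode_entities and decode_entities are hypothetical names and do not reflect how XMLite implements it internally.

```cpp
#include <cassert>
#include <string>

// Hypothetical table entry: a special character and its entity reference.
struct EntityPair {
    char        ch;
    const char* ref;
};

static const EntityPair kEntities[] = {
    { '&',  "&amp;"  },
    { '<',  "&lt;"   },
    { '>',  "&gt;"   },
    { '"',  "&quot;" },
    { '\'', "&apos;" },
};

// Replace each special character with its entity reference.
std::string encode_entities(const std::string& text) {
    std::string out;
    for (char c : text) {
        const char* ref = nullptr;
        for (const EntityPair& e : kEntities)
            if (e.ch == c) { ref = e.ref; break; }
        if (ref) out += ref; else out += c;
    }
    return out;
}

// Replace each known entity reference with its character.
// Unknown references and bare '&' characters are copied unchanged.
std::string decode_entities(const std::string& text) {
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        bool matched = false;
        if (text[i] == '&') {
            for (const EntityPair& e : kEntities) {
                const std::string ref(e.ref);
                if (text.compare(i, ref.size(), ref) == 0) {
                    out += e.ch;
                    i += ref.size();
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out += text[i++];
    }
    return out;
}
```

Decoding is done in a single pass, so an already-encoded ampersand followed by "lt;" decodes to the text "&lt;" rather than to "<".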
CString sxml;
sxml = _T("<XML>\
<TAG attr='&lt;\\'asdf\\\"&gt;'>asdf</TAG>\
</XML>");
XNode xml;
PARSEINFO pi;
pi.escape_value = '\\';
xml.Load( sxml, &pi );
AfxMessageBox( xml.GetXML() );
Result:
<XML>
<TAG attr='&lt;&apos;asdf&quot;&gt;' >asdf</TAG>
</XML>
-
Configuring Parse and Display
XMLite can trim values when parsing, and it adds newlines to the displayed output by default.
CString sxml;
sxml = _T("<XML>\
<TAG attr=' qwer '> asdf </TAG>\
</XML>");
XNode xml;
xml.Load( sxml );
AfxMessageBox( xml.GetXML() );
PARSEINFO pi;
pi.trim_value = true;
xml.Load( sxml, &pi );
AfxMessageBox( xml.GetXML() );
DISP_OPT opt;
opt.newline = false;
AfxMessageBox( xml.GetXML( &opt ) );
Result:
first,
<XML>
<TAG attr=' qwer ' > asdf </TAG>
</XML>
after,
<XML><TAG attr='qwer' >asdf</TAG></XML>
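The effect of trimming on a single value can be sketched in standard C++. This is only an illustration of what trimming means for a value such as ' qwer '; trim_value here is a hypothetical free function, not the PARSEINFO field of the same name.

```cpp
#include <cassert>
#include <string>

// Strip leading and trailing whitespace (space, tab, CR, LF) from a value.
std::string trim_value(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return std::string();  // only whitespace
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}
```

Whitespace inside the value is kept; only the two ends are stripped.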
-
Custom entity table
XMLite lets you customize the entity table for special parsing and display. You can define a new entity table for customized parsing.
CString sxml;
sxml = _T("<XML>\
<TAG attr='&asdf>'></TAG>\
</XML>");
static const XENTITY entity_table[] = {
{ '<', _T("&lt;"), 4 } ,
{ '&', _T("&amp;"), 5 }
};
XENTITYS entitys( (LPXENTITY)entity_table, 2 ) ;
PARSEINFO pi;
XNode xml;
pi.entity_value = true;
pi.entitys = &entitys;
xml.Load( sxml, &pi );
AfxMessageBox( xml.GetXML() );
DISP_OPT opt;
opt.entitys = &entitys;
opt.reference_value = true;
AfxMessageBox( xml.GetXML( &opt ) );
-
Branch copy (deep copy)
XMLite can now copy a branch.
void CTestXMLiteDlg::OnButton9()
{
CString sxml;
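sxml = _T("<TAddressBook description='book'>\
<TPerson type='me'><NAME>Cho,Kyung Min</NAME><NICK>bro</NICK></TPerson>\
<TPerson type='friend'><NAME>Baik,Ji Hoon</NAME><NICK>bjh</NICK></TPerson>\
<TPerson type='friend'><NAME>Bak,Gun Joo</NAME><NICK>dichter</NICK></TPerson>\
<TInformation count='3'/>\
</TAddressBook>");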
XNode xml;
xml.Load( sxml );
AfxMessageBox( xml.GetXML() );
XNode xml2;
xml2.CopyNode( &xml );
AfxMessageBox( xml2.GetXML() );
XNode xml3;
xml3.CopyBranch( &xml );
AfxMessageBox( xml3.GetXML() );
XNode xml4;
xml4.AppendChildBranch( &xml );
AfxMessageBox( xml4.GetXML() );
}
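The difference between copying a single node and copying a whole branch can be sketched as follows. This assumes, as the names suggest, that a node copy takes only the node's own data while a branch copy also duplicates every descendant; BranchNode, copy_node and copy_branch are hypothetical names, not XMLite's implementation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal tree node for illustration.
struct BranchNode {
    std::string name;
    std::string value;
    std::vector<std::unique_ptr<BranchNode>> children;
};

// Copy only the node's own data; the children are not copied.
std::unique_ptr<BranchNode> copy_node(const BranchNode& src) {
    std::unique_ptr<BranchNode> dst(new BranchNode);
    dst->name = src.name;
    dst->value = src.value;
    return dst;
}

// Deep copy: copy the node, then recursively copy every descendant.
std::unique_ptr<BranchNode> copy_branch(const BranchNode& src) {
    std::unique_ptr<BranchNode> dst = copy_node(src);
    for (const std::unique_ptr<BranchNode>& child : src.children)
        dst->children.push_back(copy_branch(*child));
    return dst;
}
```

Because the deep copy owns its own children, later changes to the source tree do not affect it.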
-
Parsing XML with PI/CDATA/Comment
XMLite can now parse processing instructions (PI), CDATA sections, and comments. However, XMLite still does not support the PI's encoding function.
void CTestXMLiteDlg::OnButton10()
{
CString sxml;
sxml = _T("<?xml version='1.0'?>\
\
<![CDATA[some data]]>\
\
<![CDATA[some data]]>\
value\
<![CDATA[some data2]]>\
<!-- comment2-->");
XNode xml;
xml.Load( sxml );
AfxMessageBox( xml.GetXML() );
}
-
Parsing non-well-formed XML (like HTML)
XMLite can now parse non-well-formed XML such as HTML when the 'force_parse' setting is enabled.
void CTestXMLiteDlg::OnButton11()
{
CString sXML = "\
< html>\
< body width='100'>\
Some times I got say...\
\
\
< p>\
Thanks\
< /body>\
< /html>";
XDoc xml;
PARSEINFO pi;
pi.force_parse = true;
if( xml.Load( sXML, &pi ) )
{
LPXNode root = xml.GetRoot();
AfxMessageBox( xml.GetXML() );
}
XNode node;
if( node.Load( sXML ) )
{
AfxMessageBox( node.GetXML() );
}
}
-
Deep-Find
An XMLite node can search all of its descendants by tag name.
CString sXML = "\
\
\
<C/>\
<D/>\
\
";
XNode node;
if( node.Load( sXML ) )
{
AfxMessageBox( node.GetXML() );
LPXNode found = NULL;
found = node.Find( _T("D") );
if( found )
{
AfxMessageBox( found->GetXML() );
}
}
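A deep find of this kind is a depth-first search over the children. The sketch below is a standalone illustration, not XMLite's Find: FindNode and find_deep are hypothetical names, and the search covers descendants only, not the starting node itself.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical minimal tree node for illustration.
struct FindNode {
    std::string name;
    std::vector<FindNode> children;
};

// Depth-first search: return the first descendant with the given tag name,
// or nullptr when no descendant matches.
const FindNode* find_deep(const FindNode& node, const std::string& name) {
    for (const FindNode& child : node.children) {
        if (child.name == name) return &child;
        if (const FindNode* hit = find_deep(child, name)) return hit;
    }
    return nullptr;
}
```

As with the XMLite call above, the caller has to check the result for a null pointer before using it.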
History
License
I sometimes get email about XMLite's license. You may use, modify, and redistribute XMLite for commercial and noncommercial purposes. Please send me a thank-you email with information about your project; I will be happy to add it to the XMLite references. If you fix or update XMLite, please send me the changes so that everyone can have them. Thanks.
Reference
"We are currently using XMLite in our MavBridge product, which converts scanner data files between various formats (many of which are XML). The code has worked very well, and has been very easy to use." MAVRO IMAGING LLC www.MavroImaging.com
They sent me a thank-you email. I hope XMLite helps with their work. I will add them to the reference section.
Thank you all. No one know oneself all. My idea from you, book, all others. already we are shared brains.
You can download it from a URL as shown below. (This is the MFC version, but you can easily find a Win32 or socket version by googling.) Just modify the function to save into a string instead.
BOOL SaveImage(LPCTSTR szImgPath)
{
CInternetSession is;
CInternetFile* pif = NULL;
HANDLE hFile = NULL;
// Open the URL for binary transfer.
pif = (CInternetFile*)is.OpenURL(szImgPath, 1, INTERNET_FLAG_TRANSFER_BINARY);
if ( pif == NULL ) return FALSE;
// Create the local file that receives the data.
CHAR szLocalFile[255];
wsprintf(szLocalFile, "C:\\%s", (LPCTSTR)pif->GetFileName());
hFile = CreateFile(szLocalFile, GENERIC_WRITE, NULL, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
if ( hFile == INVALID_HANDLE_VALUE )
{
pif->Close();
is.Close();
return FALSE;
}
// Copy the data in 255-byte chunks until nothing is left.
while(1)
{
INT nRead = 0;
DWORD dwWritten = 0;
CHAR szBuffer[255];
nRead = pif->Read(szBuffer, 255);
if ( nRead == 0 ) break;
WriteFile(hFile, szBuffer, nRead, &dwWritten, NULL);
}
CloseHandle(hFile);
pif->Close();
is.Close();
return TRUE;
}
How can I read a specific file with XMLite? I am new to C++ and I have to do this as soon as possible! Please help.
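You can simply read the file into a string variable and load that XML string into XMLite.
FILE* fp = fopen("some.xml", "rt");
fseek(fp, 0, SEEK_END); // move to the end to measure the file
long filesize = ftell(fp);
fseek(fp, 0, SEEK_SET); // back to the start before reading
char* sXml = new char[filesize + 1];
size_t nRead = fread(sXml, sizeof(char), filesize, fp);
sXml[nRead] = '\0'; // terminate the string
fclose(fp);
Then you have the XML string and can load it into XMLite.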
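The same idea, reading a whole file into a string before handing it to the parser, can also be sketched with standard C++ streams. read_text_file is a hypothetical helper name; the resulting std::string would still need converting to whatever string type the parser expects.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Read a whole file into 'out'. Returns false if the file cannot be opened.
bool read_text_file(const std::string& path, std::string& out) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) return false;
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the entire stream in one step
    out = buffer.str();
    return true;
}
```

This avoids a fixed-size buffer, so the file length does not have to be known in advance.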
Hello, I would like to know the memory constraint of this parser on mobile platform (planning to use for WM). The avg size of the XML file to be fed to the parser would be 50KB.
Any suggestions or inputs would be highly appreciated.
Thnkz, Zoom
You might be better off with CMarkup. We found it much quicker at loading a large XML file, because it only points to the elements and attributes in the file rather than copying them, so it uses little memory beyond the size of the file itself.
www.firstobject.com/xmleasy.htm
I just wondered why XMLite implements CDATA the way it does (as a child of a child). Is this a normal implementation?
I have the XML line "<Parameter><![CDATA[Window 1]]></Parameter>"
and in order to get the value of the Parameter, I have to do the following:
pAttributeChild = GetChild("Parameter");
strValue = pAttributeChild->GetChildValue("#CDATA");
I thought I would be able to do:
strValue = pAttributeChild->GetChildValue("Parameter");
as I do for all values that are not encapsulated in the CDATA.
Thanks Ian
Yes, I gave that node the name "#CDATA". You can see this in the LoadCDATA function.
LPTSTR xml = (LPTSTR)pszXml;
xml += sizeof(szXMLCDATAOpen)-1;
LPXNode node = new XNode;
node->parent = this;
node->doc = doc;
node->type = XNODE_CDATA;
node->name = _T("#CDATA");
_SetString( xml, end, &node->value, FALSE );
par->childs.push_back( node );
If you want, modify that code for your purpose. If I have made a mistake or deviated from the standard, please fix it and share the fix. Thanks.
But I wonder: if the 'Parameter' node has both CDATA and value text, like below:
"<Parameter>Value of Parameter<![CDATA[Window 1]]></Parameter>"
then which one (the "Value of Parameter" text or the CDATA content) should be selected as Parameter's value?
XNodes::iterator _tagXMLNode::GetChildIterator( LPXNode node )
{
XNodes::iterator it = childs.begin();
for( ; it != childs.end() ; ++(it) )
{
if( *it == node )
return it;
}
return NULL; // <-------------------------- not XNodes::iterator
}
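One common way to address the issue flagged in that comment is to return the container's end() iterator as the "not found" value. The sketch below uses hypothetical stand-in types (IterNode, IterNodes, find_child_iterator) with a std::vector of pointers; it is not the fix that was actually applied to XMLite.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for XNode and XNodes.
struct IterNode { int id; };
typedef std::vector<IterNode*> IterNodes;

// Return an iterator to 'node' within 'childs', or childs.end() if it is absent.
IterNodes::iterator find_child_iterator(IterNodes& childs, IterNode* node) {
    return std::find(childs.begin(), childs.end(), node);
}
```

Callers then compare the result against childs.end() instead of NULL before dereferencing or erasing.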
You can find that fix in another reply.
Good work. However, I noticed that this code is not standard-compliant. XMLite.cpp gives me an error at line 330: 'i': undeclared identifier. You have the following code:
for( int i = 0 ; i < childs.size(); i ++)
{
...
}
childs.clear();
for( i = 0 ; i < attrs.size(); i ++) // <----- i is not declared!
{
...
}
Compiling it with VC8 gives me 7 errors and 17 warnings.
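Under standard C++ scoping, a variable declared in a for statement goes out of scope when that loop ends, so each loop needs its own declaration. A minimal sketch of the conforming pattern, with total_length as a hypothetical function and std::size_t chosen as the index type to match size():

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Add up the lengths of all strings in both containers.
// Each loop declares its own index, so the code does not rely on
// the old compiler behavior of leaking 'i' out of the first loop.
std::size_t total_length(const std::vector<std::string>& childs,
                         const std::vector<std::string>& attrs) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < childs.size(); i++)
        total += childs[i].size();
    for (std::size_t i = 0; i < attrs.size(); i++)
        total += attrs[i].size();
    return total;
}
```

Using an unsigned index also avoids signed/unsigned comparison warnings against size().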
Regards.
Hope is the negation of reality - Raistlin Majere
You can download this http://download.svetla.org/download/XMLite.ZIP for your project. I'm using it with Unicode too. I recommend substituting ->name for ->GetName().
Regards. Jaroslav Nusl www.svetla.org
My project is built with Unicode (VC2005). When I compile this excellent code I get some errors because of that (for example, from the use of strchr). Does anyone know where to get a Unicode version?
Thanks, Yossi.
Hi Yossi
I got it working with Unicode on VC2005. I had to recreate the project as a Unicode project and make some corrections. The most important types of change are listed here:
* '<' -> _T('<'), " ?>" -> _T(" ?>"), etc. in all similar cases
* sizeof(szXMLCDATAOpen) -> sizeof(szXMLCDATAOpen)/sizeof(TCHAR), and similar
* memcpy( pss, psz, len); ->
#ifdef _UNICODE
memcpy( pss, psz, 2*len);
#else
memcpy( pss, psz, len);
#endif
* if( strchr( pszchs, *psz ) ) return (LPTSTR)psz; ->
#ifdef _UNICODE
if( wcschr( pszchs, *psz ) ) return (LPTSTR)psz;
#else
if( strchr( pszchs, *psz ) ) return (LPTSTR)psz;
#endif
* std::ostringstream os; ->
#ifdef _UNICODE
std::wostringstream os;
#else
std::ostringstream os;
#endif
There were also many changes to address warnings. If you like, I could send you my version or put it in some public place; please let me know.
AlexandreN
If you would care to share your Unicode version that'd be great ! Judging by the number of errors I get it sounds quite a big task, especially with regard to std::string which appears to be 'char' based regardless of whether you are compiling UNICODE or not (or have I missed something) ???
TTFN, Jon
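On the std::string point: std::string is always char-based, but it is a typedef of std::basic_string<char>, and the same template can be instantiated for wchar_t (std::wstring). One way to avoid per-call #ifdef blocks is to write helpers as templates over the character type. The sketch below is an assumption-laden illustration, not part of any posted Unicode port; find_any and wrap_tag are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Return a pointer to the first character of 'text' that appears in 'chars',
// or nullptr if none does. Works for both char and wchar_t.
template <typename CharT>
const CharT* find_any(const CharT* text, const std::basic_string<CharT>& chars) {
    for (; *text; ++text)
        if (chars.find(*text) != std::basic_string<CharT>::npos) return text;
    return nullptr;
}

// Build "<name>" with a stream that matches the character type.
template <typename CharT>
std::basic_string<CharT> wrap_tag(const std::basic_string<CharT>& name) {
    std::basic_ostringstream<CharT> os;
    os << CharT('<') << name << CharT('>');
    return os.str();
}
```

In a TCHAR-based project, a typedef such as std::basic_string<TCHAR> would select the right instantiation for either build.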
Hi Alexandre,
I'm not sure of the best way to share. If you get stuck I can temporarily host them on my website so others can download them, at least until a better solution is found.
TTFN, Jon
PS : I tried to reply to your email address but it bounced ....
Love xmlite so far but ran into a problem today trying to parse the document.xml file found in the .docx zip file (Word2007 document file). In the .xml file, there are 12 tags with the name "w:p". After parsing, xmlite shows only one w:p tag. I am using the xmlite download from Creative Commons.
I am a complete novice at xml and xmlite, so I may be committing some obviously stupid mistake. However, everything I have parsed prior to this file has acted as expected and thus I don't know why this doesn't act the same. Any ideas?
I have attached the document.xml file below.
thanks for any help you might provide.
="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml"> w:rsidR="001671DB" w:rsidRDefault="00265A14" w:rsidP="00265A14">Heading 1> w:rsidR="00265A14" w:rsidRDefault="00265A14" w:rsidP="00265A14">Heading 1 normal text> w:rsidR="00265A14" w:rsidRDefault="00265A14" w:rsidP="00265A14">Bold Text> w:rsidR="00265A14" w:rsidRDefault="00265A14" w:rsidP="00265A14">Italics Text> w:rsidR="00265A14" w:rsidRDefault="00265A14" w:rsidP="00265A14">Bold Italics Text> w:rsidR="00A424D9" w:rsidRPr="00A424D9" w:rsidRDefault="00A424D9" w:rsidP="00265A14">This is a long run of normal text to show how multiple lines are treated in a single run of really long skinny tall short fat light long text.> w:rsidR="00265A14" w:rsidRDefault="00265A14" w:rsidP="00265A14">Heading2> w:rsidR="00265A14" w:rsidRDefault="00265A14" w:rsidP="00265A14">Heading2 normal text> w:rsidR="00265A14" w:rsidRDefault="00265A14" w:rsidP="00265A14">Heading 3> w:rsidR="00265A14" w:rsidRDefault="00265A14" w:rsidP="00265A14">Heading 3 normal text> w:rsidR="00265A14" w:rsidRDefault="00265A14" w:rsidP="00265A14"/> w:rsidR="00265A14" w:rsidRPr="00265A14" w:rsidRDefault="00265A14" w:rsidP="00265A14"/>
This is just what I was looking for! It's a great little library and now it compiles cleanly in VS2005!
Would it be possible to update the article with the latest source code from Creative Commons?
I've given you a 5!