License: The Code Project Open License (CPOL)
XMLite: simple XML parser
By Cho, Kyung-min
Easy to access and simple XML parser
C++/CLI, VC6, VC7.NET 1.0, Win2K, WinXP, PocketPC 2002, MFC, STL, Dev

In a previous project I needed a simple XML parser. I was working with a Jabber server and, being short on time, used a Jabber client library called Jabbercom, a Win32 COM-based DLL module. The Jabber protocol is based on XML, but the XML parser in that library was not complete enough for my project.
First, it could not handle Korean text (and probably other languages), and it had no escape-character processing and no entity encoding/decoding support. I had to replace the XML parser engine, but I could not use MSXML or Expat: they either needed a heavy installation or were hard to use. So I decided to write XMLite. It is not a full-featured XML parser, but it is simple and small, so I hope it helps someone.
XMLite has two main data structures, XNode and XAttr. XNode represents an XML element node and XAttr represents an XML attribute. An XNode holds its child XNodes and its own attribute list, XAttrs. The source code is simple, so it should be easy to read and use.
XMLite can parse the plain text of a single XML element, as shown below. To check for a parsing error, test the return value: if XNode::Load returns NULL, the XML text contains an error. For more detailed error information, see the error handling section (section 4).
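The relationship between the two structures can be pictured with a minimal sketch in standard C++. This is not XMLite's actual declaration: the names `Attr`, `Node`, `appendChild` and `attrValue` are hypothetical, and `std::string` stands in for MFC's `CString`.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for XAttr: one name/value pair.
struct Attr {
    std::string name;
    std::string value;
};

// Hypothetical stand-in for XNode: an element that owns its attributes and children.
struct Node {
    std::string name;
    std::string value;                            // text content of the element
    std::vector<Attr> attrs;                      // the node's own attribute list
    std::vector<std::unique_ptr<Node>> children;  // child element nodes

    // Create a child element, append it, and return a pointer to it.
    Node* appendChild(const std::string& childName, const std::string& childValue = "") {
        children.push_back(std::make_unique<Node>());
        Node* child = children.back().get();
        child->name = childName;
        child->value = childValue;
        return child;
    }

    // Return the value of the named attribute, or nullptr when it is absent.
    const std::string* attrValue(const std::string& attrName) const {
        for (const Attr& a : attrs)
            if (a.name == attrName) return &a.value;
        return nullptr;
    }
};
```

Owning the children through `std::unique_ptr` is a choice made for this sketch only; it keeps the tree free of manual `delete` calls.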
    CString sxml;
    sxml = _T("\
    <TAddress desc='book of bro'>\
    <TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
    <TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
    <TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
    <TInformation count='3'/>\
    </TAddress>");

    XNode xml;
    if( xml.Load( sxml ) )
        AfxMessageBox(xml.GetXML());
    else
        AfxMessageBox(_T("error")); // simple parsing error check
The result is shown in the picture above.
    CString sxml;
    sxml = _T("\
    <TAddressBook description=\"book of bro\">\
    <TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
    <TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
    <TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
    <TInformation count='3'/>\
    </TAddressBook>");

    XNode xml;
    xml.Load( sxml );

    int i;
    XNodes childs;

    // Traversing the children in the DOM tree
    // method 1: using GetChildCount() and GetChild()
    // Result: TPerson, TPerson, TPerson, TInformation
    LPXNode child;
    for( i = 0 ; i < xml.GetChildCount(); i++)
    {
        child = xml.GetChild(i);
        AfxMessageBox( child->GetXML() );
    }

    // method 2: LPXNodes and GetChilds() ( same result as method 1 )
    // Result: TPerson, TPerson, TPerson, TInformation
    childs = xml.GetChilds();
    for( i = 0 ; i < (int)childs.size(); i++)
        AfxMessageBox( childs[i]->GetXML() );

    // method 3: selected children with GetChilds()
    // Result: TPerson, TPerson, TPerson
    childs = xml.GetChilds(_T("TPerson") );
    for( i = 0 ; i < (int)childs.size(); i++)
    {
        AfxMessageBox( childs[i]->GetXML() );
    }

    // method 4: get the attribute value of a child
    // Result: 3
    AfxMessageBox( xml.GetChildAttrValue( _T("TInformation"), _T("count") ) );
    int count = XStr2Int( xml.GetChildAttrValue( _T("TInformation"), _T("count") ));
    ASSERT( count == 3 );
    CString sxml;
    sxml = _T("\
    <TAddressBook description=\"book of bro\">\
    <TPerson type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick></TPerson>\
    <TPerson type='friend'><Name>Baik,Ji Hoon</Name><Nick>bjh</Nick></TPerson>\
    <TPerson type=friend><Name>Bak,Gun Joo</Name><Nick>dichter</Nick></TPerson>\
    <TInformation count='3'/>\
    </TAddressBook>");

    XNode xml;
    xml.Load( sxml );

    // remove 'bro node'
    LPXNode child_bro = xml.GetChild(0);
    xml.RemoveChild( child_bro );

    AfxMessageBox(xml.GetXML());
Result: the 'bro' node is gone.
<TAddressBook description='book of bro' >
<TPerson type='friend' >
<Name>Baik,Ji Hoon</Name>
<Nick>bjh</Nick>
</TPerson>
<TPerson type='friend' >
<Name>Bak,Gun Joo</Name>
<Nick>dichter</Nick>
</TPerson>
<TInformation count='3' />
</TAddressBook>
XMLite has XML error handling, but it is not complete.
    CString serror_xml;
    serror_xml = _T("<XML>\
    <NoCloseTag type='me'><Name>Cho,Kyung Min</Name><Nick>bro</Nick>\
    </XML>");

    XNode xml;
    PARSEINFO pi;
    xml.Load( serror_xml, &pi );

    if( pi.erorr_occur ) // is error_occur?
    {
        //result: '<NoCloseTag> ... </XML>' is not wel-formed.
        AfxMessageBox( pi.error_string );
        AfxMessageBox( xml.GetXML() );
    }
    else
        ASSERT(FALSE);

The result is:
'<NoCloseTag> ... </XML>' is not wel-formed.
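The pattern used here, an out-parameter that carries an error flag and a message, can be sketched in standard C++ with a small tag-balance check. This is only an illustration and not XMLite's parser: `ParseResult` and `checkBalanced` are hypothetical names, and the check ignores comments, CDATA sections, processing instructions, and `>` characters inside quoted attribute values.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical counterpart of the error fields in PARSEINFO.
struct ParseResult {
    bool errorOccurred = false;
    std::string errorString;
};

// Checks only that open and close tags are balanced.
ParseResult checkBalanced(const std::string& xml) {
    ParseResult result;
    std::vector<std::string> open;   // stack of currently open tag names
    std::size_t pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        std::size_t end = xml.find('>', pos);
        if (end == std::string::npos) {
            result.errorOccurred = true;
            result.errorString = "tag is not closed with '>'";
            return result;
        }
        std::string tag = xml.substr(pos + 1, end - pos - 1);  // text between < and >
        pos = end + 1;
        if (tag.empty()) continue;
        if (tag[0] == '/') {                                   // closing tag
            std::string name = tag.substr(1);
            if (open.empty() || open.back() != name) {
                result.errorOccurred = true;
                result.errorString = open.empty()
                    ? "</" + name + "> has no matching open tag"
                    : "<" + open.back() + "> ... </" + name + "> is not well-formed";
                return result;
            }
            open.pop_back();
        } else if (tag.back() != '/') {                        // open tag, not self-closing
            // The tag name ends at the first whitespace; attributes are dropped.
            open.push_back(tag.substr(0, tag.find_first_of(" \t\r\n")));
        }
    }
    if (!open.empty()) {
        result.errorOccurred = true;
        result.errorString = "<" + open.back() + "> is never closed";
    }
    return result;
}
```

Returning the result as a struct instead of filling a pointer argument is a simplification for the sketch; the caller still inspects a flag first and reads the message only when the flag is set.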
XMLite supports escape processing. In this updated version escaping is off by default, but you can still set an escape character for parsed values:
    pi.escape_value = '\\';
In the line above, the escape character is '\', as in C/C++. XMLite also has entity processing. The entity table is shown below:
| Special character | Special meaning | Entity encoding |
|---|---|---|
| < | Begins a tag. | &lt; |
| > | Ends a tag. | &gt; |
| " | Quotation mark. | &quot; |
| ' | Apostrophe. | &apos; |
| & | Ampersand. | &amp; |
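A table-driven encoder and decoder for these five entities might look like the sketch below. It is written in standard C++ and is not taken from XMLite: `Entity`, `defaultEntities`, `encodeEntities` and `decodeEntities` are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One row of the entity table: the raw character and its reference.
struct Entity {
    char ch;
    std::string ref;
};

// The five predefined XML entities.
const std::vector<Entity>& defaultEntities() {
    static const std::vector<Entity> table = {
        {'&', "&amp;"}, {'<', "&lt;"}, {'>', "&gt;"}, {'"', "&quot;"}, {'\'', "&apos;"}};
    return table;
}

// Replace each character found in the table with its entity reference.
std::string encodeEntities(const std::string& text,
                           const std::vector<Entity>& table = defaultEntities()) {
    std::string out;
    for (char c : text) {
        const std::string* ref = nullptr;
        for (const Entity& e : table)
            if (e.ch == c) { ref = &e.ref; break; }
        if (ref) out += *ref; else out += c;
    }
    return out;
}

// Replace each known entity reference with its character in a single pass,
// so "&amp;lt;" decodes to "&lt;" and not to "<".
std::string decodeEntities(const std::string& text,
                           const std::vector<Entity>& table = defaultEntities()) {
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        bool matched = false;
        if (text[i] == '&') {
            for (const Entity& e : table) {
                if (text.compare(i, e.ref.size(), e.ref) == 0) {
                    out += e.ch;
                    i += e.ref.size();
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out += text[i++];   // unknown text is copied unchanged
    }
    return out;
}
```

Because the table is a parameter, passing a shorter table restricts which characters are encoded or decoded.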
    CString sxml;
    sxml = _T("<XML>\
    <TAG attr='<\\'asdf\\\">'>asdf</TAG>\
    </XML>");

    XNode xml;
    PARSEINFO pi;
    pi.escape_value = '\\'; // using escape character on value string
    xml.Load( sxml, &pi );

    AfxMessageBox( xml.GetXML() );
Result:
<XML>
<TAG attr='<'asdf">' >asdf</TAG>
</XML>
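How an escape character affects the reading of a quoted attribute value can be sketched as follows. The two helpers, `findClosingQuote` and `unescapeValue`, are hypothetical and written in standard C++ for illustration; they are not XMLite functions. An escape value of `'\0'` is taken to mean that escaping is off, which is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>

// Find the closing quote of a value that starts at `start`, ignoring any
// character that directly follows the escape character.
std::size_t findClosingQuote(const std::string& text, std::size_t start,
                             char quote, char escape) {
    for (std::size_t i = start; i < text.size(); ++i) {
        if (escape != '\0' && text[i] == escape) { ++i; continue; }  // skip escaped char
        if (text[i] == quote) return i;
    }
    return std::string::npos;
}

// Copy `value`, dropping each escape character and keeping the character after
// it verbatim. A trailing escape with nothing after it is kept as-is.
std::string unescapeValue(const std::string& value, char escape) {
    if (escape == '\0') return value;
    std::string out;
    for (std::size_t i = 0; i < value.size(); ++i) {
        if (value[i] == escape && i + 1 < value.size())
            ++i;                      // skip the escape, keep the next character
        out += value[i];
    }
    return out;
}
```

With the escape character set to a backslash, the quote inside `'a\'b'` no longer ends the value; with escaping off, the value ends at the first quote.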
XMLite can trim values while parsing, and by default it adds newlines when displaying the XML.
    CString sxml;
    sxml = _T("<XML>\
    <TAG attr=' qwer '> asdf </TAG>\
    </XML>");

    XNode xml;
    xml.Load( sxml );
    AfxMessageBox( xml.GetXML() );

    PARSEINFO pi;
    pi.trim_value = true; // trim value
    xml.Load( sxml, &pi );
    AfxMessageBox( xml.GetXML() );

    DISP_OPT opt;
    opt.newline = false; // no new line
    AfxMessageBox( xml.GetXML( &opt ) );
Result:
First message box (no trimming):
<XML>
<TAG attr=' qwer ' > asdf </TAG>
</XML>
Last message box (trimmed, no newlines):
<XML><TAG attr='qwer' >asdf</TAG></XML>
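The trimming step itself is small. Below is a minimal sketch in standard C++, assuming that trimming means removing leading and trailing spaces, tabs, carriage returns and line feeds; `ParseOptions`, `trim` and `storeValue` are hypothetical names and do not come from XMLite.

```cpp
#include <cstddef>
#include <string>

// Hypothetical counterpart of PARSEINFO::trim_value.
struct ParseOptions {
    bool trimValue = false;
};

// Remove leading and trailing whitespace; inner whitespace is untouched.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";   // empty or all whitespace
    std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Apply the option at the point where a parsed value is stored.
std::string storeValue(const std::string& raw, const ParseOptions& opts) {
    return opts.trimValue ? trim(raw) : raw;
}
```

Keeping the option off by default mirrors the behavior shown above, where the first load preserves the surrounding spaces.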
XMLite lets you customize the entity table used for parsing and display. You can define a new entity table as shown below.
    CString sxml;
    sxml = _T("<XML>\
    <TAG attr='&asdf>'></TAG>\
    </XML>");

    // customized entity list
    static const XENTITY entity_table[] = {
        { '<', _T("&lt;"), 4 } ,
        { '&', _T("&amp;"), 5 }
    };
    XENTITYS entitys( (LPXENTITY)entity_table, 2 ) ;

    PARSEINFO pi;
    XNode xml;
    pi.entity_value = true; // force to use custom entitys
    pi.entitys = &entitys;
    xml.Load( sxml, &pi );
    AfxMessageBox( xml.GetXML() );

    DISP_OPT opt;
    opt.entitys = &entitys;
    opt.reference_value = true; // force to use custom entitys
    AfxMessageBox( xml.GetXML( &opt ) );
XMLite can now copy a node or a whole branch.
    void CTestXMLiteDlg::OnButton9()
    {
        // TODO: Add your control notification handler code here
        CString sxml;
        sxml = _T("\
    <TAddressBook description=\"book\">\
    <TPerson type=\"me\"><NAME>Cho,Kyung Min</NAME><NICK>bro</NICK></TPerson>\
    <TPerson type=\"friend\"><NAME>Baik,Ji Hoon</NAME><NICK>bjh</NICK></TPerson>\
    <TPerson type=\"friend\"><NAME>Bak,Gun Joo</NAME><NICK>dichter</NICK></TPerson>\
    <TInformation count=\"3\" />\
    </TAddressBook>");

        XNode xml;
        xml.Load( sxml );
        AfxMessageBox( xml.GetXML() );

        // copy one level node with its own attributes
        XNode xml2;
        xml2.CopyNode( &xml );
        AfxMessageBox( xml2.GetXML() );

        // copy branch of other node (deep-copy)
        XNode xml3; // same as xml3 = xml;
        xml3.CopyBranch( &xml );
        AfxMessageBox( xml3.GetXML() );

        // append copied-branch of other node as my child
        XNode xml4; // same as xml3.CopyBranch( &xml );
        xml4.AppendChildBranch( &xml );
        AfxMessageBox( xml4.GetXML() );
    }
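The difference between the three copy operations can be shown on a small tree type in standard C++. The sketch assumes, based on the comments in the code above, that CopyNode copies one level and CopyBranch copies recursively; `Element` and the free functions below are hypothetical and only mirror the names used by XMLite.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type used only for this sketch.
struct Element {
    std::string name;
    std::vector<std::pair<std::string, std::string>> attrs;
    std::vector<std::unique_ptr<Element>> children;
};

// Copy only the node itself and its attributes (one level, no children).
std::unique_ptr<Element> copyNode(const Element& src) {
    auto dst = std::make_unique<Element>();
    dst->name = src.name;
    dst->attrs = src.attrs;
    return dst;
}

// Copy the node and, recursively, every descendant (deep copy).
std::unique_ptr<Element> copyBranch(const Element& src) {
    auto dst = copyNode(src);
    for (const auto& child : src.children)
        dst->children.push_back(copyBranch(*child));
    return dst;
}

// Append a deep copy of `src` as a new child of `parent` and return it.
Element* appendChildBranch(Element& parent, const Element& src) {
    parent.children.push_back(copyBranch(src));
    return parent.children.back().get();
}
```

Since every copy owns its own nodes, changing or destroying the copy leaves the source tree untouched.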
XMLite can now parse processing instructions (PI), CDATA sections and comments. However, it still does not support the encoding declared in a PI.
    void CTestXMLiteDlg::OnButton10()
    {
        // TODO: Add your control notification handler code here
        CString sxml;
        sxml = _T("<?xml version='1.0'?>\
    \
    <![CDATA[some data]]>\
    \
    <![CDATA[some data]]>\
    value\
    <![CDATA[some data2]]>\
    <!-- comment2-->");

        XNode xml;
        xml.Load( sxml );
        AfxMessageBox( xml.GetXML() );
    }
XMLite can now parse XML that is not well-formed, such as HTML, when the force_parse option is set.
    void CTestXMLiteDlg::OnButton11()
    {
        // TODO: Add your control notification handler code here
        CString sXML = "\
    <html>\
    <body width='100'>\
    Some times I got say...\
    \
    \
    <p>\
    Thanks\
    </body>\
    </html>";

        XDoc xml;
        PARSEINFO pi;
        pi.force_parse = true;
        if( xml.Load( sXML, &pi ) )
        {
            LPXNode root = xml.GetRoot();
            //root->AppendChild( _T("child"), _T("value") );
            AfxMessageBox( xml.GetXML() );
        }

        // without force_parse, XML that is not well-formed cannot be parsed
        XNode node;
        if( node.Load( sXML ) )
        {
            AfxMessageBox( node.GetXML() );
        }
    }
An XMLite node can search through all of its child nodes by tag name.
    CString sXML = "\
    \
    \
    <C/>\
    <D/>\
    \
    ";

    XNode node;
    if( node.Load( sXML ) )
    {
        AfxMessageBox( node.GetXML() );
        LPXNode found = NULL;
        found = node.Find( _T("D") );
        if( found )
        {
            AfxMessageBox( found->GetXML() );
        }
    }
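A search of this kind is commonly written as a recursive depth-first walk. The sketch below, in standard C++, assumes that the search covers nested children and does not test the starting node itself; both points are assumptions about behavior, and `TreeNode` and `findFirst` are hypothetical names rather than XMLite's implementation.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical tree node used only for this sketch.
struct TreeNode {
    std::string name;
    std::vector<std::unique_ptr<TreeNode>> children;
};

// Return the first descendant named `name` in depth-first, document order,
// or nullptr when there is none. The starting node itself is not tested.
const TreeNode* findFirst(const TreeNode& node, const std::string& name) {
    for (const auto& child : node.children) {
        if (child->name == name) return child.get();
        if (const TreeNode* hit = findFirst(*child, name)) return hit;
    }
    return nullptr;
}
```

Returning a null pointer for "not found" matches the way the example above tests the result of Find before using it.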
    // XMLite : XML Lite Parser Library
    // by bro ( Cho,Kyung Min: bro@shinbiro.com ) 2002-10-30
    // History.
    // 2002-10-29 : First Coded. Parsing XMLElement and Attributes.
    //              get xml parsed string ( looks good )
    // 2002-10-30 : Get Node Functions, error handling ( not completed )
    // 2002-12-06 : Helper Function string to long
    // 2002-12-12 : Entity Helper Support
    // 2003-04-08 : Close,
    // 2003-07-23 : add property escape_value. (now no escape on default)
    // 2003-10-24 : bugfix) attribute parsing <TAG \r\n a="1" /> is now ok
    // 2004-03-05 : add branch copy functions
I sometimes get email about XMLite's license. You may use, modify and redistribute XMLite for commercial or noncommercial purposes. I only ask that you send me a thank-you email with some information about your project.
I will be happy to hear about it and will add it to XMLite's list of references. If you fix or update XMLite, please send me the changes so that everyone can have them. Thanks.
Last Updated: 11 Oct 2004. Editor: Sean Ewington
Copyright 2002 by Cho, Kyung-min. Everything else Copyright © CodeProject, 1999-2009.