This article introduces a universal way to get report data from XML documents.
( Chinese version: http://www.cnblogs.com/xdesigner/archive/2006/08/31/490943.html )
With the growth of browser/server (B/S) systems and the spread of XML technology, more and more data is packaged as XML for storage and publication. Most of this data originates in a database but has already been processed, so it is more compact and closer to what the application needs. If a report tool can use this XML data directly, it can issue fewer database queries and do less computation, because the creator of the XML has already done that work. That is why I call XML a new continent for report data sources.
Older report tools can mostly read data only from an RDBMS and cannot handle other kinds of data. Over time some tools gained the ability to handle XML documents, but the developer has to write code and then create and configure extension components. The report system becomes complex, with too many interfaces and many conventions to follow, and if there are many XML formats the developer has to create many kinds of extension components, which is a lot of work.
Suppose instead there were a universal method for handling XML documents, one that could collect report data from any kind of XML document. You would define report templates and tell the report engine how to read data from the XML document, with no need to develop extension components for all, or at least most, XML documents. This reduces the work considerably.
So how can we handle XML documents, which are complex and tree-shaped, in a universal way?
There are two well-known styles for handling XML documents: DOM and SAX. The DOM style is convenient, but it is slow and uses a lot of memory. The SAX style is fast and uses little memory, but it is less convenient. In the .NET Framework, System.Xml.XmlDocument implements the DOM style, and XmlReader provides forward-only, streaming access comparable to the SAX style.
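To make the trade-off concrete, here is a minimal sketch of the two styles. It uses Python's standard library rather than the .NET classes named above, and the sample document is made up for illustration. The first half loads the whole tree and then navigates it; the second half processes elements as they stream past and discards them.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from a real feed.
SAMPLE = (
    "<rss><channel><title>Demo</title>"
    "<item><title>A</title></item>"
    "<item><title>B</title></item>"
    "</channel></rss>"
)

# Tree (DOM-like) style: parse everything into memory, then navigate freely.
root = ET.fromstring(SAMPLE)
tree_titles = [item.findtext("title") for item in root.find("channel").findall("item")]

# Streaming style: react to parse events and keep only a little state.
stream_titles = []
inside_item = False
source = io.BytesIO(SAMPLE.encode("utf-8"))
for event, elem in ET.iterparse(source, events=("start", "end")):
    if event == "start" and elem.tag == "item":
        inside_item = True
    elif event == "end" and elem.tag == "title" and inside_item:
        stream_titles.append(elem.text)
    elif event == "end" and elem.tag == "item":
        inside_item = False
        elem.clear()  # release the finished item's children and text
```

The streaming half has to track by hand whether it is inside an item, which is the inconvenience mentioned above. The tree half gets that context for free.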
The W3C designed XML for conveniently storing and exchanging small amounts of data, without much concern for redundancy. So if a system contains huge XML documents, it is usually a sign that XML is being used inappropriately. I believe a report tool does not need to handle very large XML documents, so for ease of implementation I use the DOM style.
In the .NET Framework, loading an XML document into an XmlDocument object gives you an XML DOM tree whose root is the XmlDocument. For such a tree, it is natural to use XPath to read data. XPath is a language for addressing parts of an XML document.
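As a small illustration of addressing parts of a document with path expressions, the sketch below uses Python's xml.etree.ElementTree, which supports only a limited subset of XPath, in place of XmlDocument. The sample document and its URL are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the link is a placeholder.
SAMPLE = (
    '<rss version="2.0"><channel>'
    "<title>XML Team</title><link>http://example.invalid/</link>"
    "<item><title>First</title></item>"
    "<item><title>Second</title></item>"
    "</channel></rss>"
)

root = ET.fromstring(SAMPLE)  # root is the <rss> element

# A path relative to the current element selects a single value...
site_title = root.findtext("channel/title")

# ...or a list of matching elements...
item_titles = [t.text for t in root.findall("channel/item/title")]

# ...and a positional predicate picks one of several siblings (1-based).
second_title = root.findtext("channel/item[2]/title")
```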
The traditional report data source model has two layers: the first is the data source and the second is the field. In the abstract, such a model can process an XML document only once: starting from the root element, it uses XPath to get field values, and after that single pass the XML document cannot be used again. Often, though, more data needs to be read from the XML document, and then the two-layer model is not enough.
To read more data from an XML document, I devised a multi-layer data source model. In this model the data source structure can be viewed as a tree. Each node of the data source tree maps to a node in the XML DOM tree through an XPath expression, and the children of a data source node map to XML nodes in the same way. Through iteration and recursion, a multi-layer data source structure can therefore map to an XML DOM tree of any depth. The multi-layer data source model is a tree and the XML DOM is a tree too, so the mapping binds two trees together, and XPath is the nail. Pay attention to the XPath settings: if one data source node's XPath is wrong, so that the node binds to the wrong XML node or to none at all, all of its children are lost as well.
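The model can be sketched in a few lines. This is only an illustration under stated assumptions, not the implementation of any particular report tool: the SourceNode class, the read function, and the wrapping "document" element (a stand-in for the XmlDocument root) are all hypothetical names, and Python's ElementTree path subset replaces full XPath.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

@dataclass
class SourceNode:
    """One node of the data source tree (hypothetical structure)."""
    name: str
    xpath: str                                     # evaluated relative to the parent's element
    fields: dict = field(default_factory=dict)     # field name -> relative path
    children: list = field(default_factory=list)   # nested SourceNode objects

def read(node, context):
    """Bind `node` to every element matching its path, then recurse into children."""
    records = []
    for elem in context.findall(node.xpath):
        record = {name: elem.findtext(path) for name, path in node.fields.items()}
        for child in node.children:
            record[child.name] = read(child, elem)
        records.append(record)
    return records

SAMPLE = (
    "<rss><channel><title>Demo</title>"
    "<item><title>A</title></item><item><title>B</title></item>"
    "</channel></rss>"
)
# Wrap the root so that the path "rss" can match it, as it would from a document node.
document = ET.Element("document")
document.append(ET.fromstring(SAMPLE))

items = SourceNode("items", "item", {"title": "title"})
channel = SourceNode("channel", "channel", {"title": "title"}, [items])
rss = SourceNode("rss", "rss", {}, [channel])
result = read(rss, document)

# A mistyped path on one node silently loses that node and everything below it.
broken = SourceNode("rss", "rs", {}, [channel])
broken_result = read(broken, document)
```

Because each child path is evaluated against the element its parent is bound to, the depth of the data source tree is limited only by the depth of the definitions, not by the reading code.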
In fact, XML documents are not created for reports, so a report tool may have to leave the current XML document to find more data: it may get data from another XML document or run a query against an RDBMS. This capability is a test of a report tool's flexibility.
An RSS document is a kind of XML document, so here we use an RSS document as an example of how to read data from XML. First, let us look at the structure of an RSS document, taking the one at http://blogs.msdn.com/xmlteam/rss.xml . Its root is rss, which has a channel child node containing basic information about the feed. The channel in turn contains item nodes that list the articles. Each item node holds information about one article, and the content of its wfw:commentRss node is the URL of the feedback RSS document for that article. You can use this URL to load the feedback RSS document. Based on this structure, we can define a report data source structure and the mapping from data source to XML document.
The RSS document has a three-layer structure, and the feedback documents have to be loaded dynamically to read more data. This task is complex, and the old two-layer data source model cannot handle it. The work is completed in the following steps.
- Load the XML document http://blogs.msdn.com/xmlteam/rss.xml as the main XML document by instantiating a System.Xml.XmlDocument object. This object is the starting point.
- Use the XPath "rss" to enumerate the matching elements. Obviously only one element matches; set the rss element as the current element.
- From the current element, use the XPath "channel/title" to get the title of the web site, "channel/link" to get the URL of the site, and "channel/description" to get the site description.
- Use the XPath "channel" to enumerate elements. Again only one "channel" element matches; set it as the current element.
- Use the XPath "item" to enumerate the matching elements. On each iteration, set one item element as the current element.
- Use the XPath "title" to get the title of the article, "link" to get the URL of the article, "author" to get the author's name, "pubDate" to get the publication date, "description" to get the description of the article, "slash:comments" to get the number of feedback entries, and "wfw:commentRss" to get the URL of the feedback RSS document.
- When handling the "wfw:commentRss" element, the report tool uses its content as a URL, loads that XML document dynamically, and uses the XPath "rss/channel/item" to enumerate elements. On each iteration, set the matched element as the current element.
- Use "author" to get the name of the feedback's author, "pubDate" to get the date of the feedback, and "description" to get the content of the feedback.
- Because "description" in an RSS document contains an HTML string, the HTML has to be parsed to extract the text content.
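The steps above can be sketched roughly as follows. This is a simplified illustration in Python, not XReport's code. To stay self-contained it assumes two made-up in-memory documents in place of the real feed, a hypothetical load function in place of an HTTP fetch, and a wrapping "document" element in place of the XmlDocument root. The namespace URIs are the ones commonly used for the slash and wfw prefixes.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

NS = {"slash": "http://purl.org/rss/1.0/modules/slash/",
      "wfw": "http://wellformedweb.org/CommentAPI/"}

MAIN = """<rss version="2.0" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
 xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel>
<title>Demo blog</title><link>http://example.invalid/</link>
<description>Sample feed</description>
<item><title>Post one</title><link>http://example.invalid/1</link>
<author>alice</author><pubDate>Thu, 31 Aug 2006 08:00:00 GMT</pubDate>
<description>&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</description>
<slash:comments>1</slash:comments>
<wfw:commentRss>http://example.invalid/1/comments.xml</wfw:commentRss></item>
</channel></rss>"""

FEEDBACK = """<rss version="2.0"><channel><title>Comments</title>
<item><author>bob</author><pubDate>Fri, 01 Sep 2006 09:00:00 GMT</pubDate>
<description>&lt;i&gt;Nice&lt;/i&gt; post</description></item>
</channel></rss>"""

# Stand-in for the network: URL -> document text (both URLs are placeholders).
DOCUMENTS = {"http://example.invalid/rss.xml": MAIN,
             "http://example.invalid/1/comments.xml": FEEDBACK}

def load(url):
    """Return a document node whose only child is the root element."""
    document = ET.Element("document")
    document.append(ET.fromstring(DOCUMENTS[url]))
    return document

class _TextOnly(HTMLParser):
    """Collects text content and ignores the tags around it."""
    def __init__(self):
        super().__init__()
        self.parts = []
    def handle_data(self, data):
        self.parts.append(data)

def html_to_text(html):
    parser = _TextOnly()
    parser.feed(html or "")
    parser.close()
    return "".join(parser.parts).strip()

def read_feedback(url):
    """The jump: load a second document and enumerate rss/channel/item in it."""
    return [{"author": c.findtext("author"),
             "date": c.findtext("pubDate"),
             "text": html_to_text(c.findtext("description"))}
            for c in load(url).findall("rss/channel/item")]

def read_report(url):
    sites = []
    for rss in load(url).findall("rss"):            # current element: rss
        site = {"title": rss.findtext("channel/title"),
                "link": rss.findtext("channel/link"),
                "description": rss.findtext("channel/description"),
                "articles": []}
        for channel in rss.findall("channel"):      # current element: channel
            for item in channel.findall("item"):    # current element: item
                comment_url = item.findtext("wfw:commentRss", namespaces=NS)
                site["articles"].append({
                    "title": item.findtext("title"),
                    "link": item.findtext("link"),
                    "author": item.findtext("author"),
                    "date": item.findtext("pubDate"),
                    "text": html_to_text(item.findtext("description")),
                    "comment_count": int(item.findtext("slash:comments", "0", NS)),
                    "feedback": read_feedback(comment_url) if comment_url else [],
                })
        sites.append(site)
    return sites

report = read_report("http://example.invalid/rss.xml")
```

Each nested loop corresponds to one "set the current element" step, and read_feedback is the point where reading leaves the main document for the feedback document.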
From these steps you can see that each data source node is bound to one element of an XML document, and that while getting the feedback data the report tool performs an XML jump, from the main XML document to the feedback XML document. Because the tree structure is handled with recursion, much of the program state is saved on the call stack automatically, and we do not have to manage it ourselves.
If the report tool can connect to the site's database, it can jump from the author element to the database and query the author's registration information. In fact, each node in the data source tree can perform a jump of the form XML to XML, XML to RDBMS, or RDBMS to XML. This feature greatly increases the power of the report tool.
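A jump from XML to an RDBMS might look like the sketch below. Everything in it is assumed for illustration: the users table, its columns, the sample row, and the author_details helper are hypothetical, and an in-memory SQLite database stands in for the site's real database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical registration table in an in-memory database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT PRIMARY KEY, email TEXT, registered TEXT)")
db.execute("INSERT INTO users VALUES (?, ?, ?)",
           ("alice", "alice@example.invalid", "2005-03-01"))

# The current element, as it would be while enumerating items of a feed.
ITEM = ET.fromstring("<item><title>Post one</title><author>alice</author></item>")

def author_details(item, connection):
    """Read the author name from the XML element, then look it up in the database."""
    name = item.findtext("author")
    row = connection.execute(
        "SELECT email, registered FROM users WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return {"name": name}  # unknown author: keep what the XML gave us
    return {"name": name, "email": row[0], "registered": row[1]}

details = author_details(ITEM, db)
```

The value read from the XML element becomes the query parameter, so the database row ends up attached to the same data source node as the element it came from.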
If a system is a 100% XML application, the report tool can jump between XML documents to collect report data. All database queries and business logic then run inside the application, and the report tool does not need to care about them. This makes both the system and the report module robust: the system can be modified freely, and as long as the XML document structure is maintained, the report module does not need to change. When a report's data is very complex and beyond the reach of the report tool, the developer can produce a special XML document to supply the report data. Previously the system used the report tool's API to send report data; now it sends an XML document instead. In this case the boundary of the system is clear and safe, which embodies the idea behind XML Web Services. The method can also be used in other domains, not only for reports.
This discussion is limited to B/S systems, but we can assume that, after some rework, a client/server (C/S) system could also send XML documents to a report tool in some way.
From this article we can see that XML is a new continent for report data. I have already built a report tool named XReport that is a first implementation of this idea. You can download it from http://www.xdesigner.cn/_yuansreport_eng.aspx .