
XML optimization

Stephane Rodriguez, 11 Sep 2002
This is a set of techniques for auditing design metadata in any XML stream.

<table width="100%"><tr><td class="subsection"><a href="javascript:showhelper('_ContentSpacing')" title="Help on Content spacing" class="subsectionlink">?</a> Content spacing</td></tr></table>
<div id="_ContentSpacing" class="helper" style="border-top:1px solid green; border-bottom:1px solid green" >
Help on <b>Content spacing</b> : <br>
The Content spacing topic has two indicators. The first is the indentation size ratio, since many XML streams are decorated with spaces and tabs that carry no data. The second indicator reveals whether there is a significant amount of multiple spaces between attributes.<br> Both indentation and multiple spaces add size without adding content.
 <a href="javascript:showhelper('_ContentSpacing',0)" class="link">Close this one</a></div>
<div id="_ContentSpacingIndentation" class="helper" style="border-top:1px solid green; border-bottom:1px solid green" >
Help on <b>Content Indentation</b> : <br>
Indentation is decoration: it makes XML human-readable, but it carries a high size overhead. Consider the sample below:<br>
<script language="javascript">writebookshop()</script>
This sample is 150 bytes. Here is the <i>same</i> content without indentation:<br>
&lt;?xml version=&quot;1.0&quot; encoding=&quot;ISO-8859-1&quot;?&gt;
&lt;!DOCTYPE Bookstore SYSTEM &quot;bookshop.dtd&quot;&gt;
&lt;Bookstore&gt;&lt;!--J&amp;R Booksellers Database--&gt;&lt;Book Genre=&quot;Thriller&quot; In_Stock=&quot;Yes&quot;&gt;&lt;Title&gt;The Round Door&lt;/Title&gt;&lt;/Book&gt;&lt;/Bookstore&gt;
This version brings the size down to 136 bytes. For the whole Bookshop.xml sample, which declares only 4 books, the gain is 16%. As an empirical rule, the indentation ratio increases with the size of the XML stream and may even exceed the size of the real content. <a href="javascript:showhelper('_ContentSpacingIndentation',0)" class="link">Close this one</a></div>
<div id="_ContentSpacingIndentationGain" class="helper" style="border-top:1px solid green; border-bottom:1px solid green" >
Help on <b>Content Indentation Gain</b> : <br>
This is the size reduction obtained by removing indentation from the whole XML stream. The average gain is 10%.
<a href="javascript:showhelper('_ContentSpacingIndentationGain',0)" class="link">Close this one</a></div>
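The indentation gain described above can be estimated with a few lines of code. The following is a minimal sketch, not the tool's actual implementation: the function names `strip_indentation` and `indentation_gain` and the indented `SAMPLE` string are assumptions made for illustration. It treats every whitespace-only run between two tags as decoration, which is a simplification that does not hold for mixed content, CDATA sections, or `xml:space="preserve"`.

```python
import re


def strip_indentation(xml_text: str) -> str:
    """Remove whitespace-only runs that sit between two tags."""
    # Simplifying assumption: anything matching '>' + whitespace + '<'
    # is pure decoration. Real data may need a proper XML parser.
    return re.sub(r">\s+<", "><", xml_text.strip())


def indentation_gain(xml_text: str, encoding: str = "ISO-8859-1") -> float:
    """Fraction of bytes saved by stripping indentation (0.0 to 1.0)."""
    original = len(xml_text.encode(encoding))
    if original == 0:
        return 0.0
    compact = len(strip_indentation(xml_text).encode(encoding))
    return (original - compact) / original


# Hypothetical indented input, modelled on the bookshop sample.
SAMPLE = """<Bookstore>
  <!--J&R Booksellers Database-->
  <Book Genre="Thriller" In_Stock="Yes">
    <Title>The Round Door</Title>
  </Book>
</Bookstore>
"""

if __name__ == "__main__":
    print(strip_indentation(SAMPLE))
    print(f"indentation gain: {indentation_gain(SAMPLE):.0%}")
```

Measuring the gain in bytes after encoding, rather than in characters, keeps the ratio meaningful for streams that use a multi-byte encoding.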
<div id="_ContentSpacingWhites" class="helper" style="border-top:1px solid green; border-bottom:1px solid green" >
Help on <b>white spaces in Content</b> : <br>
This indicator shows whether the XML stream contains a significant amount of multiple white spaces between attributes. Multiple white spaces are useless: a single one suffices. XML parsers do not accept attributes with no white space at all between them (unlike HTML), so one separator must remain. Most of the time this indicator reports a <b>not significant</b> ranking for multiple white spaces. Below is a sample of multiple white spaces:<br>
  &lt;book year=&quot;1999&quot;  price=&quot;20$&quot;/&gt;
<a href="javascript:showhelper('_ContentSpacingWhites',0)" class="link">Close this one</a></div>
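A sketch of how this indicator could be computed is shown below. It is an illustration under stated assumptions, not the audit tool's code: the names `wasted_attribute_spacing` and `collapse_attribute_spacing` are hypothetical, and the regular expressions only approximate XML start tags (text inside comments or CDATA that happens to look like a tag would be counted too). White space inside quoted attribute values is left untouched.

```python
import re

# A quoted attribute value, in double or single quotes.
QUOTED = r"\"[^\"]*\"|'[^']*'"
# Approximation of a start tag or empty-element tag.
TAG_RE = re.compile(r"<[A-Za-z_:][^\s/>]*(?:" + QUOTED + r"|[^>\"'])*>")
# Either a quoted value (captured, to be skipped) or a run of 2+ white spaces.
GAP_RE = re.compile(r"(" + QUOTED + r")|\s{2,}")


def wasted_attribute_spacing(xml_text: str) -> int:
    """Count white space characters beyond the single required separator."""
    wasted = 0
    for tag in TAG_RE.finditer(xml_text):
        for gap in GAP_RE.finditer(tag.group(0)):
            if gap.group(1) is None:  # a white space run, not a quoted value
                wasted += len(gap.group(0)) - 1
    return wasted


def collapse_attribute_spacing(xml_text: str) -> str:
    """Replace each white space run inside a tag with one space."""
    def fix_tag(tag: re.Match) -> str:
        return GAP_RE.sub(lambda m: m.group(1) or " ", tag.group(0))
    return TAG_RE.sub(fix_tag, xml_text)


if __name__ == "__main__":
    sample = '  <book year="1999"  price="20$"/>'
    print(wasted_attribute_spacing(sample))
    print(collapse_attribute_spacing(sample))
```

Dividing the wasted count by the total stream size gives a ratio that can then be ranked as significant or not significant against whatever threshold the audit uses.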



This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



About the Author

Addicted to reverse engineering. At work, I am developing business intelligence software in a team of smart people (independent software vendor).

Need a fast Excel generation component? Try xlsgen.

Last Updated 12 Sep 2002
Article Copyright 2002 by Stephane Rodriguez.