XML Databases

AlexTatiyants, 21 Apr 2012

In spite of some notable opposition, NoSQL has been all the rage lately. In particular, JSON-oriented document stores like Mongo and Couch have really become the darlings of the web application crowd.

Of course, JSON (or BSON) isn’t the only game in town. When it comes to document store formats, the other white meat, if you will, is XML.

As far as I could find, XML databases have actually been around longer than JSON-based ones (the first XML database, eXist, was introduced in 2000, whereas the first JSON-based one, CouchDB, came on the scene about 5 years later). Yet in spite of this head start, XML databases appear to be the red-headed stepchild. I was curious as to why this is the case, and here’s what I found.

XML vs. JSON

Let’s first consider the two formats in question. While it’s debatable which format is more commonly used for storage, it’s a lot less debatable which is considered to be the hipper of the two. Spoiler alert: it’s JSON.

XML is Worse

So, why is XML bad? Well, a big knock against XML is that it’s too “heavyweight” and “enterprisey”.

First, it’s obviously more verbose:

XML:

<complaintsAgainstXML>
   <complaint>it's too verbose</complaint>
   <complaint>it's too complex</complaint>
   <complaint>XSLT sucks</complaint>
</complaintsAgainstXML>

JSON:

{
    "complaintsAgainstXml": {
        "complaint": [
            "it's too verbose",
            "it's too complex",
            "XSLT sucks"
        ]
    }
}

Not counting whitespace, the XML above takes roughly 150 characters whereas the JSON takes roughly 85. That amounts to a bit over 40% fewer characters for JSON, which makes it a more compact format for transporting data over a distributed network (at least in uncompressed form).
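
To make the size comparison concrete, here is a minimal Python sketch that serializes the same three complaints both ways and compares the lengths. It is only an illustration: the variable and function names are mine, it emits strict JSON (double-quoted keys and strings), and the exact totals depend on quoting and whitespace conventions, so they may differ slightly from the counts quoted above.

```python
import json
import xml.etree.ElementTree as ET

complaints = ["it's too verbose", "it's too complex", "XSLT sucks"]

# Build the XML document from the example above.
root = ET.Element("complaintsAgainstXML")
for text in complaints:
    ET.SubElement(root, "complaint").text = text
xml_text = ET.tostring(root, encoding="unicode")

# Build the equivalent JSON with no optional whitespace.
json_text = json.dumps(
    {"complaintsAgainstXml": {"complaint": complaints}},
    separators=(",", ":"),
)


def non_whitespace_length(text):
    """Count characters, ignoring all whitespace."""
    return sum(1 for ch in text if not ch.isspace())


xml_size = non_whitespace_length(xml_text)
json_size = non_whitespace_length(json_text)
savings = 1 - json_size / xml_size

print(f"XML:  {xml_size} characters")
print(f"JSON: {json_size} characters")
print(f"JSON is {savings:.0%} smaller")
```

Most of the difference comes from XML repeating every element name in its closing tag, which is also why the gap tends to shrink once the payload is compressed.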

XML is also more complex because it allows both attributes and elements, whereas JSON offers just one way to attach a value to a name. Some say that JSON parsers are available in more languages than XML parsers. And of course JSON can be natively processed in the browser, which makes it a much better “X” in Ajax (Doug Crockford’s quote, not mine).
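
The attribute-versus-element point is easier to see in code. The sketch below (Python, standard library only) stores a made-up `severity` value three ways. The field name and all three documents are hypothetical, purely for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical example: the same "severity" value modeled two ways in XML.
as_attribute = ET.fromstring(
    '<complaint severity="high">it\'s too verbose</complaint>'
)
as_element = ET.fromstring(
    "<complaint><severity>high</severity>"
    "<text>it's too verbose</text></complaint>"
)

# The reader has to know which modeling choice the author made.
severity_from_attribute = as_attribute.get("severity")
severity_from_element = as_element.findtext("severity")

# In JSON there is only one place the value can live: a member of the object.
as_json = json.loads('{"severity": "high", "text": "it\'s too verbose"}')
severity_from_json = as_json["severity"]

print(severity_from_attribute, severity_from_element, severity_from_json)
```

Both XML shapes are perfectly legal, which is exactly the problem: every schema author gets to pick, and every consumer has to find out which one was picked.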

XML is Better

On the other hand, XML has a bunch of useful supporting technologies around it:

  • validation against a predefined schema using XML Schema, Schematron, and DTDs
  • traversal using XPath
  • transformations with XSLT
  • searching with XQuery
  • referencing other XML with XLink or XInclude
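
As a small taste of one of these, here is XPath-style traversal over the complaints document from earlier. This sketch uses Python's built-in ElementTree, which supports only a limited subset of XPath, so treat it as an illustration rather than a tour of the full language. The variable names are mine.

```python
import xml.etree.ElementTree as ET

DOC = """<complaintsAgainstXML>
   <complaint>it's too verbose</complaint>
   <complaint>it's too complex</complaint>
   <complaint>XSLT sucks</complaint>
</complaintsAgainstXML>"""

root = ET.fromstring(DOC)

# Select every <complaint> child of the root element.
all_complaints = [c.text for c in root.findall("./complaint")]

# Positional predicate: XPath positions are 1-based, so [2] is the second one.
second_complaint = root.find("./complaint[2]").text

# Descendant search (".//") finds matches at any depth below the root.
complaint_count = len(root.findall(".//complaint"))

print(all_complaints)
print(second_complaint)
print(complaint_count)
```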

Now, there is no doubt that working with some of them can be painful (I’m looking at you, XSLT). The tooling isn’t great, debugging is awkward, testability is questionable, etc.

Moreover, similar versions of some of these also exist for JSON. For instance, there are JSONPath and JSON Schema. That said, I’m not sure how widely utilized they are.

XML Databases

Ok, let’s finally get back to the main point of this post: XML databases. Here’s a small sample of the capabilities you typically get with them:

  • XML CRUDS (create, retrieve, update, delete, and search via XQuery)
  • Document validation (using XML Schema)
  • Document references (via XInclude or XLink)
  • Library services (versioning, diffing, branching)
  • Storing non-XML but meta-tagged content (like images)

Of these, only CRUDS operations are well represented in JSON-based stores. MongoDB, for example, has pretty advanced querying capabilities supported by database indexes.

Other capabilities are much more common (if not unique) to XML databases. Consider for example document references. In XML, you can reference one document from another using XInclude:

<menu xmlns:xi="http://www.w3.org/2001/XInclude">
   <menuItem>
      <xi:include href="menuItems/BeefStroganof.xml" />
   </menuItem>
   <menuItem>
      <xi:include href="menuItems/RasperryIceTea.xml" />
   </menuItem>
</menu>

XML databases that support XInclude (like MarkLogic) will automatically resolve the references and return a complete document to you with basically a single line of code.
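
To show what that resolution step amounts to, here is a rough Python sketch using the standard library's ElementInclude module. It is not how MarkLogic or any other database does it: the in-memory STORE dictionary, the load_from_store helper, and the contents of the two menu item documents are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical stand-in for documents stored in the database.
STORE = {
    "menuItems/BeefStroganof.xml": "<item><name>Beef Stroganoff</name></item>",
    "menuItems/RasperryIceTea.xml": "<item><name>Raspberry Iced Tea</name></item>",
}

MENU = """
<menu xmlns:xi="http://www.w3.org/2001/XInclude">
   <menuItem>
      <xi:include href="menuItems/BeefStroganof.xml" />
   </menuItem>
   <menuItem>
      <xi:include href="menuItems/RasperryIceTea.xml" />
   </menuItem>
</menu>
""".strip()


def load_from_store(href, parse, encoding=None):
    """Loader callback: return the parsed document stored under href."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(STORE[href])


def resolve(menu_xml):
    """Parse a document and replace each xi:include with its target."""
    root = ET.fromstring(menu_xml)
    ElementInclude.include(root, loader=load_from_store)
    return root


menu = resolve(MENU)
item_names = [name.text for name in menu.iter("name")]
print(item_names)
```

Note that the xi prefix has to be bound to the XInclude namespace on the document for any processor to recognize the include elements. Once that is in place, the one call to ElementInclude.include is the "single line" that does the stitching.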

Final Thought

JSON-based document stores certainly have their appeal, especially for web applications built on a complete JavaScript stack (some browser library + Node.js + Mongo). The fact that your data can flow effortlessly through the entire stack is really, really nice.

That said, XML databases do have unique and useful capabilities which can save you a lot of effort, if you need them. Hence, don’t dismiss XML databases offhand just because XML isn’t cool.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

AlexTatiyants

United States

Article Copyright 2012 by AlexTatiyants