NanoXML - Simple and fast XML parser

BrokenEvent

Nov 13, 2013

CPOL

Simple and fast .NET XML parser without using System.Xml

Introduction

This is a small and fast XML DOM parser for the .NET platform (targeting .NET 2.0). Its main feature, and the main requirement behind it, is that it does not use the System.Xml namespace. Tests also show amazing performance results compared with the built-in .NET parsers.

Background

The idea was born when one of my friends needed something to parse XML in C# without using the System.Xml namespace. Writing and testing took about three hours. The parser doesn't support the entire XML specification, but it is capable of parsing most of the XML I've tried.

Using the code

Using NanoXML is simple. You just need to add NanoXMLParser.cs to your project and use the TObject.Shared namespace. The main top-level class is NanoXMLDocument, which parses XML and builds the DOM.

NanoXML cannot load data from files, so your application must read the XML into a string itself. Something like this:

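// Read the whole file into memory. File.ReadAllBytes always returns the full
// content, whereas a single Stream.Read call may return fewer bytes than requested.
byte[] data = File.ReadAllBytes(args[0]);

// The declared encoding is not used by the parser, so decode explicitly (UTF-8 assumed here).
string strData = Encoding.UTF8.GetString(data);
NanoXMLDocument xml = new NanoXMLDocument(strData);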

Yes, NanoXML ignores the XML declaration's encoding attribute. Now that we have loaded the document, we can get data for any element or any of its attributes:

string myAttribute = xml.RootNode["Subnode"].GetAttribute("myAttribute");

NanoXML also ignores comments and DOCTYPE declarations. XML declarations (<?xml ?>) will be parsed and stored in a NanoXMLDocument object.
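Because the encoding attribute is ignored, the caller has to pick the right encoding before handing the string to the parser. The following is only a minimal sketch of one way to do that, not part of NanoXML: the class XmlTextDecoder and the method DecodeXml are hypothetical names introduced for illustration. It assumes the declaration, if present, is readable as ASCII, falls back to UTF-8 otherwise, and lets StreamReader handle any byte order mark.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of NanoXML.
static class XmlTextDecoder
{
    // Decodes raw XML bytes to a string, honouring a BOM or the
    // encoding named in the XML declaration; defaults to UTF-8.
    public static string DecodeXml(byte[] data)
    {
        Encoding declared = Encoding.UTF8;

        // Peek at the start of the file as ASCII to find encoding="...".
        string head = Encoding.ASCII.GetString(data, 0, Math.Min(data.Length, 256));
        Match m = Regex.Match(head, @"<\?xml[^>]*\bencoding\s*=\s*[""']([^""']+)[""']");
        if (m.Success)
        {
            try { declared = Encoding.GetEncoding(m.Groups[1].Value); }
            catch (ArgumentException) { /* unknown encoding name: keep UTF-8 */ }
        }

        // 'true' lets StreamReader detect and strip a byte order mark,
        // which takes precedence over the declared encoding.
        using (StreamReader reader = new StreamReader(new MemoryStream(data), declared, true))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        byte[] latin1 = Encoding.GetEncoding("ISO-8859-1").GetBytes(
            "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a>\u00e9</a>");
        string text = DecodeXml(latin1);
        Console.WriteLine(text.Contains("<a>\u00e9</a>"));
        // The resulting string could then be passed to the NanoXMLDocument constructor.
    }
}
```

On runtimes that ship only a few encodings by default, Encoding.GetEncoding may reject less common names; in that case the sketch quietly falls back to UTF-8, which may or may not be what you want.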

Performance

The most amazing thing about this parser is its performance. Before submitting the code here, I ran some benchmarks comparing it with the built-in .NET parsers, and I was surprised. All tests were performed on a 1.1 MB SVG file held in a string (i.e., without disk access overhead). Test results are shown in the screenshot below:

As we can see, NanoXML processes a 1.1 MB file almost immediately (17 ms). XmlDocument loaded the document in 11 seconds. XmlReader (a forward-only streaming parser, similar in spirit to SAX, which by design should be much faster than DOM) took about 7 seconds to read the whole document. For XmlReader, the test doesn't do anything but read the content from beginning to end. Benchmark test sources are available for download.
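For readers who want to try a comparison of this kind themselves, here is a rough sketch of such a timing harness. It is not the author's benchmark: the class and method names are made up for illustration, the input is a small synthetic document rather than the SVG file, and only the two built-in parsers are timed so that the snippet compiles without NanoXMLParser.cs.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Xml;

// Illustrative harness; all names here are assumptions, not the article's test code.
static class ParserBenchmark
{
    // Builds a synthetic in-memory document: <root><item id="0">value</item>...</root>
    public static string BuildSampleXml(int elementCount)
    {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < elementCount; i++)
            sb.Append("<item id=\"").Append(i).Append("\">value</item>");
        sb.Append("</root>");
        return sb.ToString();
    }

    // DOM parse: builds the whole tree, returns the number of root children.
    public static int LoadWithXmlDocument(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.ChildNodes.Count;
    }

    // Streaming parse: only walks the document, returns the number of nodes read.
    public static int ReadWithXmlReader(string xml)
    {
        int nodes = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
                nodes++;
        }
        return nodes;
    }

    // Times a single run of the given action, in milliseconds.
    public static long TimeMs(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        string xml = BuildSampleXml(20000);
        Console.WriteLine("XmlDocument: {0} ms", TimeMs(() => LoadWithXmlDocument(xml)));
        Console.WriteLine("XmlReader:   {0} ms", TimeMs(() => ReadWithXmlReader(xml)));
        // With NanoXMLParser.cs in the project, a third line could time
        // the NanoXMLDocument constructor on the same string.
    }
}
```

A single timed run includes JIT warm-up, so numbers from a harness like this are only indicative; repeating each measurement and discarding the first run gives steadier results.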

Points of Interest

This parser may be useful because of its great performance or when using built-in parsers (System.Xml namespace) is forbidden.