Xml Serialization on large xml files

gordon_matt, 23 Oct 2010 CPOL
As we all know, XML serialization is a great help in many applications, but it always loads the entire document into memory. For most applications this is not an issue, but what do you do when you have a 100 MB XML file? (To those whose eyebrows are rising at the thought: yes, they CAN get that big, LOL.) Or perhaps your file is not that insanely massive, but the thought of holding 5000 objects in memory still makes you look for alternative forms of storage. Well, I'm here to tell you that you can still use XML serialization without loading everything into memory. Sound too good to be true? Read on...
 
Say, for example, you have an XML file that holds a bunch of configuration options for your app, but one node in there has many hundreds or even thousands of child nodes. In your app, you want to iterate over all of those nodes (deserialized into objects, of course). I guess you could use LINQ to XML, but I have found that it also gets pretty slow with larger files. Here's a better way:
 
A combination of XmlTextReader and XmlSerializer - really beautiful!
 
1. Load everything EXCEPT that particular node into memory by using an [XmlIgnore] attribute on the property (or turn it into a method)
 
2. In your class file, change that property to return an IEnumerator<T>, where T is the class representing your troublesome XML node. (Return IEnumerable<T> instead if you want to use the property directly in a foreach loop.)
 
3. In your new property (or method), use an XmlTextReader (for speed) to go through the XML file, and whenever you hit a node with the specified name, deserialize it! Then hand it back with yield return and voila! There you have it! See the code example below to make things clearer. (Dunno about all of you, but personally I generally skip over most of an article and go straight to the code; I find it speaks my language just fine.) :-D
 
Here you go:
 
[XmlIgnore]
public IEnumerator<Thingy> Thingies
{
   get
   {
      // Read straight from the file stream; pulling the whole file into a
      // string first would defeat the purpose of streaming.
      using (FileStream fs = new FileStream(FileName, FileMode.Open, FileAccess.Read))
      {
         using (XmlTextReader reader = new XmlTextReader(fs))
         {
            reader.WhitespaceHandling = WhitespaceHandling.None;
            bool keepGoing = reader.Read();

            while (keepGoing)
            {
               if (reader.IsStartElement() && reader.Name == "Thingy")
               {
                  // ReadOuterXml() also advances the reader past this element,
                  // so there is no Read() call on this branch.
                  yield return Thingy.Parse(reader.ReadOuterXml());
               }
               else { keepGoing = reader.Read(); }
            }
         }
      }
   }
}
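
Thingy.Parse itself isn't shown here. If you'd rather not pull in an extra library, a minimal sketch using the plain XmlSerializer might look something like this. The Name property is a made-up placeholder for whatever your real class holds, and the static Serializer field is just one way to avoid rebuilding the serializer on every call:

```csharp
using System.IO;
using System.Xml.Serialization;

public class Thingy
{
    // Hypothetical payload; replace with the real members of your class.
    public string Name { get; set; }

    // Reused across calls so the serializer is not rebuilt for every node.
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(Thingy));

    // Turns the outer XML of a single <Thingy> element into an object.
    public static Thingy Parse(string xml)
    {
        using (var stringReader = new StringReader(xml))
        {
            return (Thingy)Serializer.Deserialize(stringReader);
        }
    }
}
```

Only the XML of one node is ever passed in, so the string stays small no matter how big the file is.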
 
And if you're using the MBG Extensions Library, then "Thingy.Parse" can be as simple as:
public static Thingy Parse(string xml)
{
   return xml.XmlDeserialize<Thingy>();
}
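
To see all three steps working end to end, here is a rough, self-contained sketch. Treat it as an illustration under assumptions rather than the one true way: AppConfig, its Title and FileName members, the Load helper and the Name property on Thingy are all hypothetical. It also varies the code above in two ways. The property returns IEnumerable<Thingy> so it works in a foreach loop, and it uses XmlReader.Create with ReadSubtree() to hand each element to XmlSerializer without going through a string.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical element type.
public class Thingy
{
    public string Name { get; set; }
}

// Hypothetical container: the small settings are deserialized normally,
// while the big <Thingy> collection is ignored and streamed on demand.
public class AppConfig
{
    public string Title { get; set; }

    [XmlIgnore]
    public string FileName { get; set; }

    [XmlIgnore]
    public IEnumerable<Thingy> Thingies
    {
        get
        {
            var serializer = new XmlSerializer(typeof(Thingy));
            using (XmlReader reader = XmlReader.Create(FileName))
            {
                // Jump from one <Thingy> start tag to the next.
                while (reader.ReadToFollowing("Thingy"))
                {
                    // The subtree reader only sees the current element; disposing
                    // it moves the outer reader to the end of that element.
                    using (XmlReader subtree = reader.ReadSubtree())
                    {
                        yield return (Thingy)serializer.Deserialize(subtree);
                    }
                }
            }
        }
    }

    // Step 1: load everything except the ignored node, and remember the path.
    public static AppConfig Load(string fileName)
    {
        var serializer = new XmlSerializer(typeof(AppConfig));
        using (FileStream fs = File.OpenRead(fileName))
        {
            var config = (AppConfig)serializer.Deserialize(fs);
            config.FileName = fileName;
            return config;
        }
    }
}
```

A caller could then write something like foreach (Thingy t in AppConfig.Load("config.xml").Thingies) { ... } and only one Thingy is built at a time. Note that the file is reopened and read again on every enumeration, which is the price of not keeping the nodes in memory.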
 
Go on, give me a 5/5... you know you want to!!! :-D

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

gordon_matt
Software Developer VortexSoft
Vietnam

Last Updated 24 Oct 2010
Article Copyright 2010 by gordon_matt