
How to Open Large XML Files without Loading Them into Memory?

By Sandeep Ramani, 12 Feb 2011
 

Working with XML files in memory is often a performance concern, and it matters most for very large files (say, more than 3 GB). The question is how to process such large XML files.

When we think of working with any XML file, we normally think of using:

  • XmlDocument
  • DataSet.ReadXml()
  • XPathDocument

All of these options load the entire file into system memory.

The problem is that if the XML file is, for example, 5 GB to 7 GB, the complete file has to fit in system memory. This consumes a great deal of memory and is likely to end in a System.OutOfMemoryException.

The best approach for such large files is to avoid loading them into memory at all.

The .NET Framework provides the XmlTextReader class for this. XmlTextReader is a forward-only reader that processes the XML file node by node (not line by line), so we hold only the current node in memory instead of the complete file.

Here is a code snippet that shows how to use the XmlTextReader class:

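using (XmlTextReader myTextReader = new XmlTextReader(filename))
{
    myTextReader.WhitespaceHandling = WhitespaceHandling.None;
    while (!myTextReader.EOF)
    {
        if (myTextReader.NodeType == XmlNodeType.Element &&
            myTextReader.LocalName == "Reward")
        {
            // ProcessRewardNode calls ReadOuterXml(), which already moves the
            // reader past </Reward>. Calling Skip() or Read() here as well
            // would silently drop the next element.
            ProcessRewardNode(myTextReader);
        }
        else
        {
            // Any other node: advance to the next one.
            myTextReader.Read();
        }
    }
}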
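
To try the loop without a multi-gigabyte file, here is a minimal, self-contained sketch that streams a tiny in-memory document. The sample XML, the RewardStreamDemo class, and the ReadRewardFragments helper are made-up names for illustration only. The sketch uses XmlReader.Create, which Microsoft recommends over constructing XmlTextReader directly since .NET Framework 2.0. Because ReadOuterXml() leaves the reader on the node after the closing tag, the sketch calls Read() only when it did not just consume an element.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class RewardStreamDemo
{
    // Hypothetical helper: yields the outer XML of each <Reward> element,
    // so only one element is materialized at a time.
    public static IEnumerable<string> ReadRewardFragments(TextReader input)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;

        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element &&
                    reader.LocalName == "Reward")
                {
                    // ReadOuterXml() moves past </Reward>; no Read() or
                    // Skip() afterwards, or the next element is missed.
                    yield return reader.ReadOuterXml();
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    public static void Main()
    {
        // Made-up sample data standing in for a very large file.
        string xml =
            "<Rewards>" +
            "<Reward><myID>1</myID></Reward>" +
            "<Reward><myID>2</myID></Reward>" +
            "<Reward><myID>3</myID></Reward>" +
            "</Rewards>";

        foreach (string fragment in ReadRewardFragments(new StringReader(xml)))
        {
            Console.WriteLine(fragment);
        }
    }
}
```

For a real file, passing the file path to XmlReader.Create instead of a StringReader should keep memory use bounded by the size of the largest single Reward element.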

Here is the method implementation of ProcessRewardNode:

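private void ProcessRewardNode(XmlTextReader rewardReader)
{
    // Only this single <Reward> element is loaded into memory.
    // ReadOuterXml() also advances the reader past the element.
    XmlDocument rewardXmlDoc = new XmlDocument();
    rewardXmlDoc.LoadXml(rewardReader.ReadOuterXml());
    // We can use XPath on the small fragment as below.
    XmlNode idNode = rewardXmlDoc.SelectSingleNode("Reward/myID");
    // SelectSingleNode returns null when the element is missing.
    string myID = (idNode != null) ? idNode.InnerText : null;
}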
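
SelectSingleNode returns null when the XPath matches nothing, so calling InnerText on the result directly throws a NullReferenceException whenever an element is optional. Below is a minimal sketch of a null-safe lookup. The GetChildText helper, the RewardFragmentDemo class, and the sample fragment are hypothetical and exist only for illustration.

```csharp
using System;
using System.Xml;

public static class RewardFragmentDemo
{
    // Hypothetical helper: returns the text of the node matched by xpath,
    // or fallback when no such node exists in the fragment.
    public static string GetChildText(string fragmentXml, string xpath, string fallback)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(fragmentXml);

        // SelectSingleNode returns null when nothing matches.
        XmlNode node = doc.SelectSingleNode(xpath);
        return node != null ? node.InnerText : fallback;
    }

    public static void Main()
    {
        // Made-up fragment: myID is present, Client_Address is absent.
        string fragment = "<Reward><myID>42</myID></Reward>";

        Console.WriteLine(GetChildText(fragment, "Reward/myID", "(none)"));
        Console.WriteLine(GetChildText(fragment, "Reward/Client_Address", "(none)"));
    }
}
```

Run as written, this prints 42 for the element that exists and (none) for the missing one, without throwing.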

The code is largely self-explanatory, so I will not discuss it further here. See the MSDN documentation for XmlTextReader for more information.

Hope this helps!

Jay Ganesh


License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Sandeep Ramani
Software Developer (Senior) CapGemini Pvt Ltd
India
Sandeep is a passionate .NET developer.

He is a Microsoft Certified Technology Specialist in Web Applications Development with Microsoft .NET Framework 4.

He was also named a Microsoft Community Contributor for 2011.

He has received several awards at various forums, and several of his articles were listed as "Article of the day" on the official Microsoft ASP.NET website, www.asp.net.

He holds an MCA from Gujarat University.
 
Visit his Blog:
http://ramanisandeep.net


Area of Expertise:
C#, ASP.NET, AJAX, JavaScript, jQuery, JSON, XML, XSLT, ADO.NET, Web Services, WCF, SSIS 2005, SQL Server 2005/2008
 
He is fond of movies, music, cricket, hockey and boxing.


Comments and Discussions

 
Regarding myID = RewardXmlDoc.SelectSingleNode("Reward/myID").InnerText;
Angsuman Chakraborty, 5 Mar '11 - 21:15
 
Hi there Sandeep Ramani,
Thanks for this good article. It will really help developers.

I have a question about the SelectSingleNode method.

I have a DataSet which writes an XML file using DataSet.WriteXml() in .NET 4.0.

The problem is that if a particular cell in the data table has an empty or null value, the WriteXml() method skips that element.

Suppose I have a column in the data table, let's call it client_address. Some clients may not provide their address.
If client_address is empty, the WriteXml() method does not write the client_address element.
It is skipped.

So when I use
myclientAddress = RewardXmlDoc.SelectSingleNode("Reward/Client_Addess").InnerText; it gives me an error, since this tag doesn't exist.

My question is: how can I determine whether this element exists in the XML file?

Last Updated 13 Feb 2011
Article Copyright 2011 by Sandeep Ramani
Everything else Copyright © CodeProject, 1999-2013