Licence CPOL
First Posted 23 Mar 2006

DataSet Caching in Delimited Text or XML Files

By Dan Neilsen | 23 Mar 2006 | Article
This article presents a method of caching DataSets that loads up to six times faster than XML and produces files about one-sixth the size.

Sample Image - datasetcache.jpg

Introduction

A common issue for C# developers is saving the contents of DataSets to local files. The .NET framework provides the ability to write out to XML, but this proves to be extremely slow when reading the data back in, and the XML files generated can also be quite large.

Creating a custom file format can be problematic, however, as DataRows have a number of states and versions that must also be stored. This means a single DataRow may need more than one row of data in the file.
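To make the point about states and versions concrete, here is a minimal sketch using only the standard System.Data types. The class RowVersionsDemo, its methods, and the sample table are hypothetical names introduced for illustration; they are not part of DataSetCache.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class RowVersionsDemo
{
    // Lists the stored versions (Original and/or Current) that a row carries.
    public static List<DataRowVersion> StoredVersions(DataRow row)
    {
        var versions = new List<DataRowVersion>();
        if (row.HasVersion(DataRowVersion.Original)) versions.Add(DataRowVersion.Original);
        if (row.HasVersion(DataRowVersion.Current)) versions.Add(DataRowVersion.Current);
        return versions;
    }

    // Builds a table holding one row in each of the four attached states.
    public static DataTable BuildSample()
    {
        var table = new DataTable("People");
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add("Alice");
        table.Rows.Add("Bob");
        table.Rows.Add("Carol");
        table.AcceptChanges();             // all three rows are now Unchanged

        table.Rows[1]["Name"] = "Robert";  // Modified: Original = Bob, Current = Robert
        table.Rows[2].Delete();            // Deleted: only the Original version remains
        table.Rows.Add("Dave");            // Added: only a Current version exists
        return table;
    }
}
```

Looping over RowVersionsDemo.BuildSample().Rows and printing each row's RowState alongside StoredVersions(row) shows why a flat file needs more than one record for some rows: the Modified row holds two different sets of values, and both must survive a save and reload.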

The DataSetCache class allows the developer to easily dump a dataset to a delimited text file without any loss of DataRow data. It takes into account the state and the versions of DataRows and, when saved and recovered, the DataSet will be in exactly the same state as before.

The other major advantage of this class is that its cache files have proven to be approximately one-sixth the size of the same data stored as XML and, more importantly, load back into the DataSet up to six times faster.

This is my first article and, in fact, I am relatively new to C#. If you find any parts of this code that could be performed in a cleaner or faster way, please let me know.

Class Limitations

Every piece of code has its limitations, and this is no exception:

  • DataRows with a DataRowState.Detached state are not supported. By definition, a detached row does not belong to any DataTable, so it never appears in the DataSet being saved.
  • DataRows with a DataRowVersion.Proposed version are not supported. A Proposed version exists only for detached rows or while a row is mid-edit (between BeginEdit and EndEdit), so rows whose edits have been completed will not have one.
  • The form feed character (\f) is used as the delimiter. Any \f characters already present in the data are not filtered or escaped before the cache file is written (because of the slowdown this would cause), so data containing \f may not load back correctly.
  • Standard types have been tested (int, string, DateTime, etc.), but there may be some types that can cause issues. If so, please let me know.
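Because the class does not filter the delimiter, a caller who is unsure about the data could check for it before saving. The sketch below is one possible approach, not something DataSetCache provides: DelimiterCheck and ContainsFormFeed are hypothetical names, and the code uses only standard System.Data calls.

```csharp
using System;
using System.Data;

static class DelimiterCheck
{
    // Hypothetical helper (not part of DataSetCache): returns true if any
    // string value in any stored row version contains a form feed ('\f').
    public static bool ContainsFormFeed(DataSet data)
    {
        DataRowVersion[] versions = { DataRowVersion.Original, DataRowVersion.Current };

        foreach (DataTable table in data.Tables)
        {
            foreach (DataRow row in table.Rows)
            {
                foreach (DataRowVersion version in versions)
                {
                    // Added rows have no Original; Deleted rows have no Current.
                    if (!row.HasVersion(version)) continue;

                    foreach (DataColumn column in table.Columns)
                    {
                        string text = row[column, version] as string;
                        if (text != null && text.IndexOf('\f') >= 0)
                            return true;
                    }
                }
            }
        }
        return false;
    }
}
```

The scan adds the same kind of overhead the author chose to avoid, so it is probably best reserved for data from untrusted or free-text sources. It only inspects string columns; other types whose text form could contain \f are not covered.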

Using the Code

A DataSetCache object can be created in several ways:

//The path where the cache files are (or will be)
//(if not specified app path is used)
string dataPath = Environment.CurrentDirectory + "\\";

//This determines the type of cache method. 
//Options are SCSV or SXML. (If not specified SCSV is used) 
cacheType cType = cacheType.SCSV;

//This is a database name to be included in the cache 
//file names (If not specified "default" is used) 
string dBaseName = "mydbase";

//Method A 
DataSetCache myCache = new DataSetCache(); 
myCache.CacheType = cType; 
myCache.DataPath = dataPath; 
myCache.DBase = dBaseName; 

//Method B 
DataSetCache myCache = new DataSetCache(dataPath, cType, dBaseName); 

//Method C 
DataSetCache myCache = new DataSetCache(dataPath);

Saving the DataSet contents to a cache file is quite a simple process:

//Method A (where myData is an existing DataSet)
DataSetCache myCache = new DataSetCache();
myCache.CacheType = cType;
myCache.DataPath = dataPath;
myCache.DBase = dBaseName;
myCache.syncCacheError += new syncMessageDelegate(myCache_syncCacheError);
myCache.syncCacheSaveStart += new syncEventDelegate(myCache_syncCacheSaveStart);
myCache.syncCacheSaveEnd += new syncEventDelegate(myCache_syncCacheSaveEnd);
myCache.saveCache(myData);

//Method B (if you don't want to use events)
new DataSetCache(dataPath, cType, dBaseName).saveCache(myData);

Loading the contents of a cache file to a DataSet is also quite a simple process:

//Method A
DataSetCache myCache = new DataSetCache();
myCache.CacheType = cType;
myCache.DataPath = dataPath;
myCache.DBase = dBaseName;
myCache.syncCacheError += new syncMessageDelegate(myCache_syncCacheError);
myCache.syncCacheLoadStart += new syncEventDelegate(myCache_syncCacheLoadStart);
myCache.syncCacheLoadEnd += new syncEventDelegate(myCache_syncCacheLoadEnd);
DataSet myData = myCache.loadCache(); 

//Method B (if you don't want to use events)
DataSet myData = new DataSetCache(dataPath, cType, dBaseName).loadCache();

The Test Application

The test application contains a number of buttons. Their use is explained below:

  • Fill Table with Default - Creates a DataSet with some default test information in it.
  • Load Cache - Loads the cache file information into the DataSet.
  • Save Cache - Saves the existing DataSet data into the cache file.
  • Row controls:
    • Edit - Replaces the current record data with the edited data in the edit fields.
    • Insert - Adds a new record with data taken from the edit fields.
    • Delete - Deletes the current record.
    • Accept - Accepts the changes to the current record.
  • Accept All Changes - Accepts all changes within the entire DataSet.

Points of Interest

A good example of file size and loading time difference between XML and the DataSetCache SCSV method is a very large DataSet I used to test the system.

The generated XML file was approximately 171 MB and took approximately 70 seconds to load. The same data in a DataSetCache SCSV cache file was 25 MB and took only 19 seconds to load, making it roughly one-seventh the size and about 3.7 times faster to load.
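Readers who want to measure the XML baseline on their own data could use something like the sketch below. It is an assumption-laden illustration, not the author's test harness: XmlBaseline, BuildSample, and TimeXmlLoad are hypothetical names, the sample table is made up, and the article does not say which XML write mode its figures were taken with (WriteSchema is used here; XmlWriteMode.DiffGram is the mode that also records row versions).

```csharp
using System;
using System.Data;
using System.Diagnostics;
using System.IO;

static class XmlBaseline
{
    // Builds a throwaway DataSet; the table and column names are made up.
    public static DataSet BuildSample(int rowCount)
    {
        var data = new DataSet("Demo");
        DataTable table = data.Tables.Add("Items");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        for (int i = 0; i < rowCount; i++)
            table.Rows.Add(i, "Item " + i);
        data.AcceptChanges();
        return data;
    }

    // Writes the DataSet as XML, reports the file size, then times the reload.
    public static TimeSpan TimeXmlLoad(DataSet data, string path, out long fileBytes)
    {
        data.WriteXml(path, XmlWriteMode.WriteSchema);
        fileBytes = new FileInfo(path).Length;

        Stopwatch timer = Stopwatch.StartNew();
        var loaded = new DataSet();
        loaded.ReadXml(path, XmlReadMode.ReadSchema);
        timer.Stop();
        return timer.Elapsed;
    }
}
```

Timing myCache.loadCache() with the same Stopwatch pattern, on the same DataSet, would give a like-for-like comparison. Results will vary with row count, column types, and disk speed.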

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).

About the Author

Dan Neilsen

Web Developer

Australia

Last Updated 23 Mar 2006
Article Copyright 2006 by Dan Neilsen
Everything else Copyright © CodeProject, 1999-2012