Click here to Skip to main content
Click here to Skip to main content
Go to top

Introduction to DataSets and working with XML files

, 18 Jul 2002
Rate this:
Please Sign up or sign in to vote.
This article gives you an introduction to .NET's DataSets and how you can use them with XML files
<!-- Download Links --> <!-- Add the rest of your HTML here -->


This article gives you an introduction to using DataSets and how to use them with XML files. Working with them really makes your life easier when you want to share data from a data source and you are thinking of XML.

System Requirements

To compile the solution you need to have Microsoft Visual Studio .NET installed. To run any of the client executable you need to have the .NET framework installed.

The sample provided is a simple application written in C#. It displays a form with a DataGrid. By default, the application connects to a SQL Server, binds the Northwind database and fetches some parent records from table Customers and some child records from table Orders. By default, the application assumes that you have an instance of SQL Server installed on your machine. If this is not the case, then you must manually modify the connection string and then rebuild the sample.

Screenshot 1 Scrennshot 2

Then, you can save the dataset to XML file. Schema information is also saved.

What is a DataSet?

A DataSet object is a very generic object that can store cached data from a database in a very efficient way. It is member of the System::Data namespace.

One obvious question is: When to use a dataset? Well, the answer is: it depends. You should consider that a dataset is a collection of in-memory cached data. So it's good to use datasets when:

  • You are working with multiple separated tables or tables from different data sources.
  • You are exchanging data with another application such as a Web Service.
  • You perform extensive processing with the records in the database. If you use a SQL query every time you need to change something, processing each record may result in connection being held open which may affect performance.
  • You want to perform XML/XSLT operations on the data.

You should not use a dataset if:

  • You are using Web Forms in your application because Web Forms and their controls are recreated each time a page is requested by the client. Thus, creating, filling and destroying a dataset each time will be inefficient unless you plan to cache it between roundtrips.

A DataSet has a DataTableCollection object as a member that is nothing but a collection of DataTable objects. You declare a DataSet object and add tables to it like this (in Managed C++):

  // Declare the DataSet object
  DataSet* MyDataSet = new DataSet ("MyDataSet"); // give it a name here

  // Add two tables to it
  // - add a Table named Table1
  DataTable* Table1 = MyDataSet->Tables->Add ("Table1"); 
  // - add a Table named Table2
  DataTable* Table2 = MyDataSet->Tables->Add ("Table2"); 

You can refer the tables later either using the pointers returned by the Add method or like this:

  DataTable* table = MyDataSet->Tables->Item[0]; // or 
  DataTable* table = MyDataSet->Tables->Item["Table1"]; 
                                                  // isn't this indexation cool?

A DataTable object has two important members: Rows and Columns. Rows is a DataRowCollection object and Columns is a DataColumnCollection. DataRowCollection is a collection of DataRow objects and DataColumnCollection is a collection of DataColumn objects. I am sure you can easily figure out what these objects represent. Smile | :)

Adding data to a data set is straight-forward:

  // adding data to the first table in the DataSet
  DataTable* Table1 = MyDataSet->Tables->Item[0];

  // add two columns to the table
  Table1->Columns->Add ("Column1");
  Table2->Columns->Add ("Column2");

  // get the collection of rows
  DataRowCollection* drc = Table1->Rows;

  // create a vector of Objects that we will insert in current row
  Object* obj[] = new Object* [2];
  obj[0] = new String ("Item 1");
  obj[1] = new String ("Item 2");

  // add them to the dataset
  drc->Add (obj);

If you want to specify the data type of a particular column you should use the DataColumn::DataType property. You can set any of the following data types: Boolean, Byte, Char, DateTime, Decimal, Double, Int16, Int32, Int64, SByte, Single, String, TimeSpan, UInt16, UInt32, UInt64.

That's it! Well, that's all you have to do if you want to build up your data set manually. This is not how it is done if you want to connect to a real data source.

Binding a Database

To connect to a database server (such as SQL Server) and fill a dataset from there you need three additonal objects: a Connection, a DataCommand and a DataAdapter object.

The Connection object is used to connect to the database object. You must provide a connection string to it. Also, if your application is using transactions, you must attach a transaction object to the connection.

The DataCommand object is used to sending commands to the database server. In includes four System::String objects named SelectCommand, InsertCommand, UpdateCommand and DeleteCommand. I believe it is obvious what these string objects represent, nothing else but the four basic SQL operations.

The DataAdapter object is the object that does all the magic work. It fills the DataSet with data from the database and updates data from the DataSet to the database.

You may now wonder what are the classes that correspond to the objects described above. Well, Microsoft have prepared two sets of classes - you can't say life is getting much easier in .NET can you? Smile | :) . The first set is based on OLE DB and is part of the System::Data::OleDb namespace. It contains the following classes: OleDbConnection, OleDbCommand and OleDbDataAdapter. The other set of classes is optimized for working with Microsoft SQL Server. It is part of the System::Data::SqlClient namespace and its classes include: SqlConnection, SqlCommand and SqlDataAdapter.

Here is a complete example of how to connect to the database and fill the dataset. I have used the classes optimized for SQL Server. If you need to use OLE DB you only have to replace "Sql" with "OleDb". We try to fetch two tables Table1 and Table2 and set a parent-child relationship between them.

  // Create a database connection string
  String* str = new String ("user id=sa;password=;initial catalog=MyDB;"
                            "data source=(local)");
  // Create the database connection object
  SqlConnection* sqlcon = new SqlConnection (str);

  sqlcon->Open (); // open it

  // create the first SQL query
  String* strTable1 = String::Format ("SELECT * FROM Table1 " 
    "WHERE Field1 = {0}", FieldID.ToString ());    // FieldID is 
                   // an interger used to filter our query.

  // create the second SQL query. It joins the first table to select only those
  // fields in relation
  String* strTable2 = String::Format ("SELECT T2.* FROM Table2 T2 "
                   "INNER JOIN Table1 T1 ON T2.ParendID = T1.ID "
                   "WHERE T1.Field1 = {0}", Field1.ToString ());    // FieldID is 
                   // an interger used to filter our query.

  // create the SQL command objects. We pass in the contructor the 
  // SqlConnection object and the query string
  SqlCommand* sqlTable1 = new SqlCommand (strTable1, sqlcon);
  SqlCommand* sqlTable2 = new SqlCommand (strTable2, sqlcon);

  // Create a data adapter for every table. We pass the SqlComand objects as parameters
  SqlDataAdapter* Table1Adapter = new SqlDataAdapter (sqlTable1);
  SqlDataAdapter* Table2Adapter = new SqlDataAdapter (sqlTable2);

  // now we create the dataset object and we give it a name
  DataSet* MyDataSet = new DataSet ("MyDataSet");

  // we inform the dataset object about the tables it is going to contain
  // by adding those tables to the dataset. 

  DataTable* Table1 = BackupDataSet->Tables->Add ("Table1");
  DataTable* Table2 = BackupDataSet->Tables->Add ("Table2");

  // now we are filling the Dataset using the Dataadapter objects
  // We need not say anything to the DataSet object about the 
  // columns of the table and their data type. The DataAdapter objects
  // takes care of everything.

  Table1Adapter->Fill (Table1);
  Table2Adapter->Fill (Table2);

  // To ensure relationships between Tables we must add a DataRelation object
  // We assume that between column 0 in Table1 and column 1 in Table2 there is 
  // a one-to-many relationship

  MyDataSet->Relations->Add (Table1->Columns->Item[0],

For details about relations and constraints you should read about the Constraint class and the two classes derived from it, ForeignKeyConstraint and UniqueConstraint.

Working with XML files

DataSets can work with XML files very easily. There are two methods to serialize a DataSet object. These are DataSet::WriteXml and DataSet::WriteXmlSchema. The first one writes data to the XML file and may include also schema information. It is useful when you want to write an XML file that has schema information embedded. However, if you prefer schema information in a separate (.xsd) file you should use the DataSet::WriteXmlSchema method.

There are also plenty of classes in the System::Xml namespace: XmlWriter, XmlReader, XmlTextWriter, XmlDataDocument to name a few. You can use those with a dataset to perform some advanced XML operations. For instance, if you want to write a dataset to an XML file you can either use

  // recevies a DataSet in constructor
  XmlDataDocument* xmlDoc = new XmlDataDocument(MyDataSet);
  XmlTextWriter* Output = new XmlTextWriter ("C:\\Myfile.xml", NULL);

  // perform some formatting
  Output->Formatting = Formatting::Indented;
  Output->Indentation = 2;

  // and write it
  xmlDoc->WriteTo (Output);
  Output->Close ();

or use the DataSet::WriteXml method directly:

MyDataSet->WriteXml ("C:\\MyFile.xml", XmlWriteMode::WriteSchema);

In the latter situation I chose to embed schema information in the file by using one member of the XmlWriteMode enumeration. Other fields of the enumeration are XmlWriteMode::IgnoreSchema if you do not want to include schema information, XmlWriteMode::DiffGram if you want to include both original values of the dataset and any changes. For reading an XML file we use another enumeration: XmlReadMode.

A DataRelation has a property named Nested. When this property is set to true then every time the DataSet will write a record of a parent table, it will also nest all corresponding records in all child tables.

Formatting data in an XML file is very flexible. By default, a new element is created for every column in every table. Assuming you have a table named Table1 with two columns named ID and Name the default output of an XML file will be:

      <Name>name 1</Name>
      <Name>name 2</Name>

If one column is to become an attribute of the node then you will set ColumnMapping property of DataColumn class. For that you must have a look at the MappingType enumeration. Its fields are: Attribute, Element, Hidden and SimpleContent. Choosing Attribute will write the corresponding column as an attribute of the parent node. The output will be like this:

    <Table1 ID="7">
      <Name>name 1</Name>
    <Table1 ID="8">
      <Name>name 2</Name>

SimpleContent means that tags for one column will not be written. If you chose Hidden then that column will not be written at all.

Of course you can combine them very easily. Doing this:

  Table1->Columns->Item[0]->ColumnMapping = MappingType::Attribute;
  Table2->Columns->Item[1]->ColumnMapping = MappingType::SimpleContent;

will give you the following results:

    <Table1 ID="7">name1</Table1>
    <Table1 ID="8">name2</Table1>

Reading from an xml file is just as easy as writing. The simplest options is to use the ReadXml method like this:

  MyDataSet->ReadXml ("C:\\MyFile.xml", XmlReadMode::ReadSchema);

I have read also schema information from the file. This means that the DataSet will automatically detect data type for every column in all tables in the dataset and also any constraints or relations between tables. This is really cool if you want to update a dataset. You change a value in the parent table and all child tables will have the parent value updated. Also, trying to change a value in a child table that does not have a parent value will throw an exception.

Updating the Database

Reading data into a dataset and then updating a database is just as easy as reading from a data source and filling a dataset. Assuming you are reading from an XML file and you are updating a SQL server, you must do the following:

  • Create the DataSet object and read the XML file. If your XML file contains any schema information then the DataSet will detect and create the corresponding tables and enable any constraints automatically.
  • Create a DataAdapter object for every table you want to update.
  • Call the Update method for every DataAdapter object.

Remember that I mentioned four String members of the SqlDataAdapter class? These are: SelectCommand, InsertCommand, UpdateCommand and DeleteCommand. SelectCommand is the query used to fetch data from the database. You can define the other three objects if you want to perform custom update/insert/delete operations. If you do not want to do that you can use either SqlCommandBuilder or OleDbCommandBuilderclass. This class will build those strings automatically.

Whenever you are writing data to a database is good practice to use a transaction to prevent concurrent writes to the database. For this. the .NET framework provides two classes: OleDbTransaction and SqlTransaction respectively.

Here is a sample of reading data from an XML file and the writing to a SQL Server.

  SqlTransaction* SqlTrans; // declare a transaction object
    String* str = new String ("user id=sa;password=;initial catalog=MyDB;"
                              "data source=(local)");
    // Create the database connection object
    SqlConnection* sqlcon = new SqlConnection (str);

    sqlcon->Open (); // open it

    // create the data set object and give it a name
    DataSet* MyDataSet = new DataSet ("MyDataSet");

    // read the XML file
    // I have also read the schema information file
    MyDataSet->ReadXml ("C:\\MyXmlFile.xml", XmlReadMode::ReadSchema);

    // Begin the transaction
    SqlTransaction = sqlcon->BeginTransaction ();

    // create the data adapters
    SqlDataAdapter* Table1Adapter = new SqlDataAdapter("SELECT * FROM Table1", sqlcon);
    SqlDataAdapter* Table2Adapter = new SqlDataAdapter("SELECT * FROM Table2", sqlcon);

    // we have provided only the SelectCommand strings. To update
    // the database we must provide the DeleteCommand, InsertCommand and
    // UpdateCommand also.
    // This can be done automatically with the command builder

    // create the command builders for each data adapter
    SqlCommandBuilder* Table1Command = new SqlCommandBuilder (Table1Adapter);
    SqlCommandBuilder* Table2Command = new SqlCommandBuilder (Table2Adapter);

    // we must specify the transaction used by these adapter.
    Table1Adapter->SelectCommand->Transaction = SqlTrans;
    Table2Adapter->SelectCommand->Transaction = SqlTrans;

    // update the database
    Table1Adapter->Update (MyDataSet, "Table1");
    Table2Adapter->Update (MyDataSet, "Table2");

    // don't forget to commit

    SqlTrans->Commit ();

  catch (Exception* e)
    // if we have started the transaction then rollback it
    if (SqlTrans != NULL)


DataSets provide a very easy to use and powerful way to handle large amounts of data coming from different tables or even data sources. A dataset will cache all data making it useful when you need to read data, perform intensive operations with it and then update it. It also provides full XML support making your life easier to share data across applications such as Web Services.


This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here


About the Author

Alexandru Savescu
Web Developer
United States United States
I work as a Software Project Manager for United Management Technolgies developing Project Portfolio Management applications.
I have a degree in Computer Sciences from Politechnica University of Bucharest.
I enjoy programming very much, my favourite languages is C++. Other interests include basketball, crossword-puzzles and traveling.

Comments and Discussions

GeneralMy vote of 5 PinmemberLeKoosy31-Mar-11 9:41 
Generalautomatically generate create, delete, insert and update methods from an xml file Pinmemberkennyzita28-Sep-09 2:33 
GeneralError when I update database PinmemberMember 470070123-Jul-08 17:55 
GeneralLoading XML files into DataSet Pinmemberdaluu28-Jun-08 18:15 
Generaldatabase to xml, xml to dataset PinmemberSarfaraj Ahmed13-Feb-07 2:43 
GeneralBinding to a dropdownlist Pinmemberanyaalonzo9-Aug-05 10:20 
GeneralRe: Binding to a dropdownlist PinmemberAlexandru Savescu9-Aug-05 20:40 
QuestionHow to use XML in place of database Pinmemberseikou30-May-05 13:20 
AnswerRe: How to use XML in place of database PinprotectorMarc Clifton30-May-05 15:46 
QuestionCan we execute a query to a DataSet Pinmemberbayramakgul7-Jan-05 2:07 
AnswerRe: Can we execute a query to a DataSet PinmemberMyPaq18-May-05 23:16 
QuestionHow to display xml data in rich edit box? Pinmemberguruprasadl29-Oct-04 0:30 
QuestionHow to insert data in a database from an Xml file with relational data!!! Pinmemberpesko29-Oct-04 0:13 
GeneralMore than one xml file PinmemberVipul Bhatt4-Oct-04 7:01 
GeneralRe: More than one xml file PinmemberAlexandru Savescu10-Nov-04 1:03 
GeneralNew Database table from XML file PinsussAnonymous20-Sep-04 14:26 
GeneralRe: New Database table from XML file PinmemberAlexandru Savescu22-Sep-04 5:03 
QuestionHow can I create a relationship in XML Pinmemberpostmannvercome11-Sep-04 16:12 
AnswerRe: How can I create a relationship in XML PinmemberAlexandru Savescu12-Sep-04 21:35 
GeneralRe: How can I create a relationship in XML Pinmemberpostmannvercome13-Sep-04 16:02 
QuestionHow to update datasource from Xml file PinmemberAmi Shah3-Aug-04 0:35 
AnswerRe: How to update datasource from Xml file PinmemberAlexandru Savescu3-Aug-04 1:06 
I am not sure I understand correctly what you are doing but here's my guess.
  • You created a dataset from a XML file with ReadSchema attribute. This means that you loaded schema information from the file, but no data.
  • You updated the XML file. You need to make changes to the dataset object, not the xml file. My suggestion is that you either read both the schema and data from the XML file (using the ReadXml method) after making changes to the file, or you modify the DataSet instead of the file.
    Alexandru Savescu

  • GeneralRe: How to update datasource from Xml file PinmemberAmi Shah3-Aug-04 1:29 
    GeneralRe: How to update datasource from Xml file PinmemberAlexandru Savescu3-Aug-04 1:49 
    GeneralRe: How to update datasource from Xml file PinmemberAmi Shah6-Aug-04 0:51 

    General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

    Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

    | Advertise | Privacy | Mobile
    Web01 | 2.8.140916.1 | Last Updated 19 Jul 2002
    Article Copyright 2002 by Alexandru Savescu
    Everything else Copyright © CodeProject, 1999-2014
    Terms of Service
    Layout: fixed | fluid