A generic bulk insert using DataSets and OpenXML

poodull76, 14 Apr 2005
Create T-SQL command text to update a table with OpenXML quickly and with minimal effort.

Introduction

The project I’m working on now requires a way to drop a significant amount of data into 20+ tables. I went looking for bulk insert solutions, and what I found were articles on MSDN.

The MSDN approach prepares a table as XML using its DataSet’s schema, then sends it to a stored procedure that executes an update and an insert. The problem for my needs was that the MSDN solution requires a function and a stored procedure for each table.

So I spent a day and rolled out the code below. It takes a DataSet, an open SQL connection, and a table name, and writes the command text to execute an OPENXML bulk insert.

The Code

There is one calling function and two support functions. The calling function takes three parameters:

  • a DataSet,
  • an open SQL connection,
  • and a table name.

It begins by switching every column’s ColumnMapping to attribute, just like MSDN’s tutorial, and then streams the DataSet to a StringBuilder as XML. Instead of sending the XML to a stored procedure, it passes it to buildBulkUpdateSql, which creates the remaining T-SQL script.

// Requires: using System.Data; using System.Data.SqlClient;

/// <summary>
/// Takes a DataSet and dynamically creates an OPENXML script around it for
/// bulk inserts.
/// </summary>
/// <remarks>The DataTable must have a primary key, otherwise the script will wipe
/// out the entire table, then insert the DataSet's rows. Composite primary keys are okay.
/// The DataTable's columns must match the target table's columns EXACTLY. A
/// DataTable column "democd" does not work for the SQL column
/// "DemoCD". Any missing or incorrect data is assumed NULL (default).
/// </remarks>
/// <param name="objDS">DataSet containing the target DataTable.</param>
/// <param name="objCon">Open connection to the database.</param>
/// <param name="tablename">Name of the table to save.</param>
public static void BulkTableInsert(DataSet objDS,
                                 SqlConnection objCon, string tablename)
{
    System.Text.StringBuilder sb = new System.Text.StringBuilder(1000);

    // Change the column mapping first so every column is written as an XML attribute.
    foreach (DataColumn col in objDS.Tables[tablename].Columns)
    {
        col.ColumnMapping = System.Data.MappingType.Attribute;
    }

    // Stream the whole DataSet (schema included) into the StringBuilder as XML.
    using (System.IO.StringWriter sw = new System.IO.StringWriter(sb))
    {
        objDS.WriteXml(sw, System.Data.XmlWriteMode.WriteSchema);
    }

    string sqlText = buildBulkUpdateSql(sb.ToString(), objDS.Tables[tablename]);
    execSql(objCon, sqlText);
}
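To see what the attribute mapping does to the XML, here is a minimal sketch. The "Demo" table, its columns, and the AttributeXmlDemo class are hypothetical and exist only for illustration. Unlike the function above, it uses XmlWriteMode.IgnoreSchema to keep the output short.

```csharp
using System.Data;
using System.IO;

public static class AttributeXmlDemo
{
    // Builds a one-table DataSet (hypothetical "Demo" table) and returns its XML
    // with every column written as an attribute instead of a child element.
    public static string BuildXml()
    {
        // "NewDataSet" is also the default DataSetName; the OPENXML row pattern
        // '/NewDataSet/<table>' in the generated script depends on it.
        DataSet ds = new DataSet("NewDataSet");
        DataTable table = ds.Tables.Add("Demo");
        table.Columns.Add("DemoCD", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.PrimaryKey = new DataColumn[] { table.Columns["DemoCD"] };
        table.Rows.Add(1, "first");
        table.Rows.Add(2, "second");

        foreach (DataColumn col in table.Columns)
        {
            col.ColumnMapping = MappingType.Attribute;
        }

        using (StringWriter sw = new StringWriter())
        {
            // IgnoreSchema is an assumption made for brevity; the article writes the schema too.
            ds.WriteXml(sw, XmlWriteMode.IgnoreSchema);
            return sw.ToString();
        }
    }
}
```

Each row should come out as a single Demo element under the NewDataSet root, with DemoCD and Name as attributes. That is the shape an attribute-centric OPENXML call (flag 1) reads.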

This is where the generic T-SQL text is created. The only magic here is escaping the XML so that it can sit inside a T-SQL string literal before it is sent to the SqlCommand. Also note that I’m using the database table itself as the schema in the WITH clause, so that I don’t have to name each column and data type.

static string buildBulkUpdateSql(string dataXml, DataTable table)
{
    System.Text.StringBuilder sb = new System.Text.StringBuilder();
    dataXml = dataXml.Replace(Environment.NewLine, "");
    // Double any single quotes so the XML is a valid T-SQL string literal.
    // The double quotes around attribute values can stay as they are.
    dataXml = dataXml.Replace("'", "''");
    // Init the XML doc.
    sb.Append(" SET NOCOUNT ON");
    sb.Append(" DECLARE @hDoc INT");
    sb.AppendFormat(" EXEC sp_xml_preparedocument @hDoc OUTPUT, '{0}'", dataXml);
    // Delete old data based on the primary key. With no key columns the join
    // condition stays "1 = 1" and every row in the table is deleted.
    sb.AppendFormat(" DELETE {0} FROM {0} INNER JOIN ", table.TableName);
    sb.AppendFormat(" (SELECT * FROM OPENXML (@hDoc, '/NewDataSet/{0}', 1)",
        table.TableName);
    sb.AppendFormat(" WITH {0}) xmltable ON 1 = 1", table.TableName);
    foreach (DataColumn col in table.PrimaryKey)
    {
        sb.AppendFormat(" AND {0}.{1} = xmltable.{1}", table.TableName,
            col.ColumnName);
    }
    // Insert the new data.
    sb.AppendFormat(" INSERT INTO {0} SELECT *", table.TableName);
    sb.AppendFormat(" FROM OPENXML (@hDoc, '/NewDataSet/{0}', 1) WITH {0}",
        table.TableName);
    // Clear the XML doc.
    sb.Append(" EXEC sp_xml_removedocument @hDoc");
    return sb.ToString();
}
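One way to sidestep the string escaping altogether is to pass the XML as a command parameter instead of splicing it into the script. The following is only a sketch of that idea, not the code I tested: the ParameterizedBulkInsert class, its method names, and the @xml parameter name are hypothetical, and it assumes the server accepts an ntext parameter for sp_xml_preparedocument.

```csharp
using System.Data;
using System.Data.SqlClient;
using System.Text;

public static class ParameterizedBulkInsert
{
    // Same delete-then-insert script as buildBulkUpdateSql, except that the XML
    // arrives through the @xml parameter (hypothetical name) rather than a literal.
    public static string BuildSql(DataTable table)
    {
        string name = table.TableName;
        StringBuilder sb = new StringBuilder();
        sb.Append("SET NOCOUNT ON");
        sb.Append(" DECLARE @hDoc INT");
        sb.Append(" EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml");
        sb.AppendFormat(" DELETE {0} FROM {0} INNER JOIN", name);
        sb.AppendFormat(" (SELECT * FROM OPENXML (@hDoc, '/NewDataSet/{0}', 1)", name);
        sb.AppendFormat(" WITH {0}) xmltable ON 1 = 1", name);
        foreach (DataColumn col in table.PrimaryKey)
        {
            sb.AppendFormat(" AND {0}.{1} = xmltable.{1}", name, col.ColumnName);
        }
        sb.AppendFormat(" INSERT INTO {0} SELECT *", name);
        sb.AppendFormat(" FROM OPENXML (@hDoc, '/NewDataSet/{0}', 1) WITH {0}", name);
        sb.Append(" EXEC sp_xml_removedocument @hDoc");
        return sb.ToString();
    }

    // Creates a command with the XML attached as a parameter; no quote
    // doubling or newline stripping is applied to dataXml.
    public static SqlCommand CreateCommand(SqlConnection con, DataTable table, string dataXml)
    {
        SqlCommand cmd = new SqlCommand(BuildSql(table), con);
        cmd.CommandType = CommandType.Text;
        // Assumption: ntext is accepted here; on newer servers nvarchar(max) may be preferable.
        cmd.Parameters.Add("@xml", SqlDbType.NText).Value = dataXml;
        return cmd;
    }
}
```

With an open connection, the caller would run CreateCommand(objCon, table, xml).ExecuteNonQuery() and dispose the command afterwards.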

There’s no magic here. This is just a simple command executor. In my actual app I don’t use this, and I don’t expect you to use it either, but for completeness, here it is:

static void execSql(SqlConnection objCon, string sqlText)
{
    // The connection must already be open; the command is disposed when done.
    using (SqlCommand objCom = new SqlCommand())
    {
        objCom.Connection = objCon;
        objCom.CommandType = CommandType.Text;
        objCom.CommandText = sqlText;
        objCom.ExecuteNonQuery();
    }
}

Drawbacks

As the documentation in the function header says, this procedure assumes a few things:

  • The DataTable must have a primary key (one or more columns) so that the script knows which rows are updates and which are new. Without one, the delete matches every row and wipes out the table before the insert.
  • The existing rows matched by the primary key are deleted and then re-inserted. Whether that is a good idea is debatable. I believe it speeds up the transaction as a whole, but perhaps it is not the most elegant solution. I’m interested in any DBA comments. Doing the update part is easy, but then you would have to do an outer join against the XML table to find the rows to insert, which could take a while depending on the table size.
  • The DataTable column names must match the database table’s column names exactly, including case. This is kind of a bummer, but if you’re auto-magically creating DataSets in the IDE, it shouldn’t matter to you.
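For comparison, the update-then-insert alternative from the second point could be generated along these lines. This is an untested sketch under my own assumptions: the UpsertSqlSketch class and its method are hypothetical, it emits only the two statements (the caller would still wrap them between sp_xml_preparedocument and sp_xml_removedocument), and it overwrites every non-key column, so a missing attribute still ends up as NULL.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Text;

public static class UpsertSqlSketch
{
    // Returns an UPDATE for rows that already exist, followed by an INSERT for
    // rows found only in the XML (outer join, keep rows with no match).
    public static string BuildUpdateThenInsert(DataTable table)
    {
        if (table.PrimaryKey.Length == 0)
        {
            throw new ArgumentException("The table needs a primary key.", "table");
        }

        string name = table.TableName;
        string xmlSource = string.Format(
            "(SELECT * FROM OPENXML (@hDoc, '/NewDataSet/{0}', 1) WITH {0}) x", name);

        // Join condition on every primary key column.
        List<string> keyMatches = new List<string>();
        foreach (DataColumn key in table.PrimaryKey)
        {
            keyMatches.Add(string.Format("t.{0} = x.{0}", key.ColumnName));
        }
        string onClause = string.Join(" AND ", keyMatches.ToArray());

        // SET list covering every column that is not part of the key.
        List<string> assignments = new List<string>();
        foreach (DataColumn col in table.Columns)
        {
            if (Array.IndexOf(table.PrimaryKey, col) < 0)
            {
                assignments.Add(string.Format("{0} = x.{0}", col.ColumnName));
            }
        }

        StringBuilder sb = new StringBuilder();
        if (assignments.Count > 0)
        {
            sb.AppendFormat(" UPDATE t SET {0} FROM {1} t INNER JOIN {2} ON {3}",
                string.Join(", ", assignments.ToArray()), name, xmlSource, onClause);
        }
        sb.AppendFormat(
            " INSERT INTO {0} SELECT x.* FROM {1} LEFT OUTER JOIN {0} t ON {2} WHERE t.{3} IS NULL",
            name, xmlSource, onClause, table.PrimaryKey[0].ColumnName);
        return sb.ToString();
    }
}
```

Whether this beats delete-and-reinsert depends on the table size and indexes, so it would need measuring against real data.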

TODO

Rip out unnecessary XML before creating the insert script. A DataSet can contain multiple tables, all of which are written out by DataSet.WriteXml(). For large DataSets, that is a lot of useless data to send across the wire.
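A possible way to tackle this, sketched under my own assumptions and not yet part of the code above: copy just the target table into a throwaway DataSet and write that instead. The SingleTableXml class and WriteTableXml method are hypothetical names, and the sketch also drops the inline schema on the assumption that OPENXML only needs the row elements.

```csharp
using System.Data;
using System.IO;

public static class SingleTableXml
{
    // Writes only the named table as attribute-mapped XML under a NewDataSet root.
    public static string WriteTableXml(DataSet source, string tableName)
    {
        // Copy() clones structure and rows, so the caller's DataSet is left untouched.
        DataTable copy = source.Tables[tableName].Copy();
        foreach (DataColumn col in copy.Columns)
        {
            col.ColumnMapping = MappingType.Attribute;
        }

        // "NewDataSet" matches the root element the OPENXML row pattern expects.
        DataSet single = new DataSet("NewDataSet");
        single.Tables.Add(copy);

        using (StringWriter sw = new StringWriter())
        {
            // Assumption: the inline schema is not needed, because the WITH clause
            // takes its column definitions from the database table.
            single.WriteXml(sw, XmlWriteMode.IgnoreSchema);
            return sw.ToString();
        }
    }
}
```

The returned string could then be handed to buildBulkUpdateSql in place of the full DataSet XML.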

Conclusion

I’ve tested the heck out of this using small and medium size tables and DataSets. There is plenty of room for improvement and feature enhancement, and I know this won’t work in many professional environments, but it’s a good start for me that I wish I had two days ago. As I mentioned, I'm looking forward to hearing from the community about speeding up this snippet a bit.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

poodull76
Web Developer
United States
Writing for the financial and television industries since 1997. Currently working for Transformational Security in Baltimore, Maryland as Architect.

Last Updated 14 Apr 2005
Article Copyright 2005 by poodull76