Smum County IO

Posted 3 Apr 2006 · Licensed under CPOL
An ADO.NET toolbox that is meant to serve as the lowest level database IO layer for an application based on an Access, Microsoft SQL Server, or Oracle database.


The Smum County IO class is an ADO.NET toolbox that is meant to serve as the lowest-level data IO layer for an application based on an Access, Microsoft SQL Server, or Oracle database. It serves as a programmer-friendly interface to the intricacies of ADO.NET, allowing the user to concentrate on coding instead of trying to remember how to fill a DataSet. The class implements the following basic data access functions:

  • Read Data
  • Write Data
  • Manage Transactions
  • Open/Close an ADO Connection

Additionally, this class is meant to conceal the differences between Access, Microsoft SQL Server, and Oracle database access. If this class is used consistently to perform database IO, we can create an application that is compatible with any of these databases without writing database specific code.

Stored Procedures vs. Dynamic SQL

As any database programmer will tell you, stored procedures are generally faster than executing dynamic SQL statements against the backend. Stored procedures are not used here, however, in order to achieve the desired database neutrality.


DBTools

DBTools is an interface I've created to organize the database-specific code in the IO class. Instead of branching the code within the IO class wherever the code for one database backend differs from another, I call a method of the DBTools interface which handles the difference. I've then created a class that implements the DBTools interface for each supported backend: Access, MS SQL Server, and Oracle. This keeps the IO class code clean while making it easier to extend this project to other databases. Some of the methods in the DBTools classes instantiate the database-specific classes behind the generic ADO.NET interfaces, and others simply wrap database-specific functions. These methods will be discussed throughout the rest of this article.
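The same organizing idea can be sketched in a few lines of Python (an illustrative analogy with hypothetical names, not part of the Smum library): a small per-backend "tools" class supplies only the database-specific pieces, and the generic IO code calls through that interface.

```python
# Illustrative sketch of the DBTools idea (hypothetical names, not the C# API).
class DBTools:
    """Backend-specific details live here, not in the generic IO code."""
    param_prefix = "@"

    def get_sql_param_name(self, column):
        return self.param_prefix + column

    def get_sql_column_name(self, column):
        return column

class OracleTools(DBTools):
    param_prefix = ":"  # Oracle placeholders use a colon prefix

class AccessTools(DBTools):
    def get_sql_column_name(self, column):
        return "[" + column + "]"  # Access reserved words need brackets

def backend_tools(db_type):
    # The IO class would pick the right tools object once, in its constructor.
    return {"oracle": OracleTools, "access": AccessTools}.get(db_type, DBTools)()
```

With this shape, adding a new backend means adding one small class rather than touching every branch in the IO code.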

Connecting to a data source

Before the IO class can be used, it's necessary to indicate the type of database to be accessed and the connection parameters used to connect to the database. This is done in the class constructor:

public IO(DBType dbType, string dataSource, string userID, 
          string password, string initialCatalog)

Overloaded methods for the constructor allow you to specify the connection parameters without the initialCatalog and without the userID and password. The initialCatalog is only needed when connecting to an MS SQL Server in order to indicate the database to which to connect, and the userID and password won't be necessary if you are connecting to an Access database which is not secured with these parameters.

So to initialize the object, simply do this:

io = new Smum.IO(Smum.IO.DBType.MSSQL, "ServerName", "UserID", "Password", "Database");

Read methods

Once the IO object has been instantiated, reading data from the database is as simple as calling one of the overloaded Read methods. Separate Read methods allow you to read a scalar value, DataSet, DataTable, DataRow, or DataRowCollection. In any case, all the Read methods except for the one that reads a scalar value eventually call the Read method for a DataSet. So, let's take a closer look at this method:

public ReturnStatus Read(ref DataSet dataSet, string sql, string tableName)
{
    try
    {
        if (dataSet == null) dataSet = new DataSet();

        //Keep track of whether this table is new to this dataset.
        bool newTable = (tableName == "" || dataSet.Tables[tableName] == null);

        FillDataSet(ref dataSet, sql, tableName);

        //If no table name was specified, get the name
        //of the table as indicated by the dataset.
        if (tableName == "")
            tableName = dataSet.Tables[dataSet.Tables.Count - 1].TableName;

        //If this table is new to this dataset and any columns
        //are AutoIncrement columns, set the
        //AutoIncrementSeed and AutoIncrementStep to -1
        //so that automatically assigned values in the
        //DataSet will be negative while the ones in the database
        //will be positive, thus avoiding conflicts.
        if (newTable)
            foreach (DataColumn column in dataSet.Tables[tableName].Columns)
                if (column.AutoIncrement)
                {
                    column.AutoIncrementStep = -1;
                    column.AutoIncrementSeed = -1;
                }

        //Return Success or NotFound.
        if (dataSet.Tables[tableName].Rows.Count == 0)
            return ReturnStatus.NotFound;
        else
            return ReturnStatus.Success;
    }
    catch (System.Exception e)
    {
        throw new Smum.Exception("Error occurred while reading data using: '" + sql + "'", e);
    }
}

All the real work of reading the data is done in the FillDataSet routine, which I'll discuss below. Note that you may pass in a DataSet that already contains data; the new data will be added to it. The tableName parameter indicates the name of the table in the DataSet that will be filled with the data read.

Notice that if you are populating a new table in the DataSet, and this table contains autoincrement columns, I am initializing the seed and step of these columns so that all new rows added to the table will have negative values and should not conflict with the values of existing rows in the database, which should be positive. If you don't do this, and one of these columns has a unique index, you'll very quickly run into trouble when you bind your DataSet to a DataGrid, which allows the creation of new rows. The DataSet may create a value for the autoincrement column in a new row that conflicts with the column value in a row that already exists in the DataSet, generating an error. Lastly, this method returns a status of Success or NotFound which is standard for all the Read methods.
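The negative-seed idea can be sketched on its own (illustrative Python, not the ADO.NET DataSet mechanism): hand out temporary keys counting down from -1, so they can never collide with the positive keys the database will assign.

```python
# Sketch of the negative temporary-key idea (hypothetical class, not ADO.NET).
class TempKeyAllocator:
    """Mimics AutoIncrementSeed = -1, AutoIncrementStep = -1."""
    def __init__(self, seed=-1, step=-1):
        self.next_key = seed
        self.step = step

    def new_key(self):
        key = self.next_key
        self.next_key += self.step
        return key

alloc = TempKeyAllocator()
# New unsaved rows get -1, -2, -3, ... while database-assigned keys are positive.
```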

Now for FillDataSet:

private void FillDataSet(ref DataSet dataSet, string sql, string tableName)
{
    IDbCommand command =
        dbTools.CreateCommand(sql, Connection, Transaction.Transaction);
    IDbDataAdapter adapter = dbTools.CreateDataAdapter(command);

    if (tableName == "")
    {
        //Fill the schema for the table.
        adapter.FillSchema(dataSet, SchemaType.Source);

        //Fill the dataset.
        adapter.Fill(dataSet);
    }
    else
    {
        //Fill the schema for the table if the table does not yet have any data.
        if (dataSet.Tables[tableName] == null ||
                dataSet.Tables[tableName].Rows.Count == 0)
            dbTools.FillSchema(adapter, dataSet, tableName);

        //Fill the dataset.
        dbTools.FillDataSet(adapter, dataSet, tableName);
    }
}

This is all pretty standard ADO.NET code. I instantiate an IDbCommand object, passing in the SQL statement, current connection, and current transaction (more on transactions below). Then, I instantiate an IDbDataAdapter object and use this to fill the schema of the data read and to read the data. In order to make this standard code database specific, I rely on the dbTools object which has earlier been instantiated as the correct type to handle the current database backend. The CreateCommand method of this class creates the IDbCommand object which is specific to the current database backend, and the CreateDataAdapter method does the same for the IDbDataAdapter object.

Note that if the table name has been specified in the read, I am calling a method of the dbTools object in order to fill the schema and read the data. This is necessary because the FillSchema and Fill methods of the IDbDataAdapter interface don't allow you to specify a table name, but the database specific implementations of IDbDataAdapter do allow this. So, in order to specify a table name for FillSchema and Fill, I coerce the IDbDataAdapter object to the correct data adapter type for the specific database and then call the FillSchema or Fill method on this coerced object. Since this code is database specific, I've placed it in the backend specific DBTools classes. For example, here's the SqlTools implementation of FillSchema:

public void FillSchema(IDbDataAdapter adapter, DataSet dataSet, string tableName)
{
    SqlDataAdapter nativeAdapter = adapter as SqlDataAdapter;
    if (nativeAdapter != null)
        nativeAdapter.FillSchema(dataSet, SchemaType.Source, tableName);
    else
        adapter.FillSchema(dataSet, SchemaType.Source);
}

Note that I never bother to explicitly open the connection to the database. I leave this up to the DataAdapter. If the connection is closed when the Fill method is called, the DataAdapter will open the connection, read the data, and then close the connection. If the connection is open when the Fill method is called, the DataAdapter will read the data and leave the connection open. This is just what I want.

The DataTable, DataRow, and DataRowCollection read methods all call this DataSet read method and then return the DataTable, DataRow, or DataRowCollection read. If you want to use the Write methods of the Smum.IO class to write the changes back to the database, it is important that every DataTable, DataRow, and DataRowCollection read contain only data from a single table and that you pass the name of this table to the Read method when reading the data. This is how the Write methods know what table to write the data back to. If you're not planning on changing the data read or you don't plan to use the Write methods to update the database, then the table name passed to the Read methods doesn't matter.

Write methods

This is where the use of the Smum.IO class really pays off. After making changes to the data read through the Read methods, you can simply write the data back to the database by calling one of the overloaded Write methods. One caveat, however: in order to write out data using these methods, the TableName of each DataTable to be written must be set, and the DataTable must have a primary key (one or more key columns). This is how the Write methods identify the table to be written to and the rows within the table to be updated.

There is a Write method which corresponds to each of the Read methods to allow you to write data from a DataSet, DataTable, DataRow, or DataRowCollection. Each of these routines essentially routes to the Write method for a DataRow. The DataSet write calls the DataTable write for each of the tables in the DataSet, which calls the DataRowCollection write for its DataRowCollection, which calls the DataRow write for each of the rows in the collection.

Let's examine the DataRow write method:

public void Write(DataRow dataRow)
{
    bool transactionBegun = false;
    try
    {
        if (dataRow.RowState != DataRowState.Unchanged)
        {
            //Insure that the table name of the datarow
            //was previously set. If not, throw an exception.
            if (dataRow.Table.TableName == "Table" || dataRow.Table.TableName == "")
                throw new Smum.Exception("TableName of DataSet Table has not been set.");

            DataColumnCollection dataColumns = dataRow.Table.Columns;
            DataColumn[] keyColumns = dataRow.Table.PrimaryKey;

            if (keyColumns.Length == 0)
                throw new Smum.Exception("No primary key has been defined.");

            string sql = "";
            IDbCommand command = dbTools.CreateCommand("", Connection,
                                                       Transaction.Transaction);
            bool autoIncrementedKey = false;
            string tableName = dbTools.GetSqlTableName(dataRow.Table.TableName);

            //If the row is not attached to a table, add it to its parent table.
            if (dataRow.RowState == DataRowState.Detached)
                dataRow.Table.Rows.Add(dataRow);

            switch (dataRow.RowState)
            {
                case DataRowState.Added:
                    //<See code snippet for DataRowState.Added below>
                    break;
                case DataRowState.Deleted:
                    //<See code snippet for DataRowState.Deleted below>
                    break;
                case DataRowState.Modified:
                    //<See code snippet for DataRowState.Modified below>
                    break;
            }

            //If we need to read the value of an autoincremented
            //key after an insert, begin a transaction.
            if (autoIncrementedKey)
            {
                Transaction.Begin();
                transactionBegun = true;
            }

            if (sql != "")
            {
                command.CommandText = sql;
                ExecuteSQL(command);
            }

            if (autoIncrementedKey)
            {
                object newKey = null;
                Read(ref newKey, "Select @@Identity");
                Transaction.Commit();
                transactionBegun = false;

                dataRow.Table.Columns[keyColumns[0].ColumnName].ReadOnly = false;
                dataRow[keyColumns[0].ColumnName] = newKey;
            }

            //Accept the row changes so that the RowState will be Unchanged.
            if (dataRow.RowState != DataRowState.Deleted) dataRow.AcceptChanges();
        }
    }
    catch (System.Exception e)
    {
        if (transactionBegun) Transaction.Rollback();
        throw new Smum.Exception("Error occurred while writing data.", e);
    }
}

In a nutshell, this routine determines whether the row has been added, deleted, or modified, and generates the proper SQL statement to change the database accordingly. An ADO command object is used to execute the SQL statement.

To begin, we test to see if the row is unchanged. There is no need to update a row that hasn't changed. We then ensure that the TableName of the table the DataRow belongs to has been set, and we figure out what the key fields are for the table. This is done by examining the PrimaryKey property of the associated DataTable. This is one of the reasons we had to fill the schema of the DataSet in the Read routine. If we hadn't, this information would not be available. If there are no key fields for the table, we throw an exception.

We then instantiate a command object specific to the backend database, by calling the CreateCommand method of the dbTools object just as we had before when reading data.

After adding a detached DataRow to its associated table, we get down to the business of building the SQL statement. Here's the code snippet for an added DataRow:

case DataRowState.Added:
    sql = "Insert Into " + tableName;

    string columns = "(";
    string values = "(";

    for (int i = 0; i < dataColumns.Count; i++)
    {
        string column = dataColumns[i].ColumnName;

        //Don't include this column in the insert if it is an autoincrement
        //column whose value has not yet been set. A negative value will
        //indicate that the value was set by the DataSet and not by the database.
        if (!(dataColumns[i].AutoIncrement && Tools.ToInt32(dataRow[i], 0) <= 0))
        {
            //Add command parameter for value.
            values += dbTools.GetSqlParamName(column) + ", ";
            command.Parameters.Add(dbTools.CreateDataParameter(column, dataRow[i]));
            columns += dbTools.GetSqlColumnName(column) + ", ";
        }
        else
        {
            //If this autoincrement column is the primary key, we will read
            //its value from the database after inserting.
            if (column == keyColumns[0].ColumnName)
                autoIncrementedKey = true;
        }
    }

    //Remove the trailing commas.
    columns = columns.Remove(columns.Length - 2, 2);
    values = values.Remove(values.Length - 2, 2);

    sql += " " + columns + ") Values " + values + ")";
    break;

For an added DataRow, of course, we create an Insert statement. The values of the columns in the SQL statement are specified using command parameter placeholders, while we add a command parameter to the command object to supply the value for each placeholder. E.g., our Insert statement will look something like this:

Insert into Customers (FirstName, LastName) Values (@FirstName, @LastName)

In the above, @FirstName and @LastName are the placeholders for the column values, and they correspond to two command parameters with the same names that we will add to the command object. For the placeholders, I use the name of the column, prefixed by a backend specific character. For MS SQL Server and Access, this character must be an @, while for Oracle, it must be a colon. Since these characters are database specific, I use a method of the dbTools object, GetSqlParamName, to prefix them to the column names. And, since the command parameter objects are also database specific, these are instantiated using the CreateDataParameter method of the dbTools object.
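The same placeholder pattern can be tried end-to-end against SQLite from Python, which happens to use the colon prefix like Oracle (a standalone sketch, not Smum code; the Customers table is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table Customers (FirstName Text, LastName Text)")

# Named placeholders; the driver handles quoting and type formatting,
# which is exactly why the article prefers parameters over string concatenation.
sql = "Insert Into Customers (FirstName, LastName) Values (:FirstName, :LastName)"
conn.execute(sql, {"FirstName": "John", "LastName": "Doe"})
conn.commit()
```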

Using command parameters in the SQL statement, instead of simply including the value in the SQL string, has the advantage of not requiring us to worry about how to format the column value for a particular backend database. Each database has a different syntax for things like date values, and by using a command parameter for these values, we let ADO.NET worry about these small differences. Additionally, it's the only way to write out a binary large object to the database.

The column names used in the SQL statement are specified using the GetSqlColumnName method of the dbTools object. This database specific method allows me to wrap all Access column names in square brackets. The same is done for the table name using GetSqlTableName. This allows certain reserved words in Access to be used for table and column names. For example, without the square brackets around the table name, we wouldn't be able to write data to a table called User.

Note that I am not writing out the values of autoincrement columns whose values are less than 0. These are the columns whose temporary values were set by the underlying DataSet, and we let the database assign their real values when the row is written. However, I keep track of when an autoincrement column is also a primary key column. In this case, after inserting the row, I read the value of the key column from the database and update the DataRow with this new value. This has no use within this class, but it is very important to the calling application if it will later need this new primary key value as the foreign key of another DataRow to be written.

Here's the code for a deleted DataRow:

case DataRowState.Deleted:
    sql = "Delete From " + tableName + " Where";
    foreach (DataColumn keyColumn in keyColumns)
    {
        string column = keyColumn.ColumnName;
        sql += " " + dbTools.GetSqlColumnName(column) + " = " +
               dbTools.GetSqlParamName(column) + " And";
        command.Parameters.Add(dbTools.CreateDataParameter(column,
                               dataRow[column, DataRowVersion.Original]));
    }
    //Remove the trailing " And".
    sql = sql.Remove(sql.Length - 4, 4);
    break;

If we are writing out a deleted DataRow, we will actually be executing a Delete statement to delete the row from the database. In this case, I am identifying the row to delete using the primary key columns of the DataRow. Again, the values of the columns in the SQL statement are represented by command parameter placeholders, and we add the corresponding command parameters to the command object.

Here's the code for a modified DataRow:

case DataRowState.Modified:
    //Create an update statement.
    for (int i = 0; i < dataColumns.Count; i++)
    {
        //Only update the column if the data has changed.
        if (!dataRow[i].Equals(dataRow[i, DataRowVersion.Original]))
        {
            string column = dataColumns[i].ColumnName;
            sql += dbTools.GetSqlColumnName(column) + " = " +
                   dbTools.GetSqlParamName(column) + ", ";

            command.Parameters.Add(dbTools.CreateDataParameter(column, dataRow[i]));
        }
    }

    if (sql != "")
    {
        //Add where clause for key columns.
        sql = sql.Remove(sql.Length - 2, 2); //Remove trailing comma.
        sql = "Update " + tableName + " Set " + sql + " Where ";
        foreach (DataColumn keyColumn in keyColumns)
        {
            string column = keyColumn.ColumnName;
            sql += dbTools.GetSqlColumnName(column) + " = " +
                   dbTools.GetSqlParamName(column) + " And ";
            command.Parameters.Add(dbTools.CreateDataParameter(column,
                                   dataRow[column, DataRowVersion.Original]));
        }
        //Remove the trailing " And ".
        sql = sql.Remove(sql.Length - 5, 5);
    }
    break;

If the DataRow has been modified, I am creating an Update SQL statement. In this case, I am looping through each of the columns in the DataRow and only including a column in the Update statement if the value of the column has changed. I know the column has changed if the value of the column does not equal the original value of the column. The original value of the column is held in the DataRow, and can be accessed by passing the DataRowVersion.Original parameter to the indexer of the DataRow. Again, I am specifying the values of the columns in the SQL statement using placeholders and adding the appropriate command parameters to the command object.
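The changed-columns-only idea translates directly into any language: diff the current values against the originals and emit Set clauses only for the differences. A small Python sketch under the same assumptions (a hypothetical helper with colon-style placeholders, not the C# code):

```python
def build_update(table, current, original, key_columns):
    """Return (sql, params) updating only the columns whose values changed.

    current/original are dicts of column -> value; key_columns names the
    primary key. Rows are identified by their ORIGINAL key values, as in
    the article's Modified case.
    """
    changed = {c: v for c, v in current.items() if original.get(c) != v}
    if not changed:
        return None, {}  # nothing to write, mirroring the sql == "" check
    sets = ", ".join(f"{c} = :{c}" for c in changed)
    where = " And ".join(f"{k} = :key_{k}" for k in key_columns)
    params = dict(changed)
    params.update({f"key_{k}": original[k] for k in key_columns})
    return f"Update {table} Set {sets} Where {where}", params
```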

Once the proper SQL statement has been created, all that is left is to execute the SQL statement. This is done by setting the CommandText of the command object to the SQL to be executed and calling the ExecuteSQL routine. The ExecuteSQL routine simply ensures the database connection is open and then executes the command by calling the ExecuteNonQuery method.

There is code before and after the call to ExecuteSQL in order to manage the case where we need to retrieve the value of an autoincremented primary key that has just been created. To do this, we need to start a transaction before the command is executed. After the command is executed, we can get the value of the new primary key by reading the scalar value returned by Select @@Identity. After retrieving this value, we then commit our transaction and update the value of the primary key in the DataRow. If we hadn't wrapped our write and read in a transaction, it would have been possible for another process to have added a new row to our table before we got a chance to read the new primary key. Using a transaction prevents any other process from writing a new row to our table until we commit. Since this is done almost immediately, there shouldn't be any perceived delay in the database's responsiveness. Note that we also need to ensure that the primary key column is not read-only before we try to update it.
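The read-back pattern can be tried against SQLite from Python, where last_insert_rowid() plays the role @@Identity plays here (a standalone sketch, not Smum code; sqlite3's value is per-connection, so the race the article guards against can't arise, but the shape is the same):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table Employees (ID Integer Primary Key, Name Text)")

with conn:  # transaction: committed on success, rolled back on error
    conn.execute("Insert Into Employees (Name) Values (?)", ("John Doe",))
    # Read the key the database just assigned, before updating the row object.
    new_key = conn.execute("Select last_insert_rowid()").fetchone()[0]
```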

Note that autoincrement columns in Oracle are achieved by using Oracle sequences and triggers and not by specifying a certain data type for the column. So, the code above to get the value of a new autoincremented primary key will not work for Oracle.

Lastly, we AcceptChanges on the DataRow so that the DataRowState of the DataRow will be changed to Unchanged and the original values of all the columns will equal their current values. Note that previously we had added the DataRow to its parent table if it was Detached. This was needed because the AcceptChanges method throws an exception if the DataRow is detached.

Transaction Handling

Transaction processing in ADO.NET is achieved by passing a Transaction object to those database reads and writes which should participate in the transaction. This Transaction object is obtained by calling the BeginTransaction method of the connection object on which to begin the transaction. Committing or rolling back the transaction is then a matter of calling the Commit or Rollback method of the Transaction object.

To provide a user-friendly interface to this process, I've created an IOTransaction class which wraps a single IDbTransaction object. The Smum.IO class holds a reference to a single IOTransaction object which is instantiated in the Smum.IO constructor and is exposed through Smum.IO's Transaction property.

The IOTransaction class includes methods for beginning, committing, and rolling back a transaction on the associated connection. Thus, if you want to start a transaction on a given Smum.IO connection, it's as simple as this:

smumIO.Transaction.Begin();


Likewise, committing or rolling back the current transaction on a Smum.IO connection is accomplished like so:

smumIO.Transaction.Commit();
//or
smumIO.Transaction.Rollback();




There's not much else to note in regard to transaction processing except when dealing with nested transactions. The SqlConnection object for MS SQL Server doesn't support nested transactions. Neither does the OleDbConnection object for Access or the OracleConnection object for Oracle. That doesn't mean these databases don't support nested transactions; it just means that the connection objects haven't implemented this functionality. Attempting to start a transaction on any one of these connections while another transaction is pending will result in an error message about parallel transactions not being supported.

Generally, I don't need nested transactions, but it is nice when creating a routine that requires a transaction, to simply start one without worrying about whether the calling routine has already started one. In this scenario, I typically don't want my transaction to be committed unless the calling routine's transaction is committed. To achieve this sort of functionality, I've introduced a nestCount variable in the IOTransaction class. This variable starts out at 0, and increases by 1 whenever the Begin method is called, and decreases by 1 whenever the Commit or Rollback methods are called. A transaction on the current connection is not begun unless the nestCount is 0, and an active transaction is not rolled back or committed unless the nestCount is 0. In this way, I can call the Begin method on IOTransaction multiple times, and only the first call will actually begin a transaction. Subsequent calls to Commit or Rollback will have no effect except for the one which matches up with the initial Begin. E.g.:

//nestCount = 0
smumIO.Transaction.Begin();     //Transaction begun. nestCount = 1
smumIO.Transaction.Begin();     //Nothing happens. nestCount = 2
smumIO.Transaction.Commit();    //Nothing happens. nestCount = 1
smumIO.Transaction.Commit();    //Transaction committed. nestCount = 0

Of course, these aren't real nested transactions, but it serves my purposes. It's important to note that you must bracket every transaction Begin call with either a Commit or a Rollback, and the entire transaction will be either committed or rolled back based on whether the first Begin is bracketed with a Commit or Rollback. Failure to do so will leave your data uncommitted. Also note that this scheme allows you to call an inner Rollback and then an ending Commit, having the effect of committing changes that you had thought were rolled back. Obviously, this is a danger and only shows that these are not true nested transactions.
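The nestCount scheme is simple enough to sketch independently (illustrative Python with hypothetical names, exercised against SQLite in autocommit mode so Begin/Commit are explicit):

```python
import sqlite3

class NestedTransaction:
    """Counted begin/commit: only the outermost pair touches the real transaction."""
    def __init__(self, conn):
        self.conn = conn
        self.nest_count = 0

    def begin(self):
        if self.nest_count == 0:
            self.conn.execute("Begin")
        self.nest_count += 1

    def commit(self):
        self.nest_count -= 1
        if self.nest_count == 0:
            self.conn.execute("Commit")

    def rollback(self):
        self.nest_count -= 1
        if self.nest_count == 0:
            self.conn.execute("Rollback")

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("Create Table T (x Integer)")
txn = NestedTransaction(conn)
txn.begin()                               # transaction begun, nest_count = 1
txn.begin()                               # nothing happens, nest_count = 2
conn.execute("Insert Into T Values (1)")
txn.commit()                              # nothing happens, nest_count = 1
txn.commit()                              # transaction committed, nest_count = 0
```

Note the same caveat applies as in the article: an inner rollback followed by an outer commit still commits everything, so these are not true nested transactions.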

Other methods

The following methods of the Smum.IO class are also available and were not discussed above:

InitializeNewRow


This method will initialize a DataRow as a new row of the indicated table so that it may be used with the Write method in order to create a new row. E.g.:

DataRow row = null;
smumIO.InitializeNewRow(ref row, "Employees");
row["FirstName"] = "John";
row["LastName"] = "Doe";
smumIO.Write(row);        //Creates employee John Doe.


This method will fill a DataSet with the schema for a given SQL statement. This is used by InitializeNewRow.

Read (for scalar)

This is a simple Read method, which will read a single piece of data, typically a single column from a single row. I haven't discussed this above since it doesn't work like the other Read methods. It uses the ExecuteScalar method to execute the indicated SQL statement to read a single scalar value from the database.

Write (for scalar)

This is the corresponding Write method for the Read scalar method. This will write a single value to a single column in the database when the table name, column name, and Where clause are indicated. Since the Where clause may include more than one row, a single column for multiple rows may be updated at once.
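A scalar write of this shape — one column, possibly many rows, selected by a Where clause — looks like this as a standalone sqlite3 sketch (hypothetical table and data, not the Smum API):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table Employees (ID Integer Primary Key, Dept Text)")
conn.executemany("Insert Into Employees (Dept) Values (?)",
                 [("Sales",), ("Sales",), ("IT",)])

# Write a single value to a single column; the Where clause may match many rows.
cur = conn.execute("Update Employees Set Dept = ? Where Dept = ?",
                   ("Marketing", "Sales"))
conn.commit()
```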


Smum.Exception

This custom exception class inherits from the System.Exception class so that error messages may be formatted more clearly. The ToString method of this class will display the Message property of the current exception object as well as that of any inner exceptions. Any inner exception that is a System.Exception object will be prefixed with "[System Error]" in order to differentiate your application errors from .NET errors. The error messages will be listed with the outermost errors appearing first. Check out the accompanying sample project for an example of how this looks.


Conclusion

I wrote this article in order to demonstrate one way of approaching database IO in .NET. It doesn't do much, but it covers the basics. I hope it helps. In the future, I plan to post an article on how I took these basic functions and created a framework to support the creation of business logic classes.


History

  • Submitted: 4/3/2006.


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Lou Flees
Web Developer
United States
Lou has been a software developer for more than 16 years, mostly working on database applications. He is a big fan of the .NET environment.

Article Copyright 2006 by Lou Flees