The BeginLoadData problem and why OOP doesn't work






A foray into a specific problem and, along the way, the reasons why OOP often fails us.
Introduction
This article takes a specific problem and looks at a variety of possible solutions, while discovering at the same time why it is that OOP doesn't always (or rather, hardly ever) work.
The Problem
Contrary to the documentation for the DataTable's BeginLoadData method:
Turns off notifications, index maintenance, and constraints while loading data.
it actually does not disable notifications (events). I discovered this as I was using my DataTable Transaction Logger and the corresponding Synchronization Manager. The synchronization manager dutifully called BeginLoadData, yet the events were still firing in the transaction logger for the associated DataTable! The effects can be demonstrated with this simple program:
using System;
using System.Data;

namespace BeginLoadDataTest
{
    class Program
    {
        static void Main(string[] args)
        {
            DataTable dt = new DataTable();
            DataColumn c1 = new DataColumn("FirstName", typeof(string));
            DataColumn c2 = new DataColumn("LastName", typeof(string));
            DataColumn c3 = new DataColumn("PK", typeof(Guid));
            dt.Columns.AddRange(new DataColumn[] { c1, c2, c3 });
            dt.PrimaryKey = new DataColumn[] { c3 };
            dt.TableNewRow += new DataTableNewRowEventHandler(OnTableNewRow);
            dt.ColumnChanging += new DataColumnChangeEventHandler(OnColumnChanging);

            dt.BeginLoadData();
            DataRow row = dt.NewRow();
            row["FirstName"] = "Marc";
            row["LastName"] = "Clifton";
            // BeginLoadData turns off constraint checking, so we can add the row
            // without setting the PK.
            dt.Rows.Add(row);
            row["PK"] = Guid.NewGuid();
            dt.EndLoadData();
        }

        // BeginLoadData does not turn off event notification!
        static void OnColumnChanging(object sender, DataColumnChangeEventArgs e)
        {
            Console.WriteLine("Column Changing");
        }

        static void OnTableNewRow(object sender, DataTableNewRowEventArgs e)
        {
            Console.WriteLine("New Row");
        }
    }
}
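Running this program shows both handlers still writing to the console between the BeginLoadData and EndLoadData calls, which clearly demonstrates the problem.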
A Solution
I considered a variety of solutions to this problem.
Unhooking the Events
This would be the ideal solution, except that there is no way for the application to know how many loggers are associated with a specific DataTable. Tracking this is a possibility that I look at later. Nor, to my knowledge, does .NET provide you with the means of getting the list of event handlers associated with a multicast event, unhooking them, and then rehooking them later. So unfortunately, this rather simple solution does not work for my situation.
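For the simple case where the code doing the load also owns the handler, unhooking might look like the sketch below. The QuietLoader class and its LoadQuietly method are hypothetical names used only for illustration, and the sketch covers a single known ColumnChanging handler. It does nothing about handlers attached by code it cannot see, which is exactly the limitation described above.

```csharp
using System;
using System.Data;

// Hypothetical helper, not part of the transaction logger.
public static class QuietLoader
{
    // Detaches one known ColumnChanging handler for the duration of the load,
    // then reattaches it. Handlers attached elsewhere are not affected.
    // Assumes knownHandler is currently attached; if it is not, the final
    // "+=" will attach it.
    public static void LoadQuietly(DataTable table,
                                   DataColumnChangeEventHandler knownHandler,
                                   Action<DataTable> load)
    {
        table.ColumnChanging -= knownHandler;
        table.BeginLoadData();
        try
        {
            load(table);
        }
        finally
        {
            table.EndLoadData();
            table.ColumnChanging += knownHandler;
        }
    }
}
```

A caller passes the same delegate instance it originally attached, plus a callback that fills the table.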
Subclassing DataTable
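Subclassing is a possible solution, but requires that you declare new methods for BeginLoadData and EndLoadData:
    // Flag checked by the On... overrides. Note that "new" hides the base
    // methods rather than overriding them.
    private bool loadingData;

    public new void BeginLoadData()
    {
        // set a flag
        loadingData = true;
        base.BeginLoadData();
    }

    public new void EndLoadData()
    {
        base.EndLoadData();
        // clear the flag
        loadingData = false;
    }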
and then override the Onxxx methods, for example:
    protected override void OnColumnChanging(DataColumnChangeEventArgs e)
    {
        if (!loadingData)
        {
            // allow events to fire:
            base.OnColumnChanging(e);
        }
    }
but this did not appeal to me because it would require that the application always use the subclass. The idea with the transaction logger is that it's supposed to work with your DataTable, not mine.
Proxy by Wrapping
Another idea was, well, why don't I encapsulate the DataTable in a proxy class that transparently implements all the same functions as the DataTable, except that it does some special behavior for the BeginLoadData and EndLoadData methods? Cloning all the properties, methods, and events would be not just time consuming but also ugly, and it would still have the problem of requiring the programmer to use the proxy class, not the actual DataTable class.
Dynamic Proxy
There are several good articles on dynamic proxies, and there's also the Spring Framework, but these all require that the class you are proxying implement an interface covering all the methods that can be proxied. They also require that the programmer use the interface rather than the class when invoking methods. Neither of these was an option because the DataTable class doesn't implement an interface for the methods in question. And yet again, it would require changes to the construction and usage of the class.
Tracking Tables
I considered the idea that I could have a master list of all the loggers associated with DataTable instances. The logger could add itself to and remove itself from this collection, making the process autonomous. Then the event handlers, also in the logger, could check whether or not the table was in a load data state, using a flag that the synchronization manager sets for the tables in the collection.
This is where automatic garbage collection becomes annoying. If the DataTable instance is no longer in use by the application, it won't be collected because there's still a reference to it in this hypothetical collection. Only when the object managing the collection is GC'able will the table instance be collected (we hope). So, this would require the programmer to explicitly call a kind of Dispose method provided by whatever object is maintaining the collection, to remove the table from the collection so it could be, um, collected. Now, if this were C++, the programmer would have to be explicitly deleting the table to begin with, and I could possibly hook the destructor and do my own additional management cleanup.
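To make the lifetime problem concrete, here is a minimal sketch of that hypothetical collection. LoadStateRegistry and its members are names invented for this illustration and are not part of the logger. The sketch is not thread-safe.

```csharp
using System.Collections.Generic;
using System.Data;

// Hypothetical master list of tables and their load-data flag.
public static class LoadStateRegistry
{
    // The dictionary key is a strong reference: a registered table stays
    // reachable, and therefore uncollected, until Unregister is called.
    private static readonly Dictionary<DataTable, bool> loading =
        new Dictionary<DataTable, bool>();

    public static void Register(DataTable table)
    {
        loading[table] = false;
    }

    // The "kind of Dispose" the application would have to remember to call.
    public static void Unregister(DataTable table)
    {
        loading.Remove(table);
    }

    // Called by the synchronization manager around BeginLoadData/EndLoadData.
    // Tables that were never registered are ignored.
    public static void SetLoading(DataTable table, bool value)
    {
        if (loading.ContainsKey(table))
        {
            loading[table] = value;
        }
    }

    // Called by the logger's event handlers before doing any work.
    public static bool IsLoading(DataTable table)
    {
        bool value;
        return loading.TryGetValue(table, out value) && value;
    }
}
```

Everything works as long as every Register is paired with an Unregister. Forget one, and the static dictionary keeps that table alive for the life of the process.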
Reflection
There is a non-public property that can be accessed via reflection: SuspendEnforceConstraints. This has the drawback of using reflection (a performance hit) and making the code dependent on the internal implementation of the DataTable, which might change in future versions of .NET. This was a really close call, but then I stumbled across what I felt was a better solution.
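As a rough sketch of the mechanics only, reading a non-public property by name might look like this. LoadStateProbe is a hypothetical name, and the code makes no claim about when the framework sets the value. It assumes the property is an instance property of type bool and returns null when that assumption does not hold on the running framework.

```csharp
using System.Data;
using System.Reflection;

// Hypothetical helper showing the reflection lookup.
public static class LoadStateProbe
{
    // Looks up the non-public property named in the text. Returns null if the
    // running framework version does not expose it in the expected shape,
    // which is exactly the fragility described above.
    public static bool? TryGetSuspendEnforceConstraints(DataTable table)
    {
        PropertyInfo prop = typeof(DataTable).GetProperty(
            "SuspendEnforceConstraints",
            BindingFlags.Instance | BindingFlags.NonPublic);

        if (prop == null || prop.PropertyType != typeof(bool) || !prop.CanRead)
        {
            return null;
        }

        return (bool)prop.GetValue(table, null);
    }
}
```

The lookup by string name is the point of failure: a rename inside the framework compiles fine and only shows up at run time as a null result.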
ExtendedProperties to the Rescue
OK, whoever decided this needed to be in the DataTable class was a genius. The implementation that I chose to correct this problem is to add a property, "BlockEvents", that the event handlers inspect to determine whether or not to proceed with the handler. In my particular case, I can completely hide this behavior in the transaction logger and synchronization manager, without resorting to other more intrusive techniques. With a few lines of code, I can implement the solution, without touching the actual application code.
Here's the revised demo:
using System;
using System.Data;

namespace BeginLoadDataTest
{
    class Program
    {
        static void Main(string[] args)
        {
            DataTable dt = new DataTable();
            dt.ExtendedProperties.Add("BlockEvents", false);
            DataColumn c1 = new DataColumn("FirstName", typeof(string));
            DataColumn c2 = new DataColumn("LastName", typeof(string));
            DataColumn c3 = new DataColumn("PK", typeof(Guid));
            dt.Columns.AddRange(new DataColumn[] { c1, c2, c3 });
            dt.PrimaryKey = new DataColumn[] { c3 };
            dt.TableNewRow += new DataTableNewRowEventHandler(OnTableNewRow);
            dt.ColumnChanging += new DataColumnChangeEventHandler(OnColumnChanging);

            dt.BeginLoadData();
            dt.ExtendedProperties["BlockEvents"] = true;
            DataRow row = dt.NewRow();
            row["FirstName"] = "Marc";
            row["LastName"] = "Clifton";
            // BeginLoadData turns off constraint checking, so we can add the row
            // without setting the PK.
            dt.Rows.Add(row);
            row["PK"] = Guid.NewGuid();
            dt.ExtendedProperties["BlockEvents"] = false;
            dt.EndLoadData();
        }

        // BeginLoadData does not turn off event notification, so each handler
        // checks the BlockEvents flag itself.
        static void OnColumnChanging(object sender, DataColumnChangeEventArgs e)
        {
            if (!(bool)((DataTable)sender).ExtendedProperties["BlockEvents"])
            {
                Console.WriteLine("Column Changing");
            }
            else
            {
                Console.WriteLine("Column Changing is being ignored");
            }
        }

        static void OnTableNewRow(object sender, DataTableNewRowEventArgs e)
        {
            if (!(bool)((DataTable)sender).ExtendedProperties["BlockEvents"])
            {
                Console.WriteLine("New Row");
            }
            else
            {
                Console.WriteLine("New Row is being ignored");
            }
        }
    }
}
Now granted, this solution isn't necessarily ideal for your situation. In my case, I had a very compartmentalized set of classes for managing data transactions: there is only one place where BeginLoadData is called, and that's in the synchronization manager. If you have code that calls BeginLoadData in many places, one of the other options that I considered might be better, especially unhooking the events.
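When there is a single call site, as in my case, the flag handling can be folded into one small helper. The sketch below is one possible shape for that, not the actual synchronization manager code. BlockEventsScope and IsBlocked are hypothetical names. Only the "BlockEvents" key comes from the demo above.

```csharp
using System;
using System.Data;

// Hypothetical wrapper a synchronization manager might use around a load.
public sealed class BlockEventsScope : IDisposable
{
    private const string Key = "BlockEvents";
    private readonly DataTable table;

    public BlockEventsScope(DataTable table)
    {
        this.table = table;
        table.BeginLoadData();
        // The indexer adds the key if it is missing and overwrites it otherwise.
        table.ExtendedProperties[Key] = true;
    }

    public void Dispose()
    {
        // Same order as the demo: clear the flag, then end the load.
        table.ExtendedProperties[Key] = false;
        table.EndLoadData();
    }

    // What a logger's handler would call first.
    // A missing key is treated as "not blocked".
    public static bool IsBlocked(DataTable table)
    {
        object flag = table.ExtendedProperties[Key];
        return flag is bool && (bool)flag;
    }
}
```

The load is then wrapped in a using statement, so the flag is cleared and EndLoadData is called even if the load throws. Treating a missing key as "not blocked" means the handlers also work with tables that were never given the property.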
Conclusion
Besides the specific discussion of the problem with BeginLoadData, this points out, in general, the flaws in object oriented programming. Of course, it would have been really nice if the behavior of BeginLoadData matched the documentation.
Virtual Methods
It would have been useful if these two methods had at least been declared as virtual, so that I could easily override the default (broken) behavior. But since they're not, I can't really extend the behavior the way I want. So much for the reusability holy grail of OOP, when class designers choose performance (is it really that much better?) over extensibility.
Interfaces and Proxies
An argument could be made that in a language that does not specifically support Aspect Oriented Programming, the programmer should at least consider writing an interface that allows other developers to proxy the class. This is especially true for framework classes, and it incurs no penalty in performance if the proxy isn't used.
Extended Properties
This is a really useful feature that I wish more classes provided. Sure, there's the Tag property in a lot of classes (especially controls), but a property collection is much more powerful and keeps one from accidentally using the Tag property for conflicting information. Again, when developing a class that's going to go into a framework, this is a good concept to consider implementing to make life easier for the next guy.