![]() |
Platforms, Frameworks & Libraries »
LINQ »
General
Intermediate
License: The Code Project Open License (CPOL)
LINQ Challenges and SQL Server Compact EditionBy Matt SollarsOvercoming Challenges with LINQ to SQL and Using LINQ with SQL Server Compact Edition |
C# (C# 3.0), VB (VB 9.0), SQL, Windows (Win2K, WinXP, Win2003, Vista), .NET (.NET 3.0, .NET 3.5), SQL Server (SQL 2005, SQL CE), Visual Studio (VS2008), LINQ, Architect, DBA, Dev
|
||||||||||
|
Advanced Search Add to IE Search |
|
|
|
||||||||||||||||

Developers meet challenges all the time. It is a part of what we do. One of the most common challenges a developer faces is managing data. There are times when the data can be stored via a document format. However, a lot of us deal with data that has individuality; character; relationships…. You get the personified pun picture, I'm sure. Of course, I'm referring to relational data.
Relational data is so common in applications that the variety of choices we have to store it, organize it, query it, manipulate it, and so on, is astounding. Microsoft's own SQL Server 2005 product comes in six editions! Each provider has a slight variation on the SQL ANSI standard too. The point is, with so many options to choose from, it's no wonder why many developers put the majority of their effort into the "data access layer". By this, I am referring to the libraries, services, and/or code fragments written to communicate with the storage engine used for an application's data; regardless of how developmentally incorrect the scheme may be. That data is pointless just sitting in storage. We have to exchange it in and out with our applications for it to have any real use.
Enter LINQ to SQL (originally DLINQ). Now we can query all sorts of data with a language we are already familiar with (currently C# or VB). I was very excited when I heard about the release of LINQ (Language-Integrated Query); especially with regards to SQL. The potential to reduce code and create a system of consistency is invaluable to all developers. The beauty of LINQ is amplified by the latest enhancements in Visual Studio 2008.
LINQ also features an object-relational mapping (ORM) tool to make life even easier. We can point the tool at a database and enjoy cleanly generated code bliss. The code that it generates for us is extremely easy to extend; allowing us to write business logic, validation, helper methods, complex transaction scenarios, and more. I have been using object-relational mapping (ORM) for some time now and have struggled with several of the free implementations available. There are certainly some very well written ORM tools out there, but having one at your fingertips, inside the Framework, is next to greatness.
Couple that with all the other new features added to Visual Studio 2008, and LINQ is a welcome addition to the family. There are some new challenges with LINQ, but that should be expected. We're developers after all.
To begin discussing the challenges we can expect with LINQ to SQL, we need to establish a sample that I can use to demonstrate through. As developers, we have a need to quantify our output thoroughly. Some of us are independent contractors; some own consulting/development companies; others work for those companies. Regardless, there is a need to report the time we spend working on projects in order to accurately bill "the client". Therefore, what better sample for us to experiment with LINQ to SQL then with an application that tracks our time spent working.
This application will be different than the many out there already that allow you to input your time for projects. Our sample application is meant to actually track hours in real-time. I'd like it to be usable by non-developers as well. Any professional that is onsite or in the field with a laptop could benefit from clicking a couple of menus in order to get detailed time tracking. I even want the option to tell the program to detect when I am away from the computer, so it can pause and deduct that "away time" from the duration. That all said, I think it best we create a desktop application.
I also feel that this is a perfect opportunity to use SQL Server Compact Edition as our data storage engine. We could certainly use XML or even a flat text file, but I want the ability to quickly query sets of data for aggregate information and more. Why not use SQL Server Express Edition instead? Well, it would be really nice if the application could install without any system dependencies other than the obvious; the .NET Framework 3.5. Express Edition cannot be embedded into our application and requires more privileges to install than our user may have. This is especially true in a work environment.
For more information on choosing between SQL Server Compact and Express Editions, see the article and whitepaper published by Microsoft.
Also, please note that the sample application is, by no means, complete. It is only meant as an example scenario that leads us to a decision to use LINQ to SQL with SQL Server Compact Edition 3.5. The code available for download is a bare-bones application that showcases the topics discussed in the remainder of this article.
One of the challenges encountered by developers looking to get acquainted with LINQ to SQL, is how to use it with SQL Server Compact Edition (SSCE), version 3.5. We were told LINQ will work with SQL Server for now; with more providers on the way (possibly not until LINQ to Entities is released). However, if you try to drag some tables from a Compact Edition data connection onto a new LINQ to SQL designer canvas, you'll witness a nasty error dialog stating that the provider is not supported!
More accurately, the SSCE provider is not supported by the LINQ to SQL designer. You can still use the command-line tool, SQLMetal, to generate your data entities, data access, and other ORM code. If you prefer command-line tools, you're all set. If you like the idea of a visual representation of your entities, there is still hope.
I recommend creating a batch file or PowerShell script to generate your file(s) via SQLMetal. This allows you the benefit of a quick execution when you inevitably need to re-generate your data access layer due to schema changes. I've included one for download as an example.
For our sample application, the command needed is fairly simple.
SqlMetal.exe TimeApp.sdf /dbml:TimeApp.dbml /namespace:TimeApp.DataAccess /pluralize
Note: The SQLMetal tool is located, by default, on your primary drive at: Program Files\Microsoft SDKs\Windows\V6.0A\Bin\SqlMetal.exe.
Notice that I've specified a few options for the namespace of the generated code, to pluralize the entity class names, and to generate a DBML file. The generated DBML file is extremely important for those of you that want the visual designer support. With it, you can make minor changes that occur in your schema via the designer, or you can choose to edit the DBML file itself. It's just XML, so feel free to dive in. Once you add the file to your project, Visual Studio 2008 will automatically generate the corresponding code for your data access layer.
You can see the very simple layout of tables in the LINQ to SQL designer in Figure 2. The sample time tracking application will allow the user to select a project, and optionally a task, in order to begin tracking time. One note: the project and task tables are practically identical and could have been combined into a single table with a parent field to allow for hierarchy; however, I chose to separate them for this example, so you can see some of the challenges you may encounter with similar designs.
There are plenty of times when a simple numeric column with a constraint, or business rule, will do instead of a foreign key to a lookup table. Lookup tables are great if the values will change often or if the values need to be changed by the user. In other cases we, the developers, determine the exact domain of such a field. We still want to offer the user the ability to select one of the possible values, but we control those possibilities entirely. This is a perfect opportunity to use an enumeration in the application to represent those field values.
So, what's the problem? When the ORM generates the entities that represent your database's tables, it represents these numeric fields with numeric properties. When writing the user interface, we could translate the user's selections to the appropriate numbers. If we want to use an enumeration instead, we still have to translate the enumeration values to the corresponding numeric values. Your first instinct may be to use the power provided by the ORM to create a new code file with a partial class that includes a new property that handles these translations.
C#:public enum EstimationComparison : byte {
Overall = 1,
Repeating = 2
}
public partial class Project {
public EstimationComparison EstimationComparison {
get { return (EstimationComparison)this.EstimationComparisonType; }
set {
if (Enum.IsDefined(typeof(EstimationComparison), value))
this.EstimationComparisonType = (byte)value;
}
}
}
VB:
Public Enum EstimationComparison As Byte
Overall = 1
Repeating = 2
End Enum
Partial Public Class Project
Public Property EstimationComparison() As EstimationComparison
Get
Return CType(Me.EstimationComparisonType, EstimationComparison)
End Get
Set(ByVal value As EstimationComparison)
If [Enum].IsDefined(GetType(EstimationComparison), value) Then
Me.EstimationComparisonType = CType(value, Byte)
End If
End Set
End Property
End Class
Project entity to include a new property. The new
property translates the existing numeric property (EstimationComparisonType) value to
and from an enumeration (EstimationComparison).
This is good and makes use of the tools we're given by the ORM for LINQ. However, we can do better
than that. We can actually tell the ORM to use a different type for the property and it will handle
the conversions for us. As long as the type we inform it to use is compatible with the data type used
to store the values in the database, there will be no problems. This can be accomplished in a couple
of ways. One is to edit the DBML file manually and use SQLMetal to regenerate the code from the edited
file. The other way is to open the DBML in the designer and change the type of the entity's property
via the property page. Since we also want this field to be Nullable, we have to make
sure to set the property type accordingly.
Type of the property for an entity to anything compatible with the
Server Data Type. For this example, I've changed the Type to a Nullable
enumeration. If you are using C#, you can set this to EstimationComparisonType? as a
shortcut.
Once this change is made, we can make a direct assignment to the EstimationComparisonType
property of the Project entity with one of the enumeration values. The Framework takes
care of conversion.
Project Project = DataContext.Projects.Single(P => P.ProjectId == 1);
Project.EstimationComparisonType = EstimationComparisonType.Overall;
VB:
Dim Project As Proejct = DataContext.Projects.Single(Function(P) P.ProjectId = 1)
Project.EstimationComparisonType = EstimationComparisonType.Overall
One thing we should have learned early in our database development careers is to restrict the set of data we retrieve to just what we need and nothing more. That means we should limit our rows via a well-thought "where" clause and our columns via an actual column list in the "select" clause of our SQL statements.
-- No restrictions at all.
select *
from Project
order by [Name]
-- Limit the columns to just those needed.
-- Limit the rows as needed.
select ProjectId, [Name]
from Project
where IsActive = 1
order by [Name]
There are two different queries of the Project table in Listing 3. If our intent for this
data is to fill a simple list control, we do not need all the fields in the table. All we really care
about for our list is the primary key and the textual representation of each record. The second query
is certainly more ideal.
In beta versions of LINQ to SQL, we were able to force which fields were loaded into the entity object we were retrieving. It turns out, this was a bug. To avoid confusion, this so-called "feature" was never meant to be. The thinking is, if one developer calls on data from a library, layer, or other code written by another developer, there would be mass confusion if that retrieved entity had missing data in the properties.
For example, if I call a method from a library that another developer wrote that retrieves our project
list and only gets the ProjectId and Name, I could become very lost when I try
to access the EstimatedDuration property. Things would be bad enough if a field is not nullable
and has uninitialized data, but imagine the EstimatedDuration property which is
nullable. I may presume that there is no such value in the database, when in fact, there may be. I have no
way of knowing based on the Product entity I received.
Therefore, when LINQ to SQL was released to manufacturing (out of beta to release), this kind of thing is no longer allowed. I still need the ability to retrieve just some fields from a table though, and I can. I can use anonymous types locally, but if I want to pass the results around, I need to create a new class to hold just those values I desire and then use a new instance of the class when retrieving the data.
C#:public class ProjectSummary {
public int ProjectId { get; set; }
public string Name { get; set; }
public List<TaskSummary> Tasks { get; set; }
}
VB:
Public Class ProjectSummary
Private _projectId As Integer
Private _name As String
Private _tasks As List(Of TaskSummary)
Public Property ProjectId As Integer
Get
Return _projectId
End Get
Set(ByVal value As Integer)
_projectId = value
End Set
End Property
Public Property Name As String
Get
Return _name
End Get
Set(ByVal value As String)
_name = value
End Set
End Property
Public Property Tasks As List(Of TaskSummary)
Get
Return _tasks
End Get
Set(ByVal value As List(Of TaskSummary))
_tasks = value
End Set
End Property
End Class
ProjectSummary class is created to hold a subset of the field values
from the table.var ProjectList = from P in DataContext.Projects
orderby P.Name
select new ProjectSummary {
ProjectId = P.ProjectId,
Name = P.Name
};
VB:
Dim ProjectList = From P In DataContext.Projects _
Order By P.Name _
Select ProjectSummary = New With { _
.ProjectId = P.ProjectId, _
.Name = P.Name _
}
ProjectSummary class to hold the fields we want.
When we create a new class to hold just the fields we want, it is fairly easy to adjust our LINQ syntax
to get the data into a new instance of our class. In our example, Listing 4 declares a new ProjectSummary
class with just ProjectId and Name properties and Listing 5 shows the query
to fill a new instance of ProjectSummary with the appropriate data.
Change tracking with LINQ to SQL can be quite challenging in certain cases; especially, if you are just starting out with the new technology. Let's take a look at such a case with the sample time tracking application.
The "Projects" window retrieves a summary list of current projects from the database via the
generated data context. The fetched project list is bound directly to the list box seen in Figure 4.
When the user presses the "Edit" button, the entire Project is retrieved and
passed to the "Edit Project" window.
Tasks.
The "Edit Project" window receives an instance of a Project entity object and binds
all its properties to the controls on the form, except for the Tasks child list. If we bind
the Project.Tasks list to the list box, we will have no way to undo changes made to that list
if the user chooses to "Cancel" the dialog box.
Despite the fact that the data context can track object changes in order to determine what has been modified, there is no public method we can call to "undo changes" made thus far. There are a few solutions to this drawback.
Firstly, you can instruct the ORM to inherit the entity classes from a base class of your choice. It is within your power to create a base class that supports deep cloning. With this ability, you could clone the original state of the object when any property is changed. Furthermore, you could add a method to undo the changes made to the object that will revert it back to the original values. However, implementing deep cloning is a bit intensive and would certainly result in performance loss. This would be compounded with each child object depth. Of course, you could implement a maximum depth and allow it to be specified on a case-by-case basis. Good luck with this one!
Alternatively, you can take the approach of flagging objects for modification and/or deletion. Our example
application is most concerned with undoing deleted tasks. We could initialize a separate list of the
project's tasks. That way, when the user presses the "Delete" button for a task, the Task
object is "marked" for deletion and removed from our separate list (which is bound to the
list box). This will leave the object in the original Project.Tasks list, but remove it from
the list used to display the tasks in the list box. If the user clicks the "OK" button of the
"Edit Project" dialog, we can delete those tasks that have been marked for deletion before submitting
changes to the project.
Another issue some developers encounter involves disconnected scenarios. Suppose you retrieve an entity or collection of entities via a LINQ to SQL data context in a separate physical layer. In a case like that, the original data context used to retrieve the data will cease to exist by the time it travels to the user interface layer and back again for updates. There is some support for such a very common architecture.
If you prefer to use the generated entity classes that the ORM built, you can pass instances of those
between your layers and get results. Beware of the limitations though. You can re-attach the objects
to a new data context for updates. However, the Attach method of a context's Table
instance does not support a deep object graph. In fact, it only supports itself. That is, when you attach
an object, only that object is attached. If your object has child objects related to it via foreign keys,
those objects will not be attached. You must attach the child objects individually in order
for them to participate in an update.
LINQ to SQL usually uses the original state of an object to maintain optimistic concurrency control
during updates. Since, the object tracking is severed during the disconnected scenario, you will have
to provide that original object, or an alternative, when re-attaching the object for updates. The re-attachment
must receive the original object in addition to the modified one or your
entity must have a version member. A version member is typically defined as a timestamp
field in your database, and you will definitely want to flag the generated property as read only.
These requirements are necessary for LINQ to SQL to determine if the data you wish to update has already been modified by another user. Keep all of this in mind if you wish to pass the generated entity objects between your layers as opposed to messages.
LINQ to SQL has many great performance tweaks in-place. A lot of thought went into the way data is retrieved.
If you take a moment to examine the SQL that is generated while in debug mode or via the data context's
Log property, you will see evidence of the care put into the statements.
Another performance feature is deferred loading, or "lazy loading". Deferred loading prevents
certain data from being retrieved before it is actually needed. There is an option available per property
on your entities for "delay loading" the property value. However, all child objects default
to this behavior. In our example application, the Project entity has a property for the
Tasks defined under it. When a project is retrieved via LINQ to SQL, only the data from the
corresponding table is retrieved. If, however, I access the Tasks property, another query is
issued to the database to retrieve those tasks. This is by design to prevent unnecessary load on the database
for "just in case" scenarios. However, there are times when we know that we want the
child objects to be loaded.
The time tracking application displays a menu on its notification icon (system tray icon) containing all the projects in the database. Each project menu item also has a list of child menu items for each associated task. Knowing that we need to build these menus ahead of time affords us the vision to pre-fetch all necessary tasks at the same time we get all the projects. If we can instruct LINQ to do all of this in one query, we can save some performance from multiple queries (1 query instead of 1 for the projects, plus 1 for each project to get the tasks).
We will use a feature of LINQ to SQL to achieve this "eager loading". The data context has
a property named, LoadOptions, that accepts an instance of DataLoadOptions.
This is how we specify that we want to go ahead and load one or more related child entities in one query.
// Pre-fetch the tasks for each project.
DataLoadOptions Options = new DataLoadOptions();
Options.LoadWith<Project>(P => P.Tasks);
DataContext.LoadOptions = Options;
var ProjectList = from P in DataContext.Projects
orderby P.Name
select new ProjectSummary {
ProjectId = P.ProjectId,
Name = P.Name,
Tasks = (
from T in P.Tasks
orderby T.Name
select new TaskSummary {
TaskId = T.TaskId,
Name = T.Name
}
).ToList()
};
VB:
' Pre-fetch the tasks for each project.
Dim Options As New DataLoadOptions()
Options.LoadWith(Of Project)(Function(P) P.Tasks)
DataContext.LoadOptions = Options
Dim ProjectList = From P In DataContext.Projects _
Order By P.Name _
Select ProjectSummary = New With { _
.ProjectId = P.ProjectId, _
.Name = P.Name, _
.Tasks = ( _
From T In P.Tasks _
Order By T.Name _
Select TaskSummary = New With { _
.TaskId = T.TaskId, _
.Name = T.Name _
} _
).ToList() _
}
LoadOptions
property of the data context. In this case, we choose to pre-fetch the child Tasks of each
Project retrieved in the query.
This is a truly great feature! However, when we examine the generated SQL after making this setting, there
is a potential performance problem. Notice in Figure 6 how the SQL that LINQ generates uses a join to
get the data from both the Project and Task tables.
Projects and Tasks at once.
My initial response to seeing the above generated SQL was a bit of surprise. My reason for surprise
is due to the data retrieved. Each field selected from the Project table will be duplicated for
each Task row retrieved.
Tasks exist for a Project, all the fields selected from the
Project are duplicated for each Task also retrieved.I expected to see 2 queries; one to retrieve the necessary projects and another to retrieve the matching tasks for those projects (via a correlated subquery if a "where" clause existed for the project query). I expected the results of the 2 queries would then be used to map tasks to projects when building the object graph.
Not being one to just accept such results, I e-mailed Scott Guthrie, a general manager at Microsoft who has written several blog posts about LINQ. I asked him why this behavior is found in LINQ for pre-fetching related entities. Scott deferred the question to Dinesh Kulkarni, a program manager on the LINQ project. Dinesh works specifically on LINQ to SQL. His response summarized the many options his team considered with regards to performance and eager loading, or pre-fetching. The team apparently ran several tests on different sets of data of varying sizes. Some solutions worked well on some data sets and not on others. After much thought, they opted for the current solution, even though he expressed their unhappiness with the performance.
Dinesh also referenced a detailed blog post by the senior performance architect, Rico Mariani. I highly recommend reading the blog post when you have some free time. The post does touch on some of these very issues of performance.
After reading Dinesh's reply and the blog post, I came to the conclusion that there is no "silver bullet". I can absolutely see times where multiple queries would not yield good performance due to correlated subqueries. Imagine a parent query that has some complicated filters in the "where" clause. Each child query that is pre-fetched will have to have a subquery that correlates it with the parent query with the same complicated filters. That kind of behavior would compound the complexity greatly in those situations.
It is great that you can respond to certain events inside your entities. While exploring some of these options, I experimented with the idea of setting some default values for a new object when setting one of its properties to a related entity instance. This worked surprisingly well.
C#:public partial class Task {
partial void OnCreated() {
// Subscribe to the PropertyChanged event when created.
this.PropertyChanged += new System.ComponentModel.PropertyChangedEventHandler(Task_PropertyChanged);
}
void Task_PropertyChanged(object sender, System.ComponentModel.PropertyChangedEventArgs e) {
if (e.PropertyName == "Project") {
if (Project != null && TaskId <= 0) {
// Set the defaults of this new Task based on the assigned Project.
EstimationComparisonType = Project.EstimationComparisonType;
DetectAway = Project.DetectAway;
IsBillable = Project.IsBillable;
}
}
}
}
VB:
Partial Public Class Task
Private Sub OnCreated()
' Subscribe to the PropertyChanged event when created.
AddHandler Me.PropertyChanged, AddressOf Task_PropertyChanged
End Sub
Sub Task_PropertyChanged(sender As Object, e As System.ComponentModel.PropertyChangedEventArgs)
If e.PropertyName = "Project"
If Not Project Is Nothing AndAlso TaskId <= 0
' Set the defaults of this new Task based on the assigned Project.
EstimationComparisonType = Project.EstimationComparisonType
DetectAway = Project.DetectAway
IsBillable = Project.IsBillable
End If
End If
End Sub
End Class
Task entity instance
when it is assigned a parent Project value.
The code in Listing 7 will be executed if a new instance of a Task object is assigned
an existing Project value via its related property. In such a case, the Task
will have some of its properties set to defaults from the parent Project object.
One other thing to look out for when working with LINQ to SQL is a very obscure error that I received
a couple of times. This error is more prone to occur when you edit the schema of your database without
updating the DBML file. I received a ChangeConflictException with the message,
"Row not found or changed", several times while running the sample application. It turned
out to be something very simple. I changed the sample schema to allow nulls for a couple of fields and
forgot to update the DBML file for my LINQ entities. All I had to do to fix this was to indicate that
nulls were allowed for the corresponding entity property. In the case of my custom enumeration property,
I also had to use the System.Nullable generic type when specifying the enumeration type. I
expected the ORM, or DBML designer, to imply the Nullable type when I set the property's
Nullable flag to "True". However, it did not, so I had to set the property's
Type as indicated earlier, "System.Nullable<EstimationComparisonType>".
In a field filled with acronyms, one of the newest to the .NET Framework is extremely powerful. LINQ offers us a simple interface to our data. Despite the challenges it creates, it can be a very valuable tool in a developers "kit". Using LINQ to SQL with SQL Server Compact Edition should certainly speed up the development process for your next application requiring embedded data. All you need is a little knowledge on making it work best and some free time to give it a shot.
Good luck and have fun!
General
News
Question
Answer
Joke
Rant
Admin
|
PermaLink |
Privacy |
Terms of Use
Last Updated: 25 Mar 2008 Editor: |
Copyright 2007 by Matt Sollars Everything else Copyright © CodeProject, 1999-2009 Web21 | Advertise on the Code Project |