Introduction
It is well known that the ORM can help a lot in software development. In fact, they solve many problems that as developers we had in the past, specially talking about opening and closing connections.
But as ORM frameworks are a powerful tool, they can also make a mess if they are not used in the proper way. Let me give you an example.
In the project I'm current working on, some users started experiencing some slowness when trying to load a page. I was assigned to investigate what was happening and propose a solution. I started by doing some profiling in the web application using Glimpse, adding some Trace messages to determine where was the problem and I noticed that it was in a WCF service call that was taking 20 seconds to retrieve a single record with a lot of columns (because of all the needed joins).
This was by far unacceptable, so I started to doing further profiling on the Entity Framework using Entity Framework Profiler, and discovered that the slow service data was being fulfilled by a 4,000 lines query!.
I clearly understand that EF made a lot of extra and maybe unnecessary joins in order to automate the data extraction, but, 4000 lines?! Really?!
After inspecting the long long query that EF Profiler showed me, I started noticing some duplicated SELECT
statements. Then I went back to the service call and noticed that the code was something like this:
...
AsQueryable()
.Include("Account")
.Include("Account.Client")
.Include("Account.Client.Organization")
...
This seemed to be good on the first look, since we wanted to get the Account, the Client of that Account, and the Organization. But what EF is understanding is that we want to get the Account, and then
AGAIN the Account and the Client, and then
AGAIN the Account,
AGAIN the Client and the Organization. This then was causing that duplicated
SELECT
statements I was talking before. So to fix the problem, I deleted the extra lines and the code ended up like this:
...
AsQueryable()
.Include("Account.Client.Organization")
...
and then by that, I reduced the lines of the resulting query, removed all the duplicated SELECT
and reduced the execution time to 2 seconds!
So, as you can see, two extra lines can screw all the work you're doing. So be careful of preventing this and do profile your applications before they go live. CodeProject