Click here to Skip to main content
15,860,972 members
Articles / Programming Languages / SQL

The Entity Framework (v1 and v4) Deal Breaker: TPT Inheritance

Rate me:
Please Sign up or sign in to vote.
4.67/5 (6 votes)
18 Mar 2010CPOL5 min read 21.6K   12   2
So what’s wrong with EF’s TPT inheritance? The SQL that EF generates. With some simple testing (2 or 3 subclasses) everything may appear to be fine. In fact, you may even be impressed that EF manages to fetch all the data you need in a single trip, indeed, in a single query.

3/17/2010 Update: I have submitted this problem as a bug on Microsoft Connect. If you've been bitten by this bug, vote it up (click the “I can too” link indicating that you can reproduce the bug), and leave some helpful comments.

I'll keep the introduction short and sweet. Everyone says you'll eventually get burned by Microsoft data access strategies, because they eventually get abandoned. Well, my turn finally came. And it’s not that Entity Framework is being abandoned, it’s just that it DOESN'T WORK. 

Regarding inheritance mapping, EF advertises that it offers you 3 solutions:

  • TPT: Table per type
  • TPH: Table per hierarchy
  • TPC: Table per concrete class

I'll be focusing on TPT: Table per type.

Let’s say I have a line of business application where users answer a bunch of Yes/No questions, and depending on their answers, fill out a series of forms, each form having a unique set of fields that we want to capture end-user data for.

Q: Did you open a bank account?
Yes
Name of bank:
Date account opened:
Purpose of account:
etc.

EVERY form has a common set of fields as well: The end-user that the particular instance belongs to, the date it was created, whatever. Now, when I have a User object, and I want to see all the forms he’s filled out, I don't want to have to access 20 different collections of the User object, one collection for each form type:

C#
User user = User.GetByKey(20);
user.NewAccountForms;
user.TaxForms;
user.ReplacedLimbsForms;
user.CentralServicesForms;

I want an IList<FormBase> or something similar. This means I need to use inheritance, and each form needs to be derived from my FormBase class (actually named FormInstance in this case).

If you need this type of design, or think that your software at some point in the future will require this type of design, DO NOT use the Entity Framework. The Table-Per-Type inheritance is 100% unusable. If you feel that one of the other forms of inheritance suits you, well, you may be crazy, but just the same….tread lightly. (And please don't have 50 different discriminator values in a table that’s 200 columns wide…the SQL gods will strike you down).

So what’s wrong with EF’s TPT inheritance? The SQL that EF generates. With some simple testing (2 or 3 subclasses) everything may appear to be fine. In fact, you may even be impressed that EF manages to fetch all the data you need in a single trip, indeed, in a single query. Wow, their SQL generation must be top notch… But once you start adding some more subclasses, surprise! Your application grinds to a halt. So how bad is it? I decided to find out definitively.

Results

The setup is very basic. Here’s the Linq query I'm executing against my model:

C#
var q =
    from insts in context.FormInstances
    select insts;

List<FormInstances> instances = q.ToList();

Very basic. Give me the FormInstances. FormInstance is the base class. I started with just a single derived class, ran my test 500 times, averaging the time it took to execute my simple query. Then I added a 2nd derived class, a 3rd one, etc . The base class has less than 10 fields. Each derived class (to keep things consistent) is exactly the same, with 3 simple fields. Also, the tables are empty. Yes, empty. No data! So the performance numbers here strictly represent the time it takes EF to generate its abhorrent SQL, and the amount of time it takes SQL Server to grind it out.

Most of these are averaged over 500 executions. Some of the later ones were fewer, because it started to take over an hour to execute.

Num Derived ClassesAvg ms for query generation and executionLines of SQL Generated
11680
235106
367152
4119214
5203292
6340386
7523496
8784622
91137764
101608922
1121981096
1229581286
1339001492
1448281686
1559621892
20174123302
25412065112
301088077862

So what happens when your application has 30 subclasses? It takes 108 seconds to generate and execute the SQL to retrieve the records in them! That’s 3.6 seconds PER FORM! And that’s when the query returns ZERO RECORDS, because the tables are EMPTY. Why does it take so long? Well, look at the 7,862 lines of SQL it generates, to start with (I won't post it here; I'll spare you that).
Here’s what it looks like graphed:

Ouch! That sure doesn't look linear folks… And guess what, my “I got burned” experience doesn't have 30 form tables. It has 89. Based on the trend outlined above, it would take 97 minutes to generate and execute the SQL to select the data from those 89 tables. Funny, because 30 SQL statements querying those tables, even with WHERE fk IN(SELECT id FROM BaseTable) clauses on each one only take 132ms on the same computer… And it returns the same data.

Pragmatists vs Idealists

Joel Spolsky has some great words (scroll down to “The Two Forces at Microsoft”) regarding two developer camps at Microsoft; the Pragmatists, and the Idealists. Windows XP was a product of the Pragmatists; Vista, of the Idealists. Enough said. While the SQL generated by EF in cases such as these may call to the inner relational mathematician idealist in us, the result is impossible to read or easily understand, takes a long time to be generated, and even longer to execute. In other words, it’s 100% unusable.

Now, as a matter of research, I have to create this same test in NHibernate, to see what the generated SQL looks like, and to see how well it performs. I'll update this post when I have that data. Until then, if you decide to use EF’s inheritance “features”, here is a list of movies that are all 97 minutes long, so that your end users will have something to do while your site grates off your SQL Server’s face from the comfort of their browsers.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior) CB Richard Ellis
United States United States
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
AnswerOrigin of problem: DB/Query design Pin
Christian G. Becker27-Mar-19 7:22
Christian G. Becker27-Mar-19 7:22 
QuestionDid this ever get fixed/improved? Pin
alexed110-Apr-14 12:20
alexed110-Apr-14 12:20 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.