Click here to Skip to main content
12,887,921 members (44,438 online)
Click here to Skip to main content
Add your own
alternative version


23 bookmarked
Posted 25 Apr 2010

The Deep Table

, 4 May 2010 CPOL
Rate this:
Please Sign up or sign in to vote.
A natural way of joining tables using SQL


When working with relational databases, it is almost impossible not to talk about SQL. With this language, a huge variety of operations and queries can be performed on a database and, depending on its implementation and use, in a very efficient way. Nevertheless, although it is clearly a human readable language by design, it soon starts being complicated to be read. For example, the select query:

SELECT id, date, customerId FROM tSales

reflects perfectly the intention of the query, that is:

"want from the table tSales: id, data and customerId"

But, this other query:

SELECT id, date, customerId, FROM tSales, _
	tCustomer WHERE tSales.customerId =

Although it is also explicit, somehow upsets the developer's eyes. We specify the way the record from the table tCustomer has to be accessed, while in a given schema it usually would be the only way to do that. In this case, the intention is:

"want from the table tSales: id, data, customerId and the customer's name"

Something like "customer's name" could be implemented if we previously had defined the relationship tSales <-> tCustomer. But not in SQL, since this kind of relationship between tables cannot be included in the database schema, at least in a way that could be used in SELECT queries. But as usual, when the language cannot achieve something, we have the code generation.


This article will only treat the problem of building SELECT statements and more specifically for joining tables. Well, the problem is not new and therefore there are many tools and libraries in different languages that help in this matter. Usually graphical solutions are preferred, but when the database schema becomes large (real world) the graphic itself starts to be another problem, and in general does not provide much better documentation than the generated and "unreadable" query.

Here, it is not really presented a tool for all databases in the market, but the idea of a "deep select" against only one table (deep table), the algorithm to convert it in a standard SQL and the way a graphical interface can use it to facilitate the development as it is made in the attached demo.

The Deep Table

Assuming we have defined table connections in a database, the meaning of the deep table is almost trivial: A table where not only its columns can be selected but also the columns of the connected tables and if these are connected as well, the columns of their connected tables and so on.

For example, having the database schema:

table tISOCountries  (isoA2, isoA3, number, name)
table tCompanies     (id, name, countryId)
table tProducts      (prodID, providerID, name, price)
table tSales         (id, date, customerId, productId, quantity)

We want to be able to express a deep query over the table tSales like:

Deep select on table tSales:

customer name
customer country name
product name
product provider name
product provider country name

That is, just a set of "deep columns" where 'customer', 'product', 'provider' and 'country' are the supposed connections, and the last item a real column of a real table. This syntax is pretty readable, at least in English language while in Spanish for example it would be better a reverse order of the words. For example,

deep column : customer country name
english     : Customer's country name
spanish     : name del country del customer = Nombre del pais del cliente

Although this reverse order could be interesting as well, we will keep it in English (don't worry. ;)

A query that can be generated with the previous deep select in SQL92 syntax is:

SELECT AS id , AS date , AS customer_name , AS customer_country_name , AS product_name , AS product_provider_name , AS product_provider_country_name ,
   tSales.quantity AS quantity
   tSales ,
   tCompanies AS customer ,
   tISOCountries AS customer_country ,
   tProducts AS product ,
   tCompanies AS product_provider ,
   tISOCountries AS product_provider_country
   tSales.customerId ==  AND
   customer.countryId == customer_country.isoA2  AND
   tSales.productId == product.prodID  AND
   product.providerID ==  AND
   product_provider.countryId == product_provider_country.isoA2

And a demo data result of the query:

Possible result table

Note that the generated column names as well as the table aliases are pretty readable. Not all the systems that generate code have this feature which is very important in the development.


A connection between two tables is nothing new, for instance SQL92 describes these connections using foreign keys. However, although syntactically and conceptually can be the same as the connection I present here, the use of the foreign key is strictly for Referential Integrity, that is, used as constraint on write operations like INSERT, DELETE and UPDATE. The connections that deep table needs are only thought for SELECT queries, therefore its definition is less restrictive than foreign keys. On the other side, if the foreign keys were given in the schema, there is nothing against using them as additional connections, thus avoiding double definitions in these cases.

Like in foreign keys, a connection can be defined with a set of columns of a table (source table) that reference a record of another table (target table). The following structure defines an entry of a connection:

connectionName, sourceTable, sourceKey, targetName, targetKey

where connectionName is what we will use in the deep selection (deep columns), and it is defined with two tables and the column(s) that connect both. If the connection needs more than one column, then it is enough to use more entries with the same connectionName, sourceTable and targetTable.

connectionName, sourceTable, sourceKey  , targetName   , targetKey
   customer      , tSales     , customerId , tCompanies   , id
   product       , tSales     , productId  , tProducts    , prodID
   provider      , tProducts  , providerID , tCompanies   , id
   country       , tCompanies , countryId  , tISOCountries, isoA2

Here is a graphical representation of these connections according to the given schema:

Connections graphically

It is also possible to think about not specifying the sourceTable within a connection, for example:

connectionName, sourceTable, sourceKey  , targetName   , targetKey
   country       ,            , countryId  , tISOCountries, isoA2

In this case, all tables having a column countryId are connected to the table tISOCountries automatically. This is an "open connection" and this is how it can be represented:

Open connection

Algorithm for Deep Select

The main goal of the "deep table" is to generate a SQL select statement like the one presented above in the section "The deep table". For this generation, we do not even need the real schema of the database, from a set of connections and a "deep select" (set of deep columns of a table), we can generate the SQL as a text like it will be described here.

For the algorithm, it is convenient to define the "deep column" structure as follows:

baseTable, [ connection1, ..., connectionN, ] columnName

With this structure, the base table is the same for all deep columns, in our example it would be tSales.

If we take a look at the select query of the given example, we see that the problem can be reduced to getting three lists, one for each clause : SELECT, FROM and WHERE. Initializing the FROM list with "baseTable", the rest can be done applying the following algorithm to each deep column:

Deep column reduction algorithm

After having the three lists, building the final SQL is straightforward.

Note that duplicate entries specially in the "FROM" list has to be avoided. Also note that with this algorithm, all aliases will start with the initial alias (baseTable). For instance, it would produce " AS tSales_customer_name" instead of " AS customer_name" as it is given in the example. If you prefer shorter alias names, it is possible to remove the prefixes starting with "baseTable_" during the algorithm.

Deep Schema

If having connections as defined in this article, it is convenient to work with an extended database schema that reflects these connections between tables. In our example, a simple extended schema would be:

table tISOCountries :


table tCompanies :

   o--o Connection [country] to table [tISOCountries] through (countryId)

table tProducts :

   o--o Connection [provider] to table [tCompanies] through (providerID)

table tSales :

   o--o Connection [customer] to table [tCompanies] through (customerId)
   o--o Connection [product] to table [tProducts] through (productId)

It can also be useful for a tool to develop "recursively" all connections of a table, either into a list form or in a more graphic way like using a tree. For example, for the table tSales:

Deep table schema

Here, it has to be taken into account that a recursive connection (table connected to itself) or some other cyclic recursion would generate an infinite list which has to be avoided. At least in the list form while in the tree, it can be accepted if this is loaded dynamically on demand (e.g. when opening a node).

A tool with these elements can generate or even execute a deep SQL just by selecting the desired columns.


Without adding new big concepts, just the deep column as a way to access related information from a table (e.g. "connection connection ... colName") we can perform lots of queries that would result much more complex using directly SQL because of the associated joins. All in a quite natural way (e.g. customer name = customer's name) and producing queries with readable column names (customer_name). It is also a way of seeing a record of a table containing not only values (e.g. "id", "date") but also "objects", where the connections are the objects, for instance "customer name" can be considered as the access to the field "name" of the object "customer" (e.g.

Our generated SQL make use of INNER JOINS, note though that this can be considered a matter of implementation. The algorithm can be adapted in order to generate other join syntaxes.


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

I have a Telecomunications Engineering degree but since I learnt the first language (Pascal), almost I haven't stopped developing software. Mainly with C and C++ but I have touched many other languages and scripts. I use also to develop for fun (reinventing the wheel, of course!), currently almost all of such programs and tools are written in java (visit

You may also be interested in...

Comments and Discussions

GeneralNice concept - but has been done before Pin
Colin Eberhardt28-Apr-10 3:03
memberColin Eberhardt28-Apr-10 3:03 
GeneralRe: Nice concept - but has been done before [modified] Pin
Alejandro Xalabarder28-Apr-10 11:05
memberAlejandro Xalabarder28-Apr-10 11:05 
Generalhas been done before ... but HOW Pin
Alejandro Xalabarder28-Apr-10 12:37
memberAlejandro Xalabarder28-Apr-10 12:37 
GeneralRe: has been done before ... but HOW Pin
Colin Eberhardt28-Apr-10 21:38
memberColin Eberhardt28-Apr-10 21:38 
Hi Alejandro,

Glad my comment was of interest to you!

I have not come across CacheSQL before, but have used Hibernate extensively (albeit a few years back). Yes, you are correct, Hibernate is an O/R mapper, so you must map your tables to objects in order to use the query language (although it does much more than just permit querying of database backed objects!). However, hibernate does not load all the data into memory then query the objects. The query, with its implicit join syntax, is typically translated directly into SQL and executed on the database.

I am not aware of any documentation that described how it parses the Hibernate Query Language (HQL). My knowledge of it comes from turning on the debugging in applications and watching the SQL queries executed on the database as a result of my HQL queries.

I like your idea of taking the simplest approach, applying the implicit join concept direct to SQL. And, no, this is not present in any form in the ANSI SQL standard.

Anyhow, interesting article.

Have fun,
Colin E.
GeneralHow deep is HIBERNATE ? Pin
Alejandro Xalabarder5-May-10 12:24
memberAlejandro Xalabarder5-May-10 12:24 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Permalink | Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.170424.1 | Last Updated 4 May 2010
Article Copyright 2010 by Alejandro Xalabarder
Everything else Copyright © CodeProject, 1999-2017
Layout: fixed | fluid