Click here to Skip to main content
Click here to Skip to main content

Understanding Table Joins using SQL

, 22 Jul 2013
Rate this:
Please Sign up or sign in to vote.
Joins are useful for bringing data together from different tables based on their database relations. First we will see how the join operates between tables. Then we will explore the Order of Execution when Join and where condition both exists. Finally we will move our exploration to the Importance o

1. Introduction 

Joins are useful for bringing data together from different tables based on their database relations. First we will see how the join operates between tables. Then we will explore the Order of Execution when Join and where condition both exists. Finally we will move our exploration to the Importance of the Join order.  

2. Run the attached script   

The downloaded script has the three table and table data for this article. You should also run the Northwnd script as some example here uses the NorthWnd DB. Once you downloaded the script CreateObject.zip run the script in the NorthWnd Db.   

Below is the three tables created by the Script: 

 

We are going to use these tables to perform the joins. These tables are just for demo purpose and So I do not have proper table relationship in terms of Primary and Foreign keys. OK, Let us move on.

 

 3. Cartesian Product of Table 

Usually join will be performed between two tables based on the key columns between two tables those together constitutes the database table relationship. For Example DeptId in the employee table and DeptId in the Department table make the relationship between these two tables. 

The below one is the example Joining the tables without using the key columns. Here, TableA and TableB are clubbed together to form the whole result set based on Cartesian Product. The Cartesian product will take a single record in the first table and attaches with all the records in the second table. Then takes the Second records in the first table and attaches with all the records the second table and this continue till the end of the records on the first table.

 

4. Joining Two tables   

When joining the two tables to avoid the bulk number of records that results as shown in the previous example, we should chose a join column from both the tables. The example given below joins Table_A and Table_B based on the Column called ID. Since the column in both the tables mapped based on the ID column, we will reduce huge records that are logically not useful and not related the ID map. 

Below is the Result of the Join: 

 

 Note that the Row Number 1 and Row number 5 are returned as the join result as they satisfies the mapping condition A.Id = B.Id. In the query it is shown in the Red Box.

 

5. Joining multiple tables   

The above example joint two tables. To join multiple tables, we should use the result of the previous join and pick a column from it, then pick a column in the new table then specify the join condition as the previous example. This way we can join multiple numbers of tables. Consider whatever joint so far as the single table and join it with the new one. 

 

First Table_A is joins with Table_B, which is nothing but the previous example. Then the joint result of A and B is considered as single table say AB. Then the AB is joint with the Table_C. This is shown in the below picture:

 

 

6. Join Types

There are three type of join available based the way we join columns on two different tables.

1) Full outer Join 

2) Inner Join

3) Left outer Join

4) Right outer Join 

What we saw in the previous two examples are the inner joins. If we join the same table we called it as Self join and it is special category do not get confuse it with the join types. Let us see an example for the join types in next coming examples.

Before we go into those examples, remember that the result computed so for is considered as LEFT and the new table coming to join the existing result is RIGHT. This is useful when we are joining multiple tables with different type of joins. 

 

7. Full Outer Join    

A full outer join is somewhat different from the Cartesian product. Cartesian product will get all the possible row combination between the two joining tables. But, Full outer join takes all the matching columns by join condition, all table rows from the left table that does not match the right side table, and, all tables rows in the right that does not match the left. It applies null for unmatched row on the other end (Left side Table column or Right side table column) when doing so. The below example shows the full outer join between Table_A and Table_C 

 

 

 

1. In the above picture the Blue Row the matching row on both the table.

2. Second row (Green First, red next) is the unmatched one. Row exists on the Left table and null substituted for all the columns in the Right.

3. Third row (Red First, Green next) is also the unmatched one. Row exists on the Right side table, null returned for the left one

Look at the from clause,

The Table_A is taken first and joint with Table_C. Here, The result set computed so for always treated as Left side of join and the new table going to be joint is treated as Right side of join. 

 

8. Left Join   

Left join makes sure to take all the rows on the left side of the table by placing the null entries for the table joining on the right side when there is unmatched row on the right side. 

 

In the above example, Id value of 2 in the Left table does not exist on the right side table Table_C.Id. But, we still got the 2,BBB row from the Table_A by placing the null entries for the right side table. This is shown in Green and red boxes above.

Also note that when SQL is processing, it takes the rows for the Table_A first (So Left) then joins it with the Table_C (Right  side). It does not matter whether we provide A.Id = C.Id or C.Id = A.Id 

 

9. Right Join  

It is the reverse of the left join. It implies take all the rows on the right side of table by placing the null on the left table for unmatched rows. Below is the example for it: 

  

Blue Box : Matched rows.

Green : Row exits on the right side table Table_B and match (Based on Id column) not available on the left

Red : Null placement for the columns 

 

10. Inner Join  

In inner join only the matched rows are retrieved. Please refer the section four. Inner join returns same result and hence one no need to worry about the placing a table on left or right. 

 

11. Self Join 

Joining the table with the same table is called the Self Join. To explain this let us go the table on the Northwnd database (Attached with this article). Have a look at the columns in the employee table. The EmployeeId column is the Primary key column and each row belongs to a single employee. The reports to column refer some other row in the same table stating that referred row is the manager for the referring row. But the referred row (manager) is also an employee possible having a valid entry on its ReportsTo column. So in the NorthWnd database this relationship achieves a hierarchical reporting structure. 

 

Now have look at the below example:

 

Here, the row pointed by ReportTo column is Manager. So the table on the left hand side is employee and table on the Right hand side is Manager. When the FirstName is referred on the left table of the joint result, it is Employee name and the same FirstName on the right table of the join result is Manager name. 

 

12. Execution Sequence   

When the query involves the combination of the outer and inner join the execution sequence is important. If you have only inner join the execution sequence is not much important as they are going to provide the same result. Well, What I am taking about? 

Let is Say you have a query, which has both inner join and outer join (Left or Right). Also let us assume you have where clause that filters the records based on column that is not a joining id column. The question comes which operation is performed first. We have two options:

1) Apply the where clause first because joining later will improve the performance then perform the join 

2) Apply the Join then perform the where clause 

The above two option returns same result when all the joins involved are inner joins. But the result may differ when we have at least one outer join. OK. SQL chose the second option because as per its operation Sequence From clause is completed first, then applies the where clause

Given below is an Example and the result:

 

How the Sequence differs is shown below:

Option 1:

 

 

Option 2:

 

So keep in mind the operation sequence as SQL first completes the join first then applies the where clause when the query has one or more outer joins.

 

13. Order of the Joins 

Like the Operation sequence the Order of the join also important when you want mix the inner joins with outer (Left or Right) joins. Again, if the entire join involved between the tables are inner joins the order of the join does not matter. But it matters when we mix the inner and outer joins. 

What is Order of the Join?  If my query joins three tables like [X inner Y] Left Z, the order here is inner join performed first, and then the left join.

OK. Let us go back to the NorthWnd Database again. The result you want to achieve is Get all the customers names whether they have order or not. Also list the Qty of order placed by the customer if they actually has some order.

Look at the Query and result below: [Outer Join then Inner Join] 

 

From the above query, you can see the order of join as mentioned below:

1) A right join between Orders and Customers. SQL first queries the Orders table (As it appears first) and treats the result as Left. Then it queries the Customers table next and treats the result set as Right. Finally from both the result set Right join is performed that means SQL ensures you that it will not lose any rows on the Right side result set that is it will not lose any rows from the Customers table. So you will get all customers including the two who don’t placed any orders and since a matching records for those two rows are not available you will get null columns for the Orders. Now the resulting join result is available for the next join and this join result is now treated as Left. 

2) Now the above returned result (Left side) is joint with the Order Details table. SQL knows it already has the Left result set so it query the table Order Details to have the Right. Finally an Inner join is performed between Left and Right based on the Order Id. But note that we have two null entries for the ordered column for which there are no corresponding customers in the Left side result. So the Inner join just skips those records. So we got a total of 2155 missing the two customers who does not place any orders, which is not the result we need. Read the Underlines text at the top of this section. 

 Now look at the Query and Result below: [Inner Join then Outer Join] 

Here, Inner join based on the OrderId between Orders and Order Details is performed first. This result (Left side) is then Right joint against the Cutomers table [Right].

Now let us analyze how this is giving the result we want.

The inner join between Order and Order Details brings all the matching records based on the order id. Note that we are not losing any order id here by null values. Then by keeping this already brought result on the left, Customers table is queried and kept in the Right. Then the right join is performed between Customers and Left side result based on the Customer Id. Now we get all the customers including the two for which we don’t have any matching records on the Left side table.

So…

Keep in mind that join order is important when you mix the inner join with outer join. 

 

14. Other way of achieving the same result 

When I had a talk with one of my office friend (VaraPrasad), he told that the result you are expecting could be achieved without using the Right Join. How? That is the question I asked him. He told that Crystal reports does it, I will show you. Good, now this section is added to this article based on what I got from him. 

 

 OK. Now let us see how this works and gives the expected result of not losing any customers. Note that the Rule remains same, whatever computed so for is Left and the Joining table is on the Right.

 

1) SQL first queries the table Customers and keeps it as the result on the Left.

2) It reads the Open parenthesis, and queries the table Orders and keeps it Left again. Why? SQL Says “Boss, I know that I am not going to join this table and Table I am going join is not ready yet. So I kept is Left side result.

3) Now the Order Details table is queried and kept as Right side of join as Left side is already available.

4) A join between Order and Order Details is performed based on the Order Id. The resultant records are treated as right because the Customer table is already queried and kept in the Left. Now thee left join between the Left and Right side of result set brings all customers as the join type left outer join.

Note: The scripts for creating the Demo tables and NorthWnd database is available as download. 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Sivaraman Dhamodharan
Software Developer iSOFT
India India
I am working as software engineer in iSOFT R&D. I have been come accross C++,MFC, .net technologies. I do like playing video games, reading books.
 

Web: www.mstecharticles.com


Comments and Discussions

 
GeneralMy vote of 2 PinmemberJasmine250113-Aug-12 9:14 
GeneralRe: My vote of 2 Pinmembersirama200415-Aug-12 16:50 
GeneralRe: My vote of 2 PinmemberJasmine250116-Aug-12 5:10 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web01 | 2.8.140721.1 | Last Updated 22 Jul 2013
Article Copyright 2012 by Sivaraman Dhamodharan
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid