Click here to Skip to main content
Click here to Skip to main content

Difference between And clause along with on and Where clause when used with left join in SQL Server

By , 11 Mar 2013
 

Introduction

The purpose of this article is to show the difference between And clause with On and Where clause when using with left join in SQL Server.

Background

I have been working on SQL Server databases for more than 5 years and I was not aware of the difference between the On clause and Where clause when used with left join. Once I asked this question to our DBA (who has 10+ years of experience) and he said that there is some difference but he was not be able to explain the difference. Then I started exploring this topic myself and got some interesting observations of these two clauses when used with left join. So today I decided to share my knowledge so that everyone could benefit from my work.

Using the code

Today I will explain the difference between On clause and Where clause when used with left join in SQL Server. When the On clause is used in an outer join, the outer table will have no effect on this On clause and all rows from the outer table will be returned and the On clause determines which rows of the subordinate table joins to the outer table. Rows of the outer table that do not meet the condition specified in the On clause in the join are extended with null values for subordinate columns (columns of the subordinate table), whereas the Where clause filters the rows that actually were returned to the final output.

It’s difficult to absorb this from the above definition, so let’s try to understand this difference with the help of  an example. Suppose we have two tables Departments (deptId, deptName) and Employees (empID, DeptID, EmpName, Salary) and deptiD is the foreign key of the department table. An employee can have only one department where as a department can have many employees.

We have this sample data in the Department table:

DeptId      DeptName
----------- ---------------
1           HR
2           Payroll
3           Admin
4           Marketing
5           HR & Accounts

(5 rows)

And here is the sample data for the Employees table:

EmpId       EmpName         DeptId      EmpSalary
----------- --------------- ----------- ---------------------
1           John            1           5000.00
2           Albert          1           4500.00
3           Crain           2           6000.00
4           Micheal         2           5000.00
5           David           NULL        34.00
6           Kelly           NULL        457.00
7           Rudy            1           879.00
8           Smith           2           7878.00
9           Karsen          5           878.00
10          Stringer        5           345.00
11          Cheryl          NULL        NULL

(11 rows)

Case 1

Suppose we are asked to show all the employees and their relevant departments, then we would write a query like this:

select * from employees e 
<p>left join departments d on e.deptid = d.deptid</p>

                    (query1)

and the result of this query is:

EmpId       EmpName         DeptId      EmpSalary             DeptId      DeptName
----------- --------------- ----------- --------------------- ----------- ---------------
1           John            1           5000.00               1           HR
2           Albert          1           4500.00               1           HR
3           Crain           2           6000.00               2           Payroll
4           Micheal         2           5000.00               2           Payroll
5           David           NULL        34.00                 NULL        NULL
6           Kelly           NULL        457.00                NULL        NULL
7           Rudy            1           879.00                1           HR
8           Smith           2           7878.00               2           Payroll
9           Karsen          5           878.00                5           HR & Accounts
10          Stringer        5           345.00                5           HR & Accounts
11          Cheryl          NULL        NULL                  NULL        NULL

(11 rows)
                       (Table 1)

Here you can see that all rows from the Employees table are listed along with their department, if they have a department; otherwise, deptid and deptname are null.

Case 2

Let’s say we are asked to show the list of all employees and the department name of only those employees who have "HR" or "HR & Accounts" department; then we would write a query like this:

select * from employees e 
left join departments d on e.deptid = d.deptid 
     and ( d.deptname = 'HR' or d.deptname  = 'HR & Accounts')
(query 2)

and the result of this query is:

EmpId       EmpName         DeptId      EmpSalary             DeptId      DeptName
----------- --------------- ----------- --------------------- ----------- ---------------
1           John            1           5000.00               1           HR
2           Albert          1           4500.00               1           HR
3           Crain           2           6000.00               NULL        NULL
4           Micheal         2           5000.00               NULL        NULL
5           David           NULL        34.00                 NULL        NULL
6           Kelly           NULL        457.00                NULL        NULL
7           Rudy            1           879.00                1           HR
8           Smith           2           7878.00               NULL        NULL
9           Karsen          5           878.00                5           HR & Accounts
10          Stringer        5           345.00                5           HR & Accounts
11          Cheryl          NULL        NULL                  NULL        NULL

(11 rows)
                           Table (2)

You can notice here that only the same number of rows are returned as in query 1, but here the department of only those employees are listed whose department name is "HR" or "HR & Accounts". As you can see, "Crain"," Micheal", and "Smith" have department "Payroll" (see table 1), but it’s not listed in the above result set because here we applied the filter in the "On clause"; hence only those rows become part of the join that has department name "HR" or "HR & Employee", and as a result, all other employees who don’t have department name "HR" and "HR & Accoutns" show their department name as null. This is an example of the On clause in left join.

Case 3

What if we are asked to show only those employees who have their department name "HR" or "HR & Accounts"? We would write a query like this:

select * from employees e 
left join departments d on e.deptid = d.deptid 
where ( d.deptname = 'HR' or d.deptname  = 'HR & Accounts')                    
        (query 3)

and the result of this query is:

EmpId       EmpName         DeptId      EmpSalary             DeptId      DeptName
----------- --------------- ----------- --------------------- ----------- ---------------
1           John            1           5000.00               1           HR
2           Albert          1           4500.00               1           HR
7           Rudy            1           879.00                1           HR
9           Karsen          5           878.00                5           HR & Accounts
10          Stringer        5           345.00                5           HR & Accounts

(5 rows)
                                      Table (3)

You can see here that only those employees are listed who have their department "HR" or "HR & Accounts", so what we did here is the same query as query 1. You can check the result of query 1 in table 1, and then we just applied a filter in the Where clause to filter the rows and to return only those employees who have their department name as "HR" or "HR & Accounts". This is an example of Where clause in left join.

Now you have seen all three cases and can see how these differ from each other. This was an example of a (1:1) one to one table as one employee can have only one department. Let's execute the above 3 cases with (1:M), a one to many table relationship where one department can have multiple employees (or more than one employee can be in one department).

Case 4

Let’s suppose our requirement is to get all departments and the employees listed in these departments, then we would write a query like:

select * from departments d
left join employees e on e.deptId = d.deptId
        (query 4)

It will return all rows from the Department table and only those rows from the Employees table that have a department. The result will look like:

DeptId      DeptName        EmpId       EmpName         DeptId      EmpSalary
----------- --------------- ----------- --------------- ----------- ---------------------
1           HR              1           John            1           5000.00
1           HR              2           Albert          1           4500.00
1           HR              7           Rudy            1           879.00
2           Payroll         3           Crain           2           6000.00
2           Payroll         4           Micheal         2           5000.00
2           Payroll         8           Smith           2           7878.00
3           Admin           NULL        NULL            NULL        NULL
4           Marketing       NULL        NULL            NULL        NULL
5           HR & Accounts   9           Karsen          5           878.00
5           HR & Accounts   10          Stringer        5           345.00
(10 rows)
                                (Table 4)

You can see in the result that all departments are listed, even those that don’t have any employees, as "Admin" and "Marketing".

Case 5

But what if we are asked to show all departments and only those employees who have department name "HR or "HR & Accounting". Then we would write a query like:

select * from departments d
left join employees e on e.deptId = d.deptId and 
     ( d.deptname = 'HR' or d.deptname  = 'HR & Accounts')
        (query 5)

It will show the following records:

DeptId      DeptName        EmpId       EmpName         DeptId      EmpSalary
----------- --------------- ----------- --------------- ----------- ---------------------
1           HR              1           John            1           5000.00
1           HR              2           Albert          1           4500.00
1           HR              7           Rudy            1           879.00
2           Payroll         NULL        NULL            NULL        NULL
3           Admin           NULL        NULL            NULL        NULL
4           Marketing       NULL        NULL            NULL        NULL
5           HR & Accounts   9           Karsen          5           878.00
5           HR & Accounts   10          Stringer        5           345.00

(8 rows)

                               Table (5)

You can see in the result, this is the On clause in left join; it just return all rows from the Department table and only those rows from the Employee table that have department "HR" or "HR & Accounts". We have employees in the Payroll department (Table 4 as a result of query 4) but it’s not listed in the above result set as we just put an extra condition in the "On clause" that the department name should be "HR" or "HR & Accounts". That’s why only those rows were part of this join which have department name "HR" and "HR & Accounts". Hence all rows from the Department table were returned and only those matching rows were returned from the Employees table who have their department as "HR" or "HR & Accounts".

Case 6

What if we are asked to show only the "HR" and "HR & Accounts" departments along with their relevant employees? We would then write a query like this:

select * from departments d
left join employees e on e.deptId = d.deptId 
where ( d.deptname = 'HR' or d.deptname  = 'HR & Accounts')
        (query 6)

and the result of the above query would be:

DeptId      DeptName        EmpId       EmpName         DeptId      EmpSalary
----------- --------------- ----------- --------------- ----------- ---------------------
1           HR              1           John            1           5000.00
1           HR              2           Albert          1           4500.00
1           HR              7           Rudy            1           879.00
5           HR & Accounts   9           Karsen          5           878.00
5           HR & Accounts   10          Stringer        5           345.00

(5 rows)

                                (Table 6)

So what happened here? We just did a left join as we did in query 4, and you can see the result in table 4. And then we apply a filter on the result set to return only those rows that have deptNmae as "HR" or "HR&Accounts". This is a Where clause in left join. Now you can see how it is different from the result of query 5 where we put this condition in the On clause.

There is no difference between the Where clause and On clause when used with inner join.

I have shown you six different cases to explain the concept. I hope you enjoyed reading the article. I have attached a script that can be used to create the tables and sample data and then you can practice all by yourself. I would be waiting for your feedback.

References

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Hafiz Muhammad Suleman
Software Developer (Senior)
Pakistan Pakistan
Member
I am currently working as Senior Software Engineer and I have more than 6 years of experience in Database and developing web application .

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
GeneralMy vote of 5memberTridip Bhattacharjee14 Apr '13 - 7:01 
good...thanks
GeneralMy vote of 3memberprahalad.gaggar10 Apr '13 - 19:01 
A good article
QuestionCommentsmemberKP Lee21 Mar '13 - 21:45 
Case 3 isn't a good example for an outer join, instead:
select * from employees e
join departments d on e.deptid = d.deptid and d.deptname in ('HR', 'HR & Accounts')
 
would do the same thing MUCH more efficiently. Recognizing when inner and outer joins should be used should be one of your basic skills tool. Twisting things around to keep the outer join going where in inner join would do, is a poor practice.
 
One of the best things for using the where clause on outer joins is finding mismatches and you never covered it:
select d.* from employees e
right join departments d on e.deptid = d.deptid
where e.deptid is null
 
Yeah, I know, all your examples were left joins. Sue me, I'm lazy.
 
The following shows the mismatches in both tables:
select * from employees e
full join departments d on e.deptid = d.deptid
where e.deptid is null or d.deptid is null
 
There are other differences between inner and outer joins. I have had significant performance improvements in queries by moving queries from the where clause to the on clause on inner joins because the on clause is applied at the point of joining. This is because the on clause can work on a few thousand records while the where clause is working on millions of additional rows that matched records that should have initially been rejected and never joined with additional tables that provided the millions of unneeded rows. There is a BIG difference in performance on inner joins on where you put your clauses. For instance you have a table with 10 rows joined with a table with 10K rows, that joins with a table with 10M rows. If your on clause limits the 10 row table to accept 2 rows, because it is indexed the 10K table only looks up 2K records and never touches the other 8K records. Instead of touching 10M rows and rejecting 8M, it only looks at 2M.
 
I've been told that I'm wrong, that the execution plan is smart enough to apply the where clause at the appropriate time instead of at the end. If that was true, I don't understand how my query finishes in 8 seconds and then the original finishes in 25 minutes. Yes, everything I've read also says the where query goes directly to the index field to make selections, but I don't see it in real life when multiple joins are involved.
GeneralMy vote of 5memberabhayojha18 Mar '13 - 8:04 
Hi Hafiz,
It's really great article. I enjoyed reading it. Keep it dude.
GeneralQuery performance between on and where ?memberginoway18 Mar '13 - 5:16 
Interested tips .. but still be aware of the performance, could you please take a more looking in this area Poke tongue | ;-P
GeneralRe: Query performance between on and where ?memberKP Lee21 Mar '13 - 21:48 
Just put in "Comments" reply that addresses some performance differences between the two. It just touches on the differences.
QuestionDifference between on and Where clause when used with left joinmemberMember 989789413 Mar '13 - 21:52 
Sorry, but for me it is surprising that this is something new for somebody working 5 years with a SQL Server. And it is unbelievable that somebody working 10+ years with SQL Server can't explain it.
QuestionTheir IS a big difference - speedmemberGéza Mikló13 Mar '13 - 21:47 
Hi,
 
I think you missed something really important. You are right that their is no difference regarding the results.
 
Let's say, your employee table has millions of records. If we want to select all the employees working at HR or HR & Account and have salary above 1000. Following your logic you do it with query 3 instead of combining it with query 2.
 
The best solution will be:
select * from employees e
left join departments d on e.deptid = d.deptid
     and ( d.deptname = 'HR' or d.deptname  = 'HR & Accounts') AND EmpSalary > 1000
where ( d.deptname = 'HR' or d.deptname  = 'HR & Accounts') AND EmpSalary > 1000
 
This way you will filter the results during the join operation. So WHERE condition will filter a smaller amount of records.
 
So they are definitely different.
GeneralMy vote of 5memberAzziet12 Mar '13 - 20:04 
perfect one
Questionnice articlemembereriksoft8412 Mar '13 - 4:37 
very good and very easy Big Grin | :-D Big Grin | :-D Big Grin | :-D
GeneralMy vote of 5memberkevininstructor12 Mar '13 - 3:33 
Nicely done
QuestionNice artilclemembermehul198811 Mar '13 - 21:20 
Nice artilcle
GeneralMy vote of 5membergallysundar11 Mar '13 - 19:28 
Thanks for the clear examples..
GeneralMy vote of 5memberDavid Pierson10 Mar '13 - 19:49 
Thank you! This kind of came up at work a while back and I did not really understand it.
 
Great examples - very clear.
GeneralMy vote of 5memberAlihan Alkaya7 Mar '13 - 21:05 
Very clear. Thank you.
GeneralMy vote of 5memberupendra kumar pandey7 Mar '13 - 17:59 
its very helpfull
GeneralMy Vote is 5.memberJayanta Chatterjee6 Mar '13 - 3:29 
This Article clear the concept of left Join with "and clause" and "on clause"...
Thank You...
QuestionMy Vote is 2memberP.Venkatachalam4 Mar '13 - 17:49 
Good Articles, but if you add how the data will be compared internally with query execution time then then the article is superb.. Thanks
GeneralMy vote of 3memberdojohansen25 Feb '13 - 22:38 
Outer join is not an "advanced" concept, your DBA is incompetent. I think it's strange to speak only of the left join and wonder when the right and full join might be discovered - never mind the cross. Smile | :)
 
That said, the presentation is very nice and orderly, and the fact several people rated it highly suggests this needed explaining. If not for this, a 2 would have been in store from me.
GeneralRe: My vote of 3memberKP Lee21 Mar '13 - 22:17 
dojohansen wrote:
your DBA is incompetent
I agree, unfortunately, it is also all too common. I'm a DB developer and I've been thrown into the DBA role. I've been asked why log shipping is so slow. To myself, I'm going "I have no idea. Maybe some kind of interference?", and look at it anyway. "Well, it might help if you don't post your logs on the same drive the DB is using." And I'm a genius dba, when I really was lucky to spot that so fast.
GeneralMy vote of 2memberOrvindo23 Feb '13 - 13:01 
This was very basic. Check http://stackoverflow.com
GeneralMy vote of 5memberMember 969271919 Feb '13 - 10:16 
It was never clear to me until now and I've been doing it for years too. And great examples.
GeneralMy vote of 5memberDileep Mada15 Feb '13 - 1:47 
Very Good Article
QuestionDifference between On clause and Where clause when used with left join in SQL Servermemberdarshansh14 Feb '13 - 19:35 
Good post.
 
This is what the main difference between 'And' and 'where' usgae in left/Right join.
QuestionThanks, Vote 5memberZaw Myo Tun14 Feb '13 - 16:02 
I had been wondering this different for quite long, but I did not make an analysis like you do. Now I am clearly understand. Thank you so much.

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web02 | 2.6.130523.1 | Last Updated 12 Mar 2013
Article Copyright 2011 by Hafiz Muhammad Suleman
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid