Click here to Skip to main content
Click here to Skip to main content

SQL Wizardry Part One - Joins

By , 23 Jan 2014
Rate this:
Please Sign up or sign in to vote.

Introduction

Welcome to this, my first article in many years. I am intending on doing a series of articles on SQL, one a week. I know I am probably not writing anything that’s not already on the site, but my goal is to provide a series of articles that cover things in an organised way, almost like a book that people can work through. I suspect a lot of developers work with databases, and know a little bit of SQL, but really don’t understand how to really get the most out of it. Hopefully I can help change that.

Background

This first article is going to start with basics. I assume you know how to write a ‘where’ clause, and I’m going to start by talking about the different ways of connecting tables to select data between them. If you didn’t download the code for this article, please do so now. It contains all the SQL you’ll want to run in Management Studio while reading through this. Download it now. I’ll wait…. OK, back ? The first thing you’ll see is code that looks like this:
with listall(id) as

(

  select 1 union all

  select 2 union all

  select 3 union all

  select 4 union all

  select 5 union all

  select 6 union all

  select 7 union all

  select 8 union all

  select 9 union all

  select 10

),

listodd(id) as

(

  select 1 union all

  select 3 union all

  select 5 union all

  select 7 union all

  select 9 union all

  select 11

),

listeven(id) as

(

  select 2 union all

  select 4 union all

  select 6 union all

  select 8 union all

  select 10

)
Don’t worry if you have no idea what this is ( these are Common Table Expressions, the subject of the second article ). Suffice it to say, this creates three in memory tables, one of the sequence 1-10, one of only odd numbers to 11 and one of only even, to 10.

INNER JOIN

I assume that everyone knows what this does:
select * from listall
It returns the contents of the listall table ( 1-10). Now, what does this do:
select * from listall la inner join listodd lo on lo.id = la.id
It returns the odd sequence to 9, because an inner join will return only rowsets where the two tables can be connected. This is useful where you have, for example, an order table with a client id, and a client table where that id gives you all the client’s details. Now, what does this do ?
select * from listeven le inner join listodd lo on lo.id = le.id
It returns nothing, because the odd and even rows have no rows in common.

OUTER JOIN

Now, what does this do ?
select * from listall lo left outer join listeven le on lo.id = le.id
It returns all our rows from the table on the LEFT of the select, even if the join does not match. Where the join does not match, a null is returned. This can be useful where you are using the join to work out if something exists, you can use a case statement to set a value according to if a match was found, like so:
select lo.id, isEven = case when le.id is null then 0 else 1 end from listall lo left outer join listeven le on lo.id = le.id
The other thing you can do is use COALESCE or ISNULL to set a default value, like so:
select lo.id, isnull(le.id, -1) from listall la left outer join listeven le on la.id = le.id
As you can see, you can do a lot more than just return raw data, in SQL. Now, what does this do:
select * from listeven le left outer join listall la on la.id = le.id
It returns only the even numbers. The join is on the LEFT, and the left most table only contains even numbers. To get the same result, if you need this table order, you do this:
select * from listeven le right outer join listall la on la.id = le.id
If you need nulls from both sides, you can do this:
select * from listeven le full outer join listodd lo on lo.id = le.id
Note if you try to order this using orderby, you will find that nulls get ordered first. To get this in order again, you’d do this:
select coalesce(le.id, lo.id) as idList, le.id, lo.id from listeven le full outer join listodd lo on lo.id = le.id order by idlist
As you may be aware, coalesce, like isnull, will return the second value if the first one is null. The difference is that coalesce ONLY does this, and ISNULL can evaluate an expression as the first argument. One thing to add, the word ‘OUTER’ is optional, you can just specify ‘left join’, ‘right join’ or ‘full join’.

CROSS JOIN

This is probably the one people know the least. What do you expect this to do ?
select * from listeven le cross join listodd lo
If you run it, you will get 30 results. There are 5 numbers in listeven and 6 in listodd. 5 times 6 is 30 results. Cross join returns every row in table 1, combined with every row in table 2. This is a very powerful technique, when used correctly. There will likely be an article down the track on cool things you can do with cross joins, and I am certain that cross joins will feature in next week’s article. As you can see, because it just returns EVERY possible combination, there is no column to join on, in fact, if the two tables have columns in common, both values are returned, and you will have two columns with the same names, unless you alias them like this:
select le.id, lo.id as id2 from listeven le cross join listodd lo
You are not obliged to return ALL the data cross joined, you can specify columns and alias them, just like any other select. A short hand for a cross join, is a comma, as in:
select * from listeven, listodd 

However, while you may see this used in online examples, it has not been 'valid' SQL for over 20 years, and no database implementation is required to support it, so you should not use it in your own code.

INTERSECT 

Although they are not really joins, I am going to cover two other things here. The first is ‘intersect’. It works like this:
select * from listall

intersect

select * from listodd
In this case, listodd has one value not in listall ( 11), so the result is the contents of listodd, minus the 11. Sometimes you will have two result sets, and you will want to know what values are in common between them. INTERSECT allows you to do this. Of course, this can be done by intersecting two select statements that contain as many joins and other operations as you wish. Select * is acceptable syntax, but it’s required that both tables have the same column names and data types. Be careful, sometimes you will only care about an intersection of one column ( usually a primary key ), in which case, it probably makes sense to have CTEs that return the primary key values only, and one that returns all the data, so you can end up with code like this ( this one is not in the file, as it won’t run )
select * from myClientList where clientid in ( select clientid from list1 intersect select clientid from list2)
This would of course be at the bottom of statements defining three possibly complex CTEs. Hopefully you can see how CTEs increase readability and maintainability of SQL code.

EXCEPT

I recently had the task of transforming a large data set from one database/format to another. We had a ton of complex business rules, so I ended up with a view that grabbed all the data from the old DB, a lot of views to process different data types, then a view at the top to aggregate all those values, and turn them in to one list again. It was very complex, but at the end, we wanted to make sure we had not missed any records. This is exactly the sort of situation where the ‘EXCEPT’ operation is invaluable.
select * from listall

except

select * from listodd
This returns everything in the first table that is not in the second ( but NOT vice versa ). So, while the number ‘11’ is in listodd and not in listall, it is not returned by this operation. In my case, this returned all of the documents in my base list, that were not already in the top level list, and which had therefore not fulfilled any business rules and would otherwise have been lost. Given the nature of our data transformations, my query was a little more complex ( it was a union of three selects ), and I had to do an EXCEPT on my primary key only, then select all my data on the basis of that operation. If ANY row differs, the data will be returned as being different, so if you want to find only what records are not there ( regardless of if they have been transformed in any way ), then an operation on just the key is what you need.

UNION and UNION ALL

I nearly forgot to cover these. What if, given our data set, we wanted a sequence from 1 to 11 ? This will do that:
select * from listall
union
select * from listodd
order by id
It returns the combination of values from both tables, removing duplicates. If you want to see all values, including duplicates, do this:
select * from listall
union all
select * from listodd
order by id

I have seen some truly awful SQL code based on unions. Remember, union means, run these separate SQL queries, then merge them. If you can instead create one query, it’s bound to be faster ( but UNION itself is VERY fast, what I mean is, if you run two queries that are complex, you pay the cost of both queries, instead of just running one ). However, UNION was invaluable in the task I was describing earlier, because I had a base view, and different views for different business rules, which then needed to be pulled together again. It WAS slow, but it was the best possible tool for the task at hand.

Several people suggested I add a visual representation of the different join types. There is already an excellent article on CP that does this, you can see it here. I see no reason to duplicate it, when it is better for people seeking more information to read that article, as well as mine. I will add, the 'excluding joins' the author shows are perfectly valid, for the purposes of the techniques I've shown, I'd use the 'EXCEPT' keyword to get the same result. I am not sure which is faster.

As you can see, there’s a lot more to SQL than just the select statement and inner joins. Next week I’ll set out to explain how to get the most out of CTEs. If you have any thing you’d like me to specifically cover, or even any general SQL questions, please ask. I promise I’ll answer every question, even if I sometimes have to say that I’m not sure ( but I’ll try to find out ).

 

Part two ( which covers SELECT statements ) can now be read here.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Christian Graus
Software Developer (Senior)
Australia Australia
Programming computers ( self taught ) since about 1984 when I bought my first Apple ][. Was working on a GUI library to interface Win32 to Python, and writing graphics filters in my spare time, and then building n-tiered apps using asp, atl and asp.net in my job at Dytech. After 4 years there, I've started working from home, at first for Code Project and now for a vet telemedicine company. I owned part of a company that sells client education software in the vet market, but we sold that and I worked for the owners for five years before leaving to get away from the travel, and spend more time with my family. I now work for a company here in Hobart, doing all sorts of Microsoft based stuff in C++ and C#, with a lot of T-SQL in the mix.

Comments and Discussions

 
QuestionConfused about INTERSECT and EXCEPT PinmemberJennifer McCarthy17-Mar-14 7:13 
AnswerRe: Confused about INTERSECT and EXCEPT PinprotectorChristian Graus17-Mar-14 9:12 
QuestionA change in your wording PinmemberTbone Soprano27-Jan-14 5:46 
Questionbrilliant! PinmemberMember 1012397224-Jan-14 7:06 
GeneralA word about UNION PinmemberMalte Klena23-Jan-14 1:53 
GeneralRe: A word about UNION PinprotectorChristian Graus23-Jan-14 7:36 
Questionvery nice PinprofessionalBillW3316-Jan-14 10:40 
AnswerRe: very nice PinprotectorChristian Graus16-Jan-14 10:49 
QuestionBasic sql PinmemberDr Bob15-Jan-14 1:58 
QuestionNice reading Pinmemberkmoorevs9-Jan-14 12:02 
AnswerRe: Nice reading PinprotectorChristian Graus16-Jan-14 10:49 
QuestionPedantic Point Pinprofessional_Maxxx_7-Jan-14 15:41 
AnswerRe: Pedantic Point PinprotectorChristian Graus13-Jan-14 10:50 
GeneralRe: Pedantic Point Pinprofessional_Maxxx_13-Jan-14 13:11 
GeneralRe: Pedantic Point PinprotectorChristian Graus13-Jan-14 13:34 
GeneralRe: Pedantic Point PinprofessionalJörgen Andersson13-Jan-14 21:04 
GeneralRe: Pedantic Point PinprotectorChristian Graus16-Jan-14 11:40 
GeneralRe: Pedantic Point PinprofessionalJörgen Andersson16-Jan-14 20:29 
GeneralRe: Pedantic Point PinmemberPIEBALDconsult18-Jan-14 2:41 
AnswerRe: Pedantic Point Pinmemberdevenv.exe14-Jan-14 21:20 
Questionsmall bug in code PinmemberMember 101302647-Jan-14 7:01 
AnswerRe: small bug in code PinprotectorChristian Graus13-Jan-14 10:51 
SuggestionThis shouldn't be a recommendation .... PinmemberMarc Scheuner5-Jan-14 23:14 
GeneralRe: This shouldn't be a recommendation .... PinprotectorChristian Graus6-Jan-14 10:58 
GeneralThoughts PinprofessionalPIEBALDconsult3-Jan-14 17:27 
"the task I was describing area"
 
Do you mean "earlier"?
 

I'd like to see information on which parts are standard SQL and which parts are T-SQL-specific.
 

And regarding UNION ALL -- I usually use UNION ALL when I know that there are no duplicates. I have no numbers to back it up, but I feel that it saves the system from wasting its time looking for duplicates.
GeneralRe: Thoughts PinprotectorChristian Graus3-Jan-14 18:06 
GeneralRe: Thoughts PinprofessionalJörgen Andersson13-Jan-14 21:32 
GeneralRe: Thoughts PinprotectorChristian Graus16-Jan-14 11:38 
GeneralGreat article thank you! PinmemberJason Gayda2-Jan-14 10:37 
QuestionGreat article PinmvpCPallini2-Jan-14 10:37 
AnswerRe: Great article PinmemberChristian Graus2-Jan-14 10:46 
GeneralRe: Great article PinmvpCPallini2-Jan-14 20:48 
QuestionThanks, Good work PinprofessionalKen of Kendoria23-Dec-13 13:38 
AnswerRe: Thanks, Good work PinmvpChristian Graus23-Dec-13 13:39 
QuestionNice Article PinprofessionalSimon_Whale22-Dec-13 22:30 
AnswerRe: Nice Article PinmvpChristian Graus22-Dec-13 22:37 
GeneralRe: Nice Article PinprofessionalSimon_Whale22-Dec-13 22:39 
GeneralRe: Nice Article PinmvpChristian Graus22-Dec-13 22:40 
GeneralRe: Nice Article PinprofessionalSimon_Whale22-Dec-13 23:05 
GeneralRe: Nice Article PinmvpChristian Graus22-Dec-13 23:06 
SuggestionSuggestion PinmvpMaciej Los22-Dec-13 11:41 
GeneralRe: Suggestion PinmvpChristian Graus22-Dec-13 11:45 
GeneralRe: Suggestion PinmvpMaciej Los22-Dec-13 20:06 
GeneralRe: Suggestion PinmvpChristian Graus23-Dec-13 11:01 
GeneralRe: Suggestion PinmvpMaciej Los23-Dec-13 11:39 
GeneralRe: Suggestion PinmvpChristian Graus23-Dec-13 11:41 
GeneralRe: Suggestion PinmvpMaciej Los23-Dec-13 11:47 
QuestionAnother Suggestion... Pinmemberclawton21-Dec-13 18:01 
AnswerRe: Another Suggestion... PinmvpChristian Graus21-Dec-13 20:46 
QuestionSuggestion... PinmemberSteve Wellens21-Dec-13 16:05 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web04 | 2.8.140421.2 | Last Updated 23 Jan 2014
Article Copyright 2013 by Christian Graus
Everything else Copyright © CodeProject, 1999-2014
Terms of Use
Layout: fixed | fluid