Click here to Skip to main content
15,884,298 members
Articles / Database Development

Rewrite Subquery as a Join

Rate me:
Please Sign up or sign in to vote.
4.00/5 (1 vote)
7 Mar 2022CPOL3 min read 3.4K   3  
How to rewrite a subquery as a join
This post discusses several types of subqueries, and how each can be switched to another form, such as a join.

In this puzzle, we’re going to learn how to rewrite a subquery using inner joins. Knowing about a subquery versus inner join can help you with interview questions and performance issues. Though subqueries have unique abilities, there are times when it is better to use other SQL constructs such as joins.

By reading this article, you’ll learn about several types of subqueries, and how each can be switched to another form, such as a join.

Solving puzzles is a great way to learn SQL. Nothing beats practicing what you’ve learned. Once you have figured out the puzzle, post your answer in the comments so we all can learn from one another.

SQL Puzzle Question

A remedy for query confusion…

A coworker just learned about subqueries and wrote some SQL to retrieve employee names and birthdates from the AdventureWorks database. The problem is, they want to change it and now it’s hard to read!

Can you help them simplify the following SQL?

SQL
SELECT   E.HireDate,
         (SELECT FirstName
          FROM   Person.Person P1
          WHERE  P1.BusinessEntityID = E.BusinessEntityID),
         (SELECT LastName
          FROM   Person.Person P2
          WHERE  P2.BusinessEntityID = E.BusinessEntityID),
         E.BirthDate
FROM     HumanResources.Employee E
WHERE    (SELECT PersonType
          FROM   Person.Person T
          WHERE  T.BusinessEntityID = E.BusinessEntityID) = 'EM'
ORDER BY HireDate,
         (SELECT FirstName
          FROM   Person.Person P1
          WHERE  P1.BusinessEntityID = E.BusinessEntityID)

What statement would you write to make it easier to read, and perhaps run more efficiently?

Subquery versus Inner Join Answer

Before we begin, let’s talk about the existing query… what is it?

You’ll see the query combines data from two different tables. It is doing so using subqueries in both the FROM and WHERE clauses.

SQL
SELECT   E.HireDate,
         (SELECT FirstName
          FROM   Person.Person P1
          WHERE  P1.BusinessEntityID = E.BusinessEntityID)</span>,
         (SELECT LastName
          FROM   Person.Person P2
          WHERE  P2.BusinessEntityID = E.BusinessEntityID)</span>,
         E.BirthDate
FROM     HumanResources.Employee E
WHERE    (SELECT PersonType
          FROM   Person.Person T
          WHERE  T.BusinessEntityID = E.BusinessEntityID)</span> = 'EM'
ORDER BY HireDate,
         (SELECT FirstName
          FROM   Person.Person P1
          WHERE  P1.BusinessEntityID = E.BusinessEntityID)</span>

Also, you see that each subquery’s WHERE clause is restricting the rows returned to those equal to the Employee.BusinessEntityID. This is what people call a correlated subquery.

I also want to point out that the queries in the FROM must return a single value (scalar). If they don’t, then an error is thrown.

As you can imagine, this is dangerous, as it can be difficult to guarantee a query return at most one row. I know were safe in this case I know this as the matching condition occurs between each table’s primary keys.

Subquery versus Inner Join – Converting the Query

If I was writing this query, I would use an INNER JOIN. Here is the query I would write:

SQL
SELECT   E.HireDate,
         P.FirstName,
         P.LastName,
         E.BirthDate
FROM     HumanResources.Employee E
         INNER JOIN Person.Person P
         ON P.BusinessEntityID = E.BusinessEntityID
WHERE    P.PersonType  = 'EM'
ORDER BY E.HireDate,
         P.FirstName

Which is Easier to Read?

This shouldn’t be much of a debate, the INNER JOIN is much shorter, and I think to the point. The join clause speaks for itself. You know it is relating two tables together; whereas, with the subquery, it is so apparent.

The INNER JOIN version is easier to maintain.

Also, with the subquery method, you’ll see much of the code repeated. This may not seem like a big deal, now, but if you ever have to change the query, it is, as now when you make a change you need to be sure to make the same change in several locations.

SubQuery or Inner Join? Which is More Efficient?

Here is the query plan for the sub query version:

Subquery versus Inner Join Query Plan

I’ve highlighted the effect of the four subqueries. For this statement, each query results in a nested loop. These aren’t good.

It means if you have ten rows in two tables each, then you need to on average, iterate through the second table 50 times (100/2) for each row in the first to find a match. This means, instead of a seek taking two or three operations to find a match, the match could take upward of 100 * 50 = 500 seeks to find.

Nested loops are a fact of life, but less is better.

And here is the version for the INNER JOIN:

SQL
SET SHOWPLAN_ALL ON
SELECT   E.HireDate,
         P.FirstName,
         P.LastName,
         E.BirthDate
FROM     HumanResources.Employee E
         INNER JOIN Person.Person P
         ON P.BusinessEntityID = E.BusinessEntityID
WHERE    P.PersonType  = 'EM'
ORDER BY E.HireDate,
         P.FirstName

Image 2

Here, you see there is only one nested loop. For sure, that is better than four.

After observing both the SQL and query plans for each set of statements, you can see that INNER JOIN is superior in several ways; however, check out that simplified plan!

The query’s true task is to combine columns from two tables; this is what INNER JOINS excel at. Sure, there are time subqueries make sense, and can be used to do things you can’t with joins, but in this case, it doesn’t make sense to use one.

This article was originally posted at https://www.essentialsql.com/rewrite-subquery-as-join

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Easy Computer Academy, LLC
United States United States
Hello my name is Kris. I’m here because I am passionate about helping non-techie people to overcome their fear of learning SQL.

I know what it is like to not know where to start or whether the time spent learning is worth the effort. That is why I am here to help you to:
- Get started in an easy to follow step-by-step manner.
- Use your time wisely so you focus on what is important to learn to get the most value from your time.
- Answer your questions. Really! Just post a comment and I’ll respond. I’m here to help.

It wasn’t long ago that I was helping a colleague with some reporting. She didn’t know where to start and soon got overwhelmed and lost as she didn’t know SQL.

I felt really bad, as she was under pressure to get some summary information to her boss, the built-in reports were falling short, and to make them better would require her to know SQL. At that time that seemed impossible! It in dawned on me, it doesn’t have to be that way.

Then I discovered a way for anyone with the desire to easily learn SQL. I worked with my co-worker, started to teach her what I learned and soon she was able to write reports and answer her boss’ questions without getting stressed or ploughing hours into manipulating data in Excel.

It hasn’t always been easy. Sometimes the information seems abstract or too conceptual. In this case I’ve found out that a visual explanation is best. I really like to use diagrams or videos to explain hard-to-grasp ideas.

Having video, pictures, and text really help to reinforce the point and enable learning.

And now I want to help you get the same results.

The first step is simple, click here http://www.essentialsql.com/get-started-with-sql-server/

Comments and Discussions

 
-- There are no messages in this forum --