People ask me a lot about subquery so I thought let’s answer with a article touching a bit of everything about subquery.
Before discussing SubQuery, there are few things which we should know like:
What is a Query?
I will go with its English Definition first. As a verb, “Put a question or questions to”.
Technical definition w.r.t. SQL “These are the commands issued to a database for retrieval of required information.” If we see closely,both the definitions complement each other. For example:
select all data the in employees table.
select * from employees
Lets put a condition like- select all employees with salary>5000
select * from employee where salary>5000
What is a SubQuery?
SubQuery can be treated as a ‘query on query’. A subquery is the inner query which provides a targeted result to the outer main query . We can try few examples to learn it
select employee name with its manager name
select emp.name,(select mgr.name from employee AS mgr
where emp.mgrid=mgr.empid) from employee AS emp
We mostly see subqueries in where clause like – select employees having average salary
select * from employee where salary=(select AVG(salary) from employee)
What is Correlated SubQuery?
A correlated sub-query is a sub-query that uses values from the outer query in its WHERE clause.Let’s try with an example
select employees having salary greater than average salary of employees of department ‘IT’
select * from employee where salary=(select AVG(salary) from employee
The main difference is that the subquery will be executed for each row before the result can be used by outer query.
Why do we require SubQuery or advantages of SubQuery?
SubQuery holds the results like a temporary table which can be used by outer query.
SubQuery are easier to understand
SubQuery breaks down a complex query into small and simple queries.
SubQuery are easy to use as a replacement of joins.There is no major difference in performance.
A subquery is subject to the following restrictions:
- Up to 32 levels of nesting is possible, although the limit varies based on available memory and the complexity of other expressions in the query
- If a table appears only in a subquery and not in the outer query, then columns from that table cannot be included in the output
- The select list of a subquery introduced with a comparison operator can include only one expression or column name (except that EXISTS and IN operate on SELECT * or a list, respectively).
- If the WHERE clause of an outer query includes a column name, it must be join-compatible with the column in the subquery select list.
- The ntext, text, and image data types cannot be used in the select list of subqueries.
- Because they must return a single value, subqueries introduced by an unmodified comparison operator (one not followed by the keyword ANY or ALL) cannot include GROUP BY and HAVING clauses.
- The DISTINCT keyword cannot be used with subqueries that include GROUP BY.
- The COMPUTE and INTO clauses cannot be specified.
- ORDER BY can only be specified when TOP is also specified.
- A view created by using a subquery cannot be updated.
- The select list of a subquery introduced with EXISTS, by convention, has an asterisk (*) instead of a single column name. The rules for a subquery introduced with EXISTS are the same as those for a standard select list, because a subquery introduced with EXISTS creates an existence test and returns TRUE or FALSE, instead of data.
Join Vs SubQueries
I was looking for this answer and though it’s not a verified answer but yes,it’s true in most cases. refer this :http://stackoverflow.com/questions/2577174/join-vs-subquery
In most cases JOINs are faster than sub-queries and it is very rare for a sub-query to be faster.
In JOINs RDBMS can create an execution plan that is better for your query and can predict what data should be loaded to be processed and save time, unlike the sub-query where it will run all the queries and load all their data to do the processing.
The good thing in sub-queries is that they are more readable than JOINs: that’s why most new SQL people prefer them; it is the easy way; but when it comes to performance, JOINS are better in most cases even though they are not hard to read too.
I am not building something new but assembled all the questions and answers related to subquery which keeps on bugging me day and night. I hope this will be useful for people looking for answers at one place.please mail me at firstname.lastname@example.org