Click here to Skip to main content
14,266,079 members

How do the SQL DISTINCT and TOP SELECT Clauses Work Together to Generate Results?

Rate this:
5.00 (1 vote)
Please Sign up or sign in to vote.
5.00 (1 vote)
17 Jun 2015MIT
How do the SQL DISTINCT and TOP SELECT Clauses Work Together to Generate Results?

This article is inspired by a series of questions that one of my readers, Nan, recently sent me regarding DISTINCT, TOP, and ORDER BY.

All the examples for this lesson are based on Microsoft SQL Server Management Studio and the AdventureWorks2012 database. You can get started using these free tools using my Guide Getting Started Using SQL Server.

How does the SQL Top and Distinct SELECT modifiers work together to produce results?

Nan’s Original Question

Here is the question that Nan originally sent me:

I’m a bit confused about SELECT DISTINCT and SELECT. For example,

SELECT   DISTINCT TOP 10 FirstName,
                         LastName
FROM     Person.Person
ORDER BY LastName

Is this looking at distinct first names? Distinct combined first and last names? How do we distinguish between the columns used for the distinct evaluation and columns we just want to show in the output?

What about:

Select Distinct TOP 10 LastName,
       FirstName + ' ' + LastName AS FullName
FROM   Person.Person ORDER BY LastName

I thought everyone would like to know the answer, so I created a blog post.

DISTINCT and TOP – Which is First?

Let’s look at the first statement whose purpose is to return a unique list of first and last names.

SELECT   DISTINCT TOP 10 FirstName,
                         LastName
FROM     Person.Person
ORDER BY LastName; 

TOP 10 will return the first ten items from the ordered set, and DISTINCT will remove any duplicates. The question is which happens first?

  • Is the table sorted by LastName and the top ten items taken, and then duplicate name removed?
  • Or are the duplicates removed, and then the items sorted and the top ten items displayed?

Before we answer this question, keep in mind that DISTINCT operates on all column and expressions in the SELECT clause. So in this case, the statement will return distinct rows for FirstName and LastName.

Unfortunately, there is no direct way to use DISTINCT on one set of fields and display others. Once you add columns to the SELECT statement, they become under the influence of the DISTINCT operator. I say direct, as you could get a distinct list, and then use a INNER JOIN to pull in other columns. There are dangers to doing that though, as the join may reintroduce duplicates.

Adding a TOP clause to DISTINCT is interesting. I wasn’t sure what would happen, but I did some experimenting with the AdventureWorks database and found that the order of processing goes something like so:

  1. Select DISTINCT Values from Table and order
  2. Select the TOP x rows from the results in step 1 and display

If you want to try this yourself, start with:

SELECT   FirstName,
         LastName
FROM     Person.Person
ORDER BY LastName

And notice the results. Keep track of “Kim Ambercombie.” Notice how there are three entries for her name.

Results sorted by LastName

Now run:

SELECT   DISTINCT FirstName,
                  LastName
FROM     Person.Person
ORDER BY LastName

And you’ll see that “Kim Ambercombine” is shown only once.

Unique list ordered by LastName

Then run:

SELECT   DISTINCT TOP 10 FirstName,
                         LastName
FROM     Person.Person
ORDER BY LastName

And you’ll see it returns first 10 unique first and last names as sorted by LastName.

First 10 unique rows ordered by LastName

If you’re wondering which happens first, the DISTINCT or TOP 10 operations, then compare the results from the last two queries.

Notice that the query “DISTINCT TOP 10” includes the first 10 rows from the query from the “DISTINCT” query.

From this, we know a DISTINCT list is first created, and then the TOP 10 items returned.

Query plan showing order of execution

You can also confirm this by showing the query plan. To do so, select Query -> Include Actual Query Plan from the menu before executing the query.

The “Stream Aggregate” icon is for the DISTINCT operation and “Top” for the TOP 10 one.

It may seem somewhat counterintuitive to see DISTINCT listed first within the SELECT statement. Just keep in mind SQL isn’t necessarily processed in the order a human would read it from left to right.

DISTINCT and TOP with SELECT List Expressions

The second portion of Nan’s question related to how expressions are treated with the DISTINCT operator.

Expressions are treated the same as column regarding DISTINCT and TOP. Let’s start with a select statement to get the first name as well as the full, which we create by appending LastName to FirstName.

Also, keep in mind, when using ORDER BY, that the ORDER BY items must appear in the select list when using Distinct. Given this, I have to modify the statement presented in the original question:

SELECT   DISTINCT FirstName,
         FirstName + ' ' + LastName AS FullName
FROM     Person.Person
ORDER BY LastName

Won’t run since LastName isn’t in the SELECT list. Yes, it is part of an expression in the select list, but it's not there on its own. It is valid to order by FullName.

We’ll use this ordering in the examples below.

The statement:

SELECT   FirstName,
         FirstName + ' ' + LastName AS FullName
FROM     Person.Person
ORDER BY FirstName + ' ' + LastName

Returns 19972 rows. When we add Distinct.

SELECT   DISTINCT FirstName,
         FirstName + ' ' + LastName AS FullName
FROM     Person.Person
ORDER BY FirstName + ' ' + LastName

Then 19516 rows are returned. Finally adding Top 10, returns the first 10 distinct name combinations.

SELECT   DISTINCT TOP 10 FirstName,
         FirstName + ' ' + LastName AS FullName
FROM     Person.Person
ORDER BY FirstName + ' ' + LastName

Try running these queries on the AdventureWorks database and you see for yourself the behavior is the same as we find when working exclusively with columns.

License

This article, along with any associated source code and files, is licensed under The MIT License

Share

About the Author

essentialSQL
Easy Computer Academy, LLC
United States United States
Hello my name is Kris. I’m here because I am passionate about helping non-techie people to overcome their fear of learning SQL.

I know what it is like to not know where to start or whether the time spent learning is worth the effort. That is why I am here to help you to:
- Get started in an easy to follow step-by-step manner.
- Use your time wisely so you focus on what is important to learn to get the most value from your time.
- Answer your questions. Really! Just post a comment and I’ll respond. I’m here to help.

It wasn’t long ago that I was helping a colleague with some reporting. She didn’t know where to start and soon got overwhelmed and lost as she didn’t know SQL.

I felt really bad, as she was under pressure to get some summary information to her boss, the built-in reports were falling short, and to make them better would require her to know SQL. At that time that seemed impossible! It in dawned on me, it doesn’t have to be that way.

Then I discovered a way for anyone with the desire to easily learn SQL. I worked with my co-worker, started to teach her what I learned and soon she was able to write reports and answer her boss’ questions without getting stressed or ploughing hours into manipulating data in Excel.

It hasn’t always been easy. Sometimes the information seems abstract or too conceptual. In this case I’ve found out that a visual explanation is best. I really like to use diagrams or videos to explain hard-to-grasp ideas.

Having video, pictures, and text really help to reinforce the point and enable learning.

And now I want to help you get the same results.

The first step is simple, click here ==> http://www.essentialsql.com/get-started-with-sql-server/

Comments and Discussions

 
AnswerThanks for sharing! Pin
Liju Sankar17-Jul-15 6:24
professionalLiju Sankar17-Jul-15 6:24 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Technical Blog
Posted 17 Jun 2015

Tagged as

Stats

27.5K views
6 bookmarked