Click here to Skip to main content
15,867,686 members
Articles / Database Development / SQL Server
Tip/Trick

Use SQL Server's Ranking Functions Wisely

Rate me:
Please Sign up or sign in to vote.
4.88/5 (5 votes)
3 May 2013CPOL3 min read 40.4K   5   4
Beware of What We Ask Them to Do

Introduction

SQL Server's Ranking functions can be very useful and handy if we use them in a right way and thoughtfully; otherwise, they become evil very fast! In this tip I will talk about my experience with NTILE function but the same story can be expanded to all other ranking functions.

Background

  After spending several hours on why a script was generating different results against same sets of data, I discovered the evil. It was a set of unwise NTILE function calls. For those who may not be familiar with this function, please check this MSDN page. All rank functions contain an over clause which tells the function which column(s) to be used as reference.

The Problem

Here is an example for NTILE and how it becomes evil -

Imagine the following table of data (named MyTable):

NameCost
C1
B1
D4
A1
F3
E2

I picked simle names A to F in order to make it easier to follow the scenario.

Now assume we have the following script:

SQL
SELECT [Name], [Cost], NTILE(6) OVER(ORDER BY [Cost]) AS [Rank] FROM [MyTable]

Running the script agains MyTable you will have the following result set:

NameCostRank
C11
B12
A13
E24
F35
D46

As you can see, NTILE does exactly what it is asked to do. Let's run the same script against the same table with slightly different data order:

NameCost
B1
A1
E2
C1
D4
F3

As you can see, the tables contents are the same but only the order of data has changed. After running the very same script, you will get the following result set:

NameCostRank
B11
A12
C13
E24
F35
D46

Now, let's compare the rankings in two result sets side by side:

Result 1Result 2
NameRankRank
A32
B21
C13
D44
E55
F66

 Can you see the evil? Running the same script against the same data sets with different data order results in totally different outputs. NTILE by itself is not an evil but the way we assumed it would work was wrong. In script we are asking it to order the Cost column and then group them into 6 groups. The only order it considers is Cost because that's what we asked for. Who to blame?!

How to fix it?

When we use rank functions we need to be very careful. We need to consider many things and check what we are asking them to do wisely. Rank functions can receive several orders. What we need to do is that we need to make sure all required orderings are in place. That way we can make sure the data order does not affect our results. In this case in particular, one way to fix the discrepancy is to add a new order to our NTILE function:

SELECT [Name], [Cost], NTILE(6) OVER(ORDER BY [Cost], [Name]) AS [Rank] FROM [MyTable]

 We are adding Name column to the order clause. It does not matter if it as ASC or DESC; what matters is that we will get the same results if we run it against any data set with the same data but different order.

NameCostRank
A11
B12
C13
E24
F35
D46

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect
United States United States
I got my BS in Software Engineering from Iran, worked there for 4.5 years mainly in industrial automation field. Then I moved to Australia. In Australia, I had a great chance to work at some big companies. Since 2009 I have been living in the States. I received my MS in Information Systems from Illinois State University. Currently, I am a Senior Software Development Engineer.

Comments and Discussions

 
Question5* Pin
Mas1125-Oct-13 0:20
Mas1125-Oct-13 0:20 
GeneralOverview of Ranking functions Pin
Member 1034286917-Oct-13 9:18
Member 1034286917-Oct-13 9:18 
GeneralRe: Overview of Ranking functions Pin
Tecfield17-Oct-13 9:52
Tecfield17-Oct-13 9:52 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.