Click here to Skip to main content
Click here to Skip to main content

How to use ROW_NUMBER() to enumerate and partition records in SQL Server

By , 31 Dec 2011
 

I had a situation recently where I had a table full of people records, where the people were divided into families. The business logic that needed to be followed was that I had to assign a “Twin Code” to each record. This meant that for each family in the database, if two or more members were born on the same day they should be treated as twins. The twins should be assigned a number enumerating them in order of birth. If the member was not a twin, they should just receive the twin code of 1.

Here’s an example table:

PersonID FamilyID FirstName LastName DateOfBirth
1 1 Joe Johnson 2000-10-23 13:00:00
2 1 Jim Johnson 2001-12-15 05:45:00
3 2 Karly Matthews 2000-05-20 04:00:00
4 2 Kacy Matthews 2000-05-20 04:02:00
5 2 Tom Matthews 2001-09-15 11:52:00

There are lots of ways to achieve the desired result, but the simplest is to just use a simple SELECT statement combined with the ROW_NUMBER() function with a couple parameters as to how to number the rows!

ROW_NUMBER() provides you with the number of a row in a given recordset, where you provide details on how to number the records. For example, if I just had to number the records above based solely upon the date of birth (ignoring families) then I would use this query:

SELECT
     [PersonID]
    ,[FamilyID]
    ,[FirstName]
    ,[LastName]
    ,[DateOfBirth]
    ,ROW_NUMBER() over (ORDER BY DateOfBirth) AS Number
FROM
	People
ORDER BY
	PersonID

This just tells the ROW_NUMBER() function to order its numbering ascending by DateOfBirth. Notice that I apply an order myself later on in the query, which is different than the row_number() order. I would get these results:

PersonID FamilyID FirstName LastName DateOfBirth Number
1 1 Joe Johnson 2000-10-23 13:00:00 3
2 1 Jim Johnson 2001-12-15 05:45:00 5
3 2 Karly Matthews 2000-05-20 04:00:00 1
4 2 Kacy Matthews 2000-05-20 04:02:00 2
5 2 Tom Matthews 2001-09-15 11:52:00 4

The number field that is assigned to each record is in the order of DateOfBirth.

Ordering my numbering by DateOfBirth is just half of the picture. I also need to “group” the records by the FamilyID. This is where a clause in T-, SQL that you might not be very familiar with comes into play: “PARTITION BY”. The PARTITION BY clause allows us to group the results within the call to ROW_NUMBER() without grouping them ourselves via a GROUP BY. It just tells the ROW_NUMBERR what groupings to use when it does its counting.

Here is our final SQL statement, which achieves the business logic we wanted to implement.

SELECT
       [PersonID]
     [FamilyID]
      ,[FirstName]
      ,[LastName]
      ,[DateOfBirth]
      ,ROW_NUMBER() over(PARTITION BY FamilyID,
                         CONVERT(NVARCHAR(25), DateOfBirth, 111)
                         ORDER BY DateOfBirth ASC) TwinCode

  FROM [People]
ORDER BY	PersonID

IIn the ROW_NUMBER function above, I am doing several things. I’m grouping on FamilyID, and also grouping on a converted DateOfBirth. I convert the DateOfBirth to an nvarchar using the 111 conversion code, because that gets results like ‘2009/10/11′ and ‘2009/10/12′ which can easily be grouped by to achieve distinct dates.

Grouping on the Family, DateOfBirth, and then sorting by DateOfBirth ascending achieves the desired result for the ROW_NUMBER. Here are the results of the query:

PersonID FamilyID FirstName LastName DateOfBirth TwinCode
1 1 Joe Johnson 2000-10-23 13:00:00 1
2 1 Jim Johnson 2001-12-15 05:45:00 1
3 2 Karly Matthews 2000-05-20 04:00:00 1
4 2 Kacy Matthews 2000-05-20 04:02:00 2
5 2 Tom Matthews 2001-09-15 11:52:00 1

As you can see, the two people who qualify as twins above (Karly and Kacy) are enumerated correctly, with Karly receiving a 1 and Kacy receiving a 2. All the records that are not twins properly receive a 1.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Dave C Andrews
Software Developer (Senior) LBMC Technologies
United States United States
Member
I've been developing software since I was 9 years old, in BASIC on a Precomputer 2000. OK, that's a stretch, but hopefully some of the stuff that I learned from that little toy have helped my professional career.
 
I have a bachelors degree in computer science from MTSU. Currently, I am employed as a senior software developer for LBMC in Nashville, TN. We serve as consultant developers for clients across the southeastern United States, developing mostly in Microsoft .Net with Microsoft SQL Server.
 
I greatly enjoy my job and my hobbies lie in open source software and in teaching others how to be better programmers. I have a beautiful wife, 2 dogs (a dachshund and an olde-english bulldog) and a cat.

Comments and Discussions

Comment 10 messages have been posted for this article Visit http://www.codeproject.com/Articles/308281/How-to-Use-ROW_NUMBER-to-Enumerate-and-Partition-R to post and view comments on this article, or click here to get a print view with messages.

Permalink | Advertise | Privacy | Mobile
Web03 | 2.6.130513.1 | Last Updated 31 Dec 2011
Article Copyright 2011 by Dave C Andrews
Everything else Copyright © CodeProject, 1999-2013
Terms of Use