Click here to Skip to main content
15,892,059 members
Articles / Programming Languages / C#

Controlling Collation Order On DB2/400

Rate me:
Please Sign up or sign in to vote.
0.00/5 (No votes)
13 Apr 2010CPOL4 min read 17.5K   5  
How to control collation order on DB2/400

I recently had the need to control collation order on DB2/400. Essentially, I had a multi-column data grid on a web page and one of the columns contained data that was alphanumeric. The data in the data grid was ordered on this column. By default, on DB2/400, the collation order for sorting the table I was querying is determined by the hexadecimal EBCDIC value. This caused letters to come before numbers like this:

C
D
1
9

I had a request to change the order so that numbers came before letters so that this list would instead look like this:

1
9
C
D

It turns out that with DB2/400, this was quite simple. But first, here are some search terms to help Google find this article for you.

DB2/400 Collation Order
DB2/400 Collation Algorithm
DB2/400 Sort Sequence Table
IBMDA400 Collation Order
IBMDA400 Collation Algorithm
IBMDA400 Sort Sequence Table
.NET Native Provider Collation Order
.NET Native Provider Collation Algorithm
.NET Native Provider Sort Sequence Table
iSeries Collation Order
iSeries Collation Algorithm
iSeries Sort Sequence Table

The key to solving this problem on DB2 is using a sort sequence table. It turns out that there are some already available. By default, I was getting the HEX sort sequence table. This causes the order to be determined by each character's EBCDIC value. There are two other builtin sort sequence tables as well, LANGIDSHR and LANGIDUNQ.

LANGIDSHR causes both the uppercase and lowercase value for each letter to have the same sort sequence (shared -> SHR). Thus, instead of this order:

a
b
A
B

you would get this order:

a
A
b
B

or possibly even:

A
a
B
b

Since a and A have the same sort sequence value, they could come in either order.

LANGIDUNQ causes the uppercase and lowercase value for each letter to be sorted together, but still keeps their sort sequence slightly different (unique -> UNQ), causing lowercase to come before uppercase, like this:

a
A
b
B

What is really cool about this is that you can also create your very own custom sort sequence, and you can use one of the builtin ones as the starting point. Or you can view one of the builtin ones. To do either of these activities, read this article.

Make sure you specify *PROMPT for the Source File, *SRTSEQ for the Table type, and specify a Basing sort sequence of *HEX, *LANGIDSHR, *LANGIDUNQ, etc.

You should be aware that the link above also shows how to create conversion tables too, like EBCDIC to ASCII, etc.

So, after discovering the link above, I had initially thought I would need to create my own custom sort sequence where I specified that numbers would come before letters. However, it turns out that LANGIDSHR and LANGIDUNQ already do that too! So all I had to do was specify one of those sort sequences. I chose to use LANGIDUNQ because fundamentally, I like the idea of the uppercase and lowercase being ordered differently just for visual grouping so that I don't end up with this:

a
A
a
B
b
B

By the lowercase and uppercase having (slightly) different sort sequences, I would get this instead:

a
a
A
b
B
B

This is visually more appealing to me.

Now my only obstacle was to somehow specify the sort sequence table to be used. My preference would have been to be able to provide it in the SQL query itself, but I never found a way to do that. This would have allowed me to control the ordering for just this query without risking any other queries.

However, I did find how to control the sort sequence table through the connection string. And, if I wanted to control the collation order on a single SQL query, I could always use a different connection string for that query only.

IBMDA400

For the IBMDA400 provider, use the Sort Sequence and Sort Language ID properties. You can find them documented here.

For the LANGIDUNQ sort sequence, I simply provide the value of 2, like this: Sort Sequence=2

When you use either the LANGIDUNQ or LANGIDSHR sort sequence with IBMDA400, you must also specify the Sort Language ID property. For me, it is English, which is ENU, like this: Sort Language ID=ENU;

So my connection string for IBMDA400 looks like this:

Provider=IBMDA400; Data Source=<server>; User ID=<user>;
Password=<password>; Force Translate=37; Sort Sequence=2; Sort Language ID=ENU;

Of course, I have the appropriate server, user, and password specified.

.NET DB2 Native Provider (iDB2Connection)

You can also specify the sort sequence when using the .NET native provider. For the .NET native provider, you specify the SortSequence and SortLanguageId properties. Notice these don't have spaces in them, unlike the IBMDA400 provider properties. Here is what that portion of your connection string should look like:

SortSequence=UniqueWeight;SortLanguageId=ENU;

You can read all about the .NET native provider properties in this excellent document here.

Just so you are aware, I haven't tested this with the .NET native provider which is why I am also not providing the full connection string. I have however verified the IBMDA400 connection string properties.

As a .NET developer, I must say that I am impressed with how simple IBM has made controlling the collation order. This turned out to be quite simple to do and is a nice trick to have in your bag.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Technical Lead Genuine Parts Company
United States United States
Author: Pro LINQ: Language Integrated Query in C# 2008
Co-author: Pro LINQ: Language Integrated Query in VB 2008

Comments and Discussions

 
-- There are no messages in this forum --