Introduction
The common way of fetching ordered sets of data from a relational database is to use the SQL clause ORDER BY
. This, used in conjunction with a SELECT
statement, orders the entire set of result data based on the values in the columns indicated by the ORDER BY
clause. Let’s consider the following table (named STUDENT) as an example:
Name  Coefficient 

John  10 
Michael  12 
George  8 
Steve  8 
A query like “SELECT * FROM STUDENT ORDER BY coefficient DESC
” will return the data sorted after the value of the coefficient, in descending order:
Michael  12 
John  10 
George  8 
Steve  8 
This method of sorting data is very simple and easy to use, and has the advantage of being part of the SQL standard and thus available on virtually any relational database provided out of the box. It has one significant drawback though: it can only be effectively used when the values in the column(s) based on which we want to sort the data contain the information needed for the sorting process. In real life situations, we may desire to obtain data in an order that has nothing to do with the data itself. It may be that the user entered it in a certain order and he/she expects to see it coming back in the same order, or any other situation in which we may want to be able to obtain the records in a table in a certain order and that order cannot be deduced (calculated) from the data itself. In other words, a simple ORDER BY
clause would not be enough in these cases. Let’s take an example.
(Table MASTERTABLE)
ItemGroup (PK) 

48E9405C45EE4BB3A3DCD3918898A815 
40AB772255184B20A4A8EDD89B59536F 
0826DE86308148F89CACEBD8A043F4DC 
(Table DETAILTABLE)
ItemGroup (FK to MASTER.ItemGroup)  Item (PK) 

48E9405C45EE4BB3A3DCD3918898A815  6BF71715E18640419BA9023585E55B55 
48E9405C45EE4BB3A3DCD3918898A815  C41908F6E896409A95ABD3EC81550B0D 
48E9405C45EE4BB3A3DCD3918898A815  4D95851F6F7D4CEB8C4AC7ABDAC429D9 
48E9405C45EE4BB3A3DCD3918898A815  4D467B250ECD46D3B99857CAE00BC967 
40AB772255184B20A4A8EDD89B59536F  6F9B9D06515D450E990CD9AEBEE9DCCF 
40AB772255184B20A4A8EDD89B59536F  6BF71715E18640419BA9023585E55B55 
0826DE86308148F89CACEBD8A043F4DC  63E3F1F9229C41F889C05F66EB0E441 
0826DE86308148F89CACEBD8A043F4DC  13329EB5387649C6B42DC2FDDE1E5700 
All columns are of type RAW(16)
and the values that you can see inside are GUID (Globally Unique Identifier) values. They have no meaning in the real world by themselves, they are just blind identifiers. This database has a number of categories (in the MASTERTABLE table), and for each of these categories, we can have zero or more items (in the DETAILTABLE table). We have no additional identifying information except these identifiers. We can assume that such information is external to this database, or, even if not, it may still not be feasible or desirable to sort based on it. Let’s assume we want to maintain an order for all the items inside a given item group. How do we do that?
“SELECT * FROM DETAILTABLE WHERE ItemGroup = <SOME_ITEMGROUP> ORDER BY Item
” will not cause any errors, but will do nothing useful (it will sort the values lexicographically, which is not what we are after).
We propose a number of techniques that will help solve this and some related problems as follows.
Method #1. Additional column with dispersed integer values
This method implies adding to the DETAILTABLE table a new column, with meta information about the Item IDs. Let’s call this column Rank
. This meta information is simply a positive integer number, such that “SELECT … WHERE ITEMGROUP = <SOME_ITEMGROUP> ORDER BY Rank
” will produce the desired order. At insert time, a rank value will be chosen so that the item just inserted occupies the desired position in the ordered list of items for that particular item group. In our example, we are only interested in sorting items inside their own item group (and an item, being a PK, will only be part of one item group), so these rank numbers have meaning at an item group scope rather than at the scope of the DETAILTABLE table. The technical details of the stored procedure we propose for this technique are described as follows. There shall be an alternate key for the columns (ItemgGroup
, Rank
) to enforce uniqueness amongst the ranks of the same item group.
First, let MAXINT
denote the maximum integer value supported by the particular DB you are using. For portability issues, one may want to choose a conservative value for MAXINT
, one that will be guaranteed to work on all relational databases.
Let us assume that the item ITEM<SUB>1</SUB>
is the first item inserted in the DETAILTABLE table for the item group ITEMGROUP<SUB>M</SUB>
. Its rank shall be chosen as the value of ROUND (MAXINT / 2)
.
Let us now assume that we have a series of N
items entered for the item group ITEMGROUP<SUB>M</SUB>
in the DETAIL table, ITEM<SUB>1</SUB>
, ITEM<SUB>2</SUB>
… ITEM<SUB>N</SUB>
and that their ranks are RANK<SUB>1</SUB>
, RANK<SUB>2</SUB>
… RANK<SUB>N</SUB>
, respectively. We wish to position this new ITEM
as the direct successor of ITEM<SUB>K</SUB>
, 0 <= K < N
(0 as a special case, when we want this new item to be the first, while RANK<SUB>N+1</SUB>
is considered to be MAXINT). We shall choose its rank as the value of:
ROUND ((RAN<SUB>K</SUB> + (RANK<SUB>K+1</SUB>  RANK<SUB>K</SUB>) / 2))
, where RANK<SUB>0</SUB>
is defined to be 0. If we want to insert the new item as the first one, its rank shall be ROUND (RANK<SUB>1</SUB> / 2)
, and if we want it to be the last one, its rank shall be ROUND ((MAXINT – RANK<SUB>N</SUB>) / 2)
.
It can be easily seen that RANK<SUB>K</SUB> < RANK<SUB>P</SUB>
for all K < P
.
If a large number of items are inserted in a particular place in the list (e.g., all are inserted, one by one, as the immediate successor of the same existing item), we may arrive at a point where RANK<SUB>K+1</SUB>  RANK<SUB>K</SUB> = 1
for some K
(ITEM<SUB>K</SUB>
being the preferred immediate predecessor we mentioned earlier). It effectively means that no new ITEM<SUB>P</SUB>
can be inserted such that RANK<SUB>K</SUB> < RANK<SUB>P</SUB> < RANK<SUB>K+1</SUB>
. We fix the situation by redistributing the rank values over all the items in the item group. Let us call global redistribution the following simple method of rank redistribution:
ITEMGROUP // user given argument
N //number of items for ITEMGROUP
RANK[] RANKS = “SELECT RANK, ITEM FROM DETAIL
WHERE DETAIL.ITEMGROUP = ITEMGROUP
ORDER BY RANK ASC”
integer index = 1;
integer PREVIOUS = RANKS [1]; //first rank
for each RANK R in RANKS
{
R = ROUND ((MAXINT / (N+1))) * index;
update R to db;
index++;
}
Example:
Let us assume we have N
= 8 items and their ranks are: 3, 5, 7, 7, 7, 10, 11 and 14. They will get redistributed to:
ROUND (MAXINT / 9), ROUND ((MAXINT / 9)) * 2, ROUND ((MAXINT / 9)) * 3, ROUND ((MAXINT / 9)) * 3, ROUND ((MAXINT / 9)) * 3, ROUND ((MAXINT / 9)) * 6, ROUND ((MAXINT / 9)) * 7, ROUND ((MAXINT / 9)) * 8
.
It is observed immediately that equal ranks get redistributed to equal ranks, and that if the order of multiplicity of rank K
is M<SUB>1</SUB>
, and the order of multiplicity of rank P
is M<SUB>2</SUB>
. After redistribution, we shall have:
ROUND ((RANK<SUB>K+1</SUB>  RANK<SUB>K</SUB>) / (RANK<SUB>P+1</SUB>  RANK<SUB>P</SUB>)) = ROUND (M<SUB>1</SUB> / M<SUB>2</SUB>);
(with the limit case of RANK<SUB>K+1</SUB>
being MAXINT
if K
is the maximum rank).
The advantage of this redistribution method is that it guarantees a reasonable “widening” of the “space” between ranks, without knowing or caring about where the rank value density is higher than the average. Unfortunately, this simple redistribution strategy does not use any information that might be available about the expected rank distribution over the items.
Of course, the redistribution methods can be implemented inside or outside the database. As an example, the article provides a PL/SQL implementation for future reference. The body of the stored procedure that handles the rank redistribution is printed below:
PROCEDURE
redistranksglobally (itemgroupguid IN RAW)
IS
n NUMBER (38, 0);
f NUMBER (38, 0);
ceils NUMBER (38, 0);
newrank NUMBER (38, 0);
BEGIN
SELECT COUNT (1)
INTO n
FROM detailtable_disp_total
WHERE detailtable_disp_total.itemgroup = itemgroupguid;
IF n > 0
THEN
f := FLOOR (common.maxint / (n + 1));
ceils := common.maxint  f * (n + 1);
newrank := 0;
FOR r IN ranks (itemgroupguid)
LOOP
newrank := newrank + f;
IF ceils > 0
THEN
newrank := newrank + 1;
ceils := ceils  1;
END IF;
UPDATE detailtable_disp_total
SET detailtable_disp_total.RANK = newrank
WHERE CURRENT OF ranks;
END LOOP;
END IF;
END;
Partially ordered ranks
If one only desires partial order amongst the ranks, he/she can choose to do that. The alternate key for the columns (ItemgGroup
, Rank
) shall have to be removed and additional insertion and redistribution semantics shall have to be devised. An example on how this can be done is present in the reference implementation attached to this article.
The main advantage of this method is that, by starting with [1 ... MAXINT]
as our interval for rank values and by distributing the ranks far from one another, many inserts of new items in the DB can happen before any rank redistribution is needed, and many more will happen before a new redistribution is needed. These properties make this technique appropriate for highly mutable tables.
The main drawback presented by this method consists of the introduction of a sorting complexity for finding out the n^{th} item from a given item group.
The next technique is somehow complementary of this one, in that it makes ranks meaningful as indexes, but will require at least partial rank redistribution for the vast majority of insert operations. Also, the next technique does not allow partial ordering.
Method #2. Additional column with consecutive integer values
This method implies adding a new column to the DETAILTABLE table, with meta information about the Item IDs. Let’s call this column Rank
. This meta information is simply a positive integer number, such that “SELECT … WHERE ITEMGROUP = <SOME_ITEMGROUP> ORDER BY Rank
” will produce the desired order. In this case, the ranks start from 1 and have consecutive values. It follows immediately that the maximum rank for a given item group is the item group's item count.
At insert time, a rank value will be chosen so that the item just inserted will occupy the desired position in the ordered list of items for a particular item group. In our example, we are only interested in sorting items inside their item group (and an item, being a PK, will only be part of one item group), so these rank numbers have meaning at an item group scope, but not at the scope of the whole DETAILTABLE table. Total order is required, which means that no two different items can have the same rank.
The first inserted item for an item group will have the rank set to 1.
Let us now assume that we have a series of N
items belonging to the item group ITEMGROUP<SUB>M</SUB>
in the DETAIL table, ITEM<SUB>1</SUB>
, ITEM<SUB>2</SUB>
… ITEM<SUB>N</SUB>
and that their ranks are RANK<SUB>1</SUB>
, RANK<SUB>2</SUB>
… RANK<SUB>N</SUB>
, respectively. We know that RANK<SUB>K</SUB> = RANK<SUB>K1</SUB> + 1
, with the exception of RANK<SUB>1</SUB>
, which has no predecessor. If we desire to insert the new item at the end of the list, that is done by setting its rank to N + 1
. If not, let us assume that the desired predecessor for the newly inserted item has the rank K
, with 0 <= K < N
where 0
means we want it first on the list. This is achieved in two simple steps. First, we increment all the ranks which are strictly larger than K
. Second, we insert the new item with the rank of K + 1
.
We follow with a description of a set of database constraints that have the role of assuring that the ranks have positive consecutive integer values, starting with 1. Let us return to the previously defined table, DETAILTABLE.
ItemGroup (FK to MASTER.ItemGroup)  Item (PK)  Rank  RankMinus1 
48E9405C45EE4BB3A3DCD3918898A815  6BF71715E18640419BA9023585E55B55  1  
48E9405C45EE4BB3A3DCD3918898A815  C41908F6E896409A95ABD3EC81550B0D  2  1 
48E9405C45EE4BB3A3DCD3918898A815  4D95851F6F7D4CEB8C4AC7ABDAC429D9  3  2 
48E9405C45EE4BB3A3DCD3918898A815  4D467B250ECD46D3B99857CAE00BC967  4  3 
40AB772255184B20A4A8EDD89B59536F  6F9B9D06515D450E990CD9AEBEE9DCCF  1  
40AB772255184B20A4A8EDD89B59536F  6BF71715E18640419BA9023585E55B55  2  1 
0826DE86308148F89CACEBD8A043F4DC  63E3F1F9229C41F889C05F66EB0E441  1  
0826DE86308148F89CACEBD8A043F4DC  13329EB5387649C6B42DC2FDDE1E5700  2  
We add the columns Rank
and RankMinus1
, which have an integer data type. For each rank, the corresponding value in the RankMinus1
column is rank – 1. We shall return to why this is this way at a later point. When we have a rank of 1, the corresponding entry in the RankMinus1
column is NULL
(to be noted that RankMinus1
is nullable).
A constraint shall be added to the table DETAIL, in the following form:
((RANK =1 AND rankminusone IS NULL)OR RANK>1 AND rankminusone=RANK1)
The constraint will be made deferrable and initially deferred, so as to allow for manipulation of ranks during the insertion of new items.
A foreign key from RankMinus1
shall be added to point to Rank
, to make sure that no bogus values will be inserted in the RankMinus1
column. Also, alternate keys will be introduced to ensure that the combinations (ItemGroup
, Rank
) and (ItemGroup
, RankMinus1
) are unique.
With the addition of this constraint mechanism, new steps will have to be added when inserting new items. Specifically, the values in the column RankMinus1
also have to be updated. This is trivial to do, and no detailed explanation shall be given here.
The advantage of this method is that the ranks provide not only order, but also information on the index of a certain item. One can find out “the n^{th} element in the table” by doing “SELECT * FROM DETAIL WHERE Rank = n
”.
The disadvantage is that an insertion of a new item with a rank of K
will require for the updating of N – K
ranks and corresponding N – K
values from the RankMinus1
column.
An example on how this can be done is present in the reference implementation attached to this article.
<! xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx >
Isolating the ordering information in the master table
We have seen that keeping the ordering information in the detail table results in lots of update operations when that order changes. In the following sections, we introduce a way of storing that information in the master table. We also analyze when the proposed method is feasible and when not, and we discuss a less efficient way of achieving similar results, but without so strong limitations on the item count.
Background  The factorial number system
The factorial n!
is defined as n! = n(n1)...1
where n
is a natural number. The factorial n!
is the number of ways in which n
distinct objects can be permuted. Because an empty set has a single permutation, it is considered that 0! = 1.
In order to indicate how big the factorial numbers actually are, we introduce some limits that are a consequence of Stirling's approximation:
Proposition 1 [Robbins 1955, Feller 1968]
The factorial function is bounded by the following double inequality:
 <FONT size=+2>Ö

2p
 n^{n+[1/2]}e^{n+[1/(12n+1)]} < n! <  <FONT size=+2>Ö

2p
 n^{n+[1/2]}e^{n+[1/12n]} 

Proposition 2 [Gosper]
For n
large it holds true that:
n! »  <FONT size=+2> æ Ö

 n^{n}e^{n} 

As opposed to our traditional geometric radix number systems where the denominations of successive "places" form geometric series, e.g., 1, 10, 100, ... or 1, 2, 4, 8, 16, ... the factorial number system's denominations are the factorial numbers 1, 2, 6, 24, 120, ..... Thus the factorial number system belongs to the item group of mixedradix representation systems.
The following proposition is very important in the existence of the factorial number system:
Proposition 3
The identity 0 * 0! + 1 * 1! + 2 * 2! + 3 * 3! + ... + k * k! = (k + 1)!  1
holds true for all k ≥ 0
.
Proposition 4
Let t ≥ 2
be a natural number. Every integer in the range 0 ≤ f ≤ t!
can be uniquely written in the form:
f = (...(c<SUB>t<FONT face=symbol></FONT>1</SUB>(t<FONT face=symbol></FONT>1)+c<SUB>t<FONT face=symbol></FONT>2</SUB>)(t<FONT face=symbol></FONT>2)+...+c<SUB>2</SUB>)2+c<SUB>1</SUB> 

Expressed in another form,
f = (t<FONT face=symbol></FONT>1)!c<SUB>t<FONT face=symbol></FONT>1</SUB>+(t<FONT face=symbol></FONT>2)!c<SUB>t<FONT face=symbol></FONT>2</SUB>+...+2!c<SUB>2</SUB>+1!c<SUB>1</SUB> 

where the "digits" c<SUB>j</SUB>
are integers satisfying 0 ≤ c<SUB>j</SUB> ≤ j
, for 1 ≤ j < t
.
The factorial number system enables us to represent the ordering of a set of t
elements as a single natural number less than t!
. This is optimal in the sense that there are exactly t!
ways to arrange the elements of a t
cardinality set.
Method #3. Permutations packed as natural numbers
Let (U<SUB>1</SUB>, ..., U<SUB>t</SUB>
) be a sequence of t
distinct identifying keys for the items to represent where U<SUB>i</SUB> <FONT face=symbol>Î</FONT> U
. These can constitute a primary key or a unique index in the detail table. We assume that the database imposes a total order on the set U
which we call a "natural ordering", and that is generally true, be them GUIDs, an autonumber field, or anything else that can be used for identification purposes in the most widely used relational databases.
The ORDER BY
SQL clause always orders the items based on the database's natural ordering imposed on the column's data type. The sequence (U<SUB>1</SUB>, ..., U<SUB>t</SUB>
) can be represented starting from the database's natural ordering and applying a permutation, and vice versa. The following algorithm computes the inverse permutation as a composition of transpositions and represents the permutation as a natural number. More details about variants of the following two algorithms can be found in Art of Computer Programming, Volume 2: Seminumerical Algorithms by Donald E. Knuth, section 3.3.2:
Algorithm 1 [Analyze a permutation]
We compute a bijective map f : P<SUB>t</SUB> <FONT face=symbol>®</FONT> [0, t!) <FONT face=symbol>Ç</FONT> Z
where P<SUB>t</SUB>
is the set of all the possible permutations of (1, ..., t
). Let <FONT face=symbol>s</FONT>
denote such a permutation.
 P1 Let
r <FONT face=symbol>¬</FONT> t
, f <FONT face=symbol>¬</FONT> 0
.
 P2 Find the (unique)
s
position such that <FONT face=symbol>s</FONT><SUB>s</SUB> = r
. Set f <FONT face=symbol>¬</FONT> r * f + s  1
.
 P3 Exchange
<FONT face=symbol>s</FONT><SUB>r</SUB>
with <FONT face=symbol>s</FONT><SUB>s</SUB>
.
 P4 Decrement
r
. If r > 1
then go to step P2.
Note that the <FONT face=symbol>s</FONT>
permutation becomes the identity permutation when the algorithm stops. To prove that the function is indeed bijective, we run the algorithm backwards:
Algorithm 2 [Backwards]
for r = 2, 3, ..., t
s <FONT face=symbol>¬</FONT> f mod r + 1
f = floor([f/r])
 Exchange
<FONT face=symbol>s</FONT><SUB>s</SUB>
with <FONT face=symbol>s</FONT><SUB>r</SUB>
.
The relationship between the factorial number system and algorithm 1 is the identity c<SUB>r  1</SUB> = s
every time step P2 is performed for any value of r
.
Note that these two algorithms can run directly on the sequence (U<SUB>1</SUB>, ..., U<SUB>t</SUB>
) rather than using the permutation <FONT face=symbol>s</FONT>
. In this case, instead of looking for <FONT face=symbol>s</FONT><SUB>s</SUB> = r
, one has to find out the maximum element of the sequence (U<SUB>1</SUB>, ..., U<SUB>t</SUB>
).
If the items were indexed by an autonumber column, then appending an item to the end of the sequence could simply be achieved using:
PERMUTATION <FONT face=symbol>¬</FONT> t! t + PERMUTATION
In the source associated to this article, a stored function with the signature:
FUNCTION arrangecategoryitems (initems IN rawtable) RETURN NUMBER
is provided that computes the number representation of the permutation. The implementation closely follows the algorithm described above:
items := initems;
i := 0;
s := 1;
t := items.COUNT;
r := t;
f := 0;
LOOP
s := 1;
i := 0;
LOOP
i := i + 1;
EXIT WHEN i > r;
IF items (i) > items (s)
THEN
s := i;
END IF;
END LOOP;
aux := items (s);
items (s) := items (r);
items (r) := aux;
f := r * f + s  1;
r := r  1;
EXIT WHEN r <= 1;
NULL;
END LOOP;
RETURN f;
A stored function that extracts the ordering information from the PERMUTATION
column is also included:
FUNCTION reversearrangecategoryitems (
permutation IN NUMBER,
items IN rawtable
)
RETURN rawtable
The implementation closely follows the backwards algorithm:
RESULT := items;
i := 0;
s := 1;
t := items.COUNT;
f := permutation;
r := 1;
LOOP
r := r + 1;
EXIT WHEN r > t;
s := MOD (f, r);
f := FLOOR (f / r);
aux := RESULT (s + 1);
RESULT (s + 1) := RESULT (r);
RESULT (r) := aux;
END LOOP;
LOOP
i := i + 1;
EXIT WHEN i > RESULT.COUNT;
END LOOP;
RETURN RESULT;
Limitations of this method
As stated above, this method permits the storage of the permutation information in the master table rather than the detail table. The ordering information for at most t
items is stored as an integer number less than t!
. When the ordering changes, there is a single field in the master table to change, and this is much cheaper than the other methods presented herein.
But let's find out what is the maximum number of items supported using the integer data types supported by (some) databases. We add the PERMUTATION
column to the master table. In Oracle, we recommend using the NUMBER(38, 0)
data type. In SQL Server, the equivalent is DECIMAL(38, 0)
. In both cases, MAXINT = 10<SUP>39</SUP>  1
while the greatest number t
such that t! < MAXINT is t = 34
. So, we can only support 34 detail items for each master record on Oracle and SQL Server using this method.
Method #4. Overpassing the above limitations
Rather than using an integer, one can store the sequence c<SUB>1</SUB>,c<SUB>2</SUB>, ..., c<SUB>t2</SUB>, c<SUB>t1</SUB>
packed into a byte sequence data type, that is RAW
on Oracle and VARBINARY
on SQL Server. Because 0 <FONT face=symbol>£</FONT> c<SUB>j</SUB> <FONT face=symbol>£</FONT> j
, one can use a very simple byte packing strategy to represent the sequence as a byte sequence data type: every c<SUB>j</SUB>
will occupy ceil(log<SUB>2</SUB>(j+1))
bits in the PERMUTATION
column. The number of bits needed to express the sequence packed like described above is:
S_{t} =  t  1<!sup
> <FONT face=symbol size=+3>å j = 1
 ceil(log_{2}(j + 1)). 

Some loose bounds for this sum appear immediately if we use the double inequality log2(t!) ≤ St ≤ t + log2(t!)
:
 1
2
 (1 + log_{2}(p)) + (t +  1
2
 ) log_{2}(t)  (t   1
12 t + 1
 )log_{2}(e) £ S_{t} 

S_{t} £ t  1 +  1
2
 (1 + log_{2}(p)) + (t+  1
2
 ) log_{2}(t)  (t   1
12 t
 )log_{2}(e) 

Using the upper bound, one can easily compute how many bits (rather than bytes) to allocate for the PERMUTATION
column. In order to compute the byte size, one has to divide the upper bound by 8 and compute the ceiling of the result.
But we can also compute the exact size required to store the sequence c<SUB>1</SUB>, c<SUB>2</SUB>, ..., c<SUB>t  2</SUB>, c<SUB>t  1</SUB>
. Let's define:
m(t) = the greatest integer m such that 2<SUP>0</SUP> + 2<SUP>1</SUP> + ... + 2<SUP>(m  1)</SUP> <FONT face=symbol>£</FONT> t  1
An equivalent expression for m(t)
is:
m(t) = floor(log2(t))
Now, let's compute the sum.
S_{t} =  m(t)<!sup
> <FONT face=symbol size=+3>å j = 1
 j 2^{j  1} + (m(t) + 1)(t  2^{m(t)}) 

In Concrete Mathematics by Ronald L. Graham, Donald E. Knuth and Oren Patashnik, the chapter of sums contains, amongst others, the following sum:
 n<!sup
> <FONT face=symbol size=+3>å k = 0
 >k 2^{k} = (n  1) 2^{n+1} + 2. 

Using the above formula, we get:
St = (m(t)  1) 2<SUP>m(t)</SUP> + 1 + (m(t) + 1)(t  2<SUP>m(t)</SUP>)
or
S<SUB>t</SUB> = t (m(t) + 1) + 1  2<SUP>m(t) + 1</SUP>
Roughly speaking, S<SUB>t</SUB> = O(t log(t))
.
Using the Oracle RAW(2000)
data type, the maximum number of items supported is t = 1640
, while using SQL Server with BINARY(8000)
or VARBINARY(8000)
, the maximum number of items supported is t = 5553
.
For demonstrative purposes, we include below a table of computed sizes expressed in bytes:
t  size(PERMUTATION) in bytes 

10  4 
20  9 
30  15 
40  23 
50  30 
60  38 
70  46 
80  55 
90  63 
100  72 
200  169 
300  274 
400  387 
500  499 
600  623 
700  748 
800  873 
900  998 
1000  1123 
2000  2495 
3000  3989 
4000  5489 
5000  7102 
6000  8727 
7000  10352 
8000  11977 
9000  13703 
10000  15453 
In the source code associated with this article, a stored function with the signature:
FUNCTION arrangecategoryitems (initems IN rawtable)
RETURN RAW
that computes the PERMUTATION
column's value is included. Note that the implementation uses the UTL_RAW
Oracle package for byte manipulation:
RESULT := UTL_RAW.copies ('00', 2000);
bitposition := 0;
bitsize := 1;
samesizecount := 1;
samesizeindex := 1;
items := initems;
t := items.COUNT;
r := t;
LOOP
s := 1;
i := 0;
LOOP
i := i + 1;
EXIT WHEN i > r;
IF items (i) > items (s)
THEN
s := i;
END IF;
END LOOP;
auxitem := items (s);
items (s) := items (r);
items (r) := auxitem;
x := (s  1) * POWER (2, (24  bitsize  (bitposition MOD 8)));
xasraw := UTL_RAW.SUBSTR (UTL_RAW.cast_from_binary_integer (x), 2, 3);
curent := FLOOR (bitposition / 8);
IF curent > 0
THEN
aux := UTL_RAW.copies ('00', curent);
ELSE
aux := NULL;
END IF;
aux := UTL_RAW.CONCAT (aux, xasraw);
RESULT := UTL_RAW.bit_or (RESULT, aux);
bitposition := bitposition + bitsize;
samesizeindex := samesizeindex + 1;
IF (samesizeindex > samesizecount)
THEN
BEGIN
samesizeindex := 1;
samesizecount := samesizecount * 2;
bitsize := bitsize + 1;
END;
END IF;
r := r  1;
EXIT WHEN r <= 1;
NULL;
END LOOP;
A stored function that extracts the ordering information from the PERMUTATION
column is also implemented with the signature:
FUNCTION reversearrangecategoryitems (permutation IN
RAW, items IN rawtable)
RETURN rawtable
Again, a lot of bit manipulation was needed to implement this function:
bitposition := 0;
bitsize := 1;
samesizecount := 1;
samesizeindex := 1;
cj := NULL;
RESULT := items;
r := 0;
LOOP
r := r + 1;
EXIT WHEN r > items.COUNT  1;
curent := FLOOR (bitposition / 8);
xasraw := UTL_RAW.SUBSTR (permutation, curent + 1, 3);
MASK :=
CASE bitposition MOD 8
WHEN 0
THEN 'FFFFFF'
WHEN 1
THEN '7FFFFF'
WHEN 2
THEN '3FFFFF'
WHEN 3
THEN '1FFFFF'
WHEN 4
THEN '0FFFFF'
WHEN 5
THEN '07FFFF'
WHEN 6
THEN '03FFFF'
WHEN 7
THEN '01FFFF'
END;
xasraw := UTL_RAW.bit_and (xasraw, MASK);
x := UTL_RAW.cast_to_binary_integer (xasraw);
s := FLOOR (x / POWER (2, (24  bitsize  (bitposition MOD 8))));
IF cj IS NULL
THEN
cj := auxvarray (s + 1);
ELSE
cj.EXTEND ();
cj (cj.COUNT) := s + 1;
END IF;
bitposition := bitposition + bitsize;
samesizeindex := samesizeindex + 1;
IF (samesizeindex > samesizecount)
THEN
BEGIN
samesizeindex := 1;
samesizecount := samesizecount * 2;
bitsize := bitsize + 1;
END;
END IF;
END LOOP;
r := cj.COUNT () + 1;
LOOP
r := r  1;
EXIT WHEN r <= 0;
aux := items (cj (r));
RESULT (cj (r)) := RESULT (RESULT.COUNT  r + 1);
RESULT (RESULT.COUNT  r + 1) := aux;
END LOOP;
Note that one can also implement these algorithms in the programming language of choice. We implemented it in PL/SQL in order to show that it is possible to implement them in a procedural language.
Conclusions
The article makes an attempt to present some methods that should be able to solve the often appearing problem of ordering items in a relational database. We look forward to hearing your opinion about these procedures. We also encourage anyone who might have other/better approaches to share them with us all.