Introduction
In the coming years, multi-core processors will become mainstream. It would be nice, if we have dual core CPUs, to use each core to compute 50% of the total permutations. However, next_permutation
in the STL's algorithms does not let us do so. Why? The reason is that next_permutation
relies on the current permutation to generate the next one.
Let us consider a situation where we have a total of 200,000,000 permutations to find. If we can find the 100,000,000th permutation, we can use two threads calling next_permutation()
100,000,000 times. One thread will start from the 0th permutation and the other will start from the 100,000,000th permutation. However, to find the 100,000,000th permutation, you have to call next_permutation()
100,000,000 times! Hence, I have created an algorithm which lets us compute the 100,000,000th permutation.
This method is called "finding the permutation by index" or "positional permutations." I later found out that this method had been invented quite some time ago and its actual name is "factoradic." This method is to be used in conjunction with next_permutation()
to spread the workload among different cores or SMP processors. Of course, factoradic can be used to find all 200,000,000 permutations, but since next_permutation()
is much faster than the factoradic implementation, we are only using factoradic to find the nth permutation. From the nth permutation onwards, we are going to use next_permutation()
all the way.
The Technique of Finding Permutations
Let us have a quick run-through of how to find permutations before I show you how to find a permutation given its index. I will show you an example of finding the permutation of 0, 1, 2, 3. In the first column, write down these numbers:
0,
1,
2,
3,
For the first column, each number is going to have 3! rows because there are 3 other unwritten columns.
0,
0,
0,
0,
0,
0,
1,
1,
1,
1,
1,
1,
2,
2,
2,
2,
2,
2,
3,
3,
3,
3,
3,
3,
For the first column, let us find the permutations of the number 0. We will ignore 1, 2 and 3 for the time-being. For the second column, each number is going to have 2! rows because there are two other unwritten columns.
0,1,
0,1,
0,2,
0,2,
0,3,
0,3,
For the last two columns, write down the numbers that are not taken by the 1st and 2nd column in ascending order, and the next row in descending order.
0,1,2,3
0,1,3,2
0,2,1,3
0,2,3,1
0,3,1,2
0,3,2,1
Using the same method, you can find permutations for the other three numbers. That is: 1, 2 and 3 in the first column. All of the permutations are shown below:
0,1,2,3
0,1,3,2
0,2,1,3
0,2,3,1
0,3,1,2
0,3,2,1
1,0,2,3
1,0,3,2
1,2,0,3
1,2,3,0
1,3,0,2
1,3,2,0
2,0,1,3
2,0,3,1
2,1,0,3
2,1,3,0
2,3,0,1
2,3,1,0
3,0,1,2
3,0,2,1
3,1,0,2
3,1,2,0
3,2,0,1
3,2,1,0
The Technique of Finding a Permutation Given the Index
Note that the total number of permutations is a result of taking the factorial of the total number of elements. To explain the algorithm, let me take you through the steps of an example. Given the index 79, we need to find the permutation from a set of five elements: 0, 1, 2, 3 and 4. The total number of permutations is 120. Since this technique is zero-based, the last index would be the 119th. Let us begin by finding the element in column 1, where columns are one-based.
0,1,2,3,4 = factorial(4) * 0 = 0th Permutation
0,x,x,x,x
.
.
0,x,x,x,x
1,0,2,3,4 = factorial(4) * 1 = 24th Permutation
1,x,x,x,x
.
.
1,x,x,x,x
2,0,1,3,4 = factorial(4) * 2 = 48th Permutation
2,x,x,x,x
.
.
2,x,x,x,x
3,0,1,2,4 = factorial(4) * 3 = 72th Permutation
3,x,x,x,x
.
.
3,x,x,x,x
4,0,1,2,3 = factorial(4) * 4 = 96th Permutation
4,x,x,x,x
.
.
4,3,2,1,0 = Last permutation(119th)
x,x,x,x,x = factorial(4) * 5 = 120th Permutation(out of bounds)
To find the 1st element, we need to find in the positional permutations which one 79 is greater than, smaller than or equal to for the next positional permutation. As we can see, 79 is sandwiched between the 72nd and 96th; the 5th element is 3. As for why factorial(4), it is because we are finding one element out of five elements now. Subtract 1 from 5 elements and you get 4. Let me give an example to clarify.
To get from 0, 1, 2, 3, 4 (0th permutation) to 1, 0, 2, 3, 4 (24th permutation), columns 2 to 5 would have to permute 4!-1 times because there are 4 columns to permute. Before we proceed to find the 2nd element, we need to subtract 72 from 79, which is 7, before proceeding to the next iteration. Let us now find the 2nd element from the remaining elements (0, 1, 2, 4):
0,1,2,4 = factorial(3) * 0 = 0th Permutation
0,x,x,x
.
.
0,x,x,x
1,0,2,4 = factorial(3) * 1 = 6th Permutation
1,x,x,x
.
.
1,x,x,x
2,0,1,4 = factorial(3) * 2 = 12th Permutation
2,x,x,x
.
.
2,x,x,x
4,0,1,2 = factorial(3) * 3 = 18th Permutation
4,x,x,x
.
.
4,2,1,0 = Last permutation(23th)
x,x,x,x = factorial(3) * 4 = 24th Permutation(out of bounds)
As we can see, 7 is sandwiched between the 6th and 12th permutations, so the 2nd element is 1. Subtracting 6 from 7 leaves 1. Let the next iteration begin:
0,2,4 = factorial(2) * 0 = 0th Permutation
0,x,x
2,0,4 = factorial(2) * 1 = 2nd Permutation
2,x,x
4,0,2 = factorial(2) * 2 = 4th Permutation
4,2,0 = Last permutation(5th)
x,x,x = factorial(2) * 3 = 6th Permutation(out of bounds)
As we can see, 1 is sandwiched between the 0th and 2nd permutations, so the 3rd element is 0. There is nothing to subtract in this iteration.
2,4 = factorial(1) * 0 = 0th Permutation
4,2 = factorial(1) * 1 = 1st Permutation
As we can see, 1 is equal to the 1st, so the 4th element is 4. There is no need for the next iteration because only 2 is left, so the 5th element is 2. The generated permutation is 3, 1, 0, 4, 2 for index 79. You can verify this by calling next_permutation()
78 times. It is 78 times because the first permutation 0, 1, 2, 3, 4 is not generated by next_permutation()
, but rather is fed into next_permutation()
. As you may have noticed, the last permutation is always in descending order. Can we reverse-engineer 3, 1, 0, 4, 2 and get the index 79? Sure, just follow the steps below:
Factorial(4) * IndexOf(3) in {0,1,2,3,4} = 4! * 3 = 72
Remove 3 from 0, 1, 2, 3, 4. It turns into 0, 1, 2, 4.
Factorial(3) * IndexOf(1) in {0,1,2,4} = 3! * 1 = 6
Remove 1 from 0, 1, 2, 4. It turns into 0, 2, 4.
Factorial(2) * IndexOf(0) in {0,2,4} = 2! * 0 = 0
Remove 0 from 0, 2, 4. It turns into 2, 4.
Factorial(1) * IndexOf(4) in {2,4} = 1! * 1 = 1
Remove 4 from 2, 4. It turns into 2.
Factorial(0) * IndexOf(2) in {2} = 1! * 0 = 0
Add up all the calculations and you will get 72 + 6 + 0 + 1 + 0 = 79.
Code Examples for Finding Permutations by Index (Factoradic)
#include <iostream>
#include "Factorial.h"
#include "FindPermByIdx.h"
void main()
{
const int SIZE = 3;
CFindPermByIdx<BigInteger> findperm;
BigInteger biResults;
vector<unsigned int> vec(SIZE);
vec[0]=0;
vec[1]=1;
vec[2]=2;
int cnt = 0;
bool bContinue=true;
{
cout<<cnt<<")";
++cnt;
for( int j=0; j<SIZE; ++j )
{
cout<< vec[j] << ",";
}
cout<<endl;
}
do
{
bContinue = findperm.FindPerm( SIZE, cnt, vec );
if( bContinue )
{
cout<<cnt<<")";
for( int j=0; j<SIZE; ++j )
{
cout<< vec[j] << ",";
}
}
cout<<endl;
++cnt;
}
while( bContinue );
CFactorial<BigInteger>::Destroy();
}
You may have noticed that CFindPermByIdx
operates on numbers only. The contents of the first permutation vector always start from zero and are in running numbers. Unlike next_permutation()
, which operates on objects by comparing, CFindPermByIdx
operates on zero-based running numbers. What you will get from its results are indices. For example, if you get a result of {0, 2, 1}, it means that the 1st element is index 0 of the array (of objects) that you originally wanted to permute. The 2nd element refers to index 2 of the array and the 3rd element is index 1 of the array.
Permuting numbers is much easier and faster than permuting objects. CFindPermByIdx
uses a big integer class because this algorithm uses factorials to find the permutation and factorials are usually very large numbers. Since it is a templated class and you know the factorials used are not going to exceed 64 bits, you can use the __int64
type instead. The is the output of the above example code:
0)0,1,2,
1)0,2,1,
2)1,0,2,
3)1,2,0,
4)2,0,1,
5)2,1,0,
Benchmark Between next_permutation() and CFindPermByIdx
Below are the timing results of next_permutation()
and CFindPermByIdx
finding the permutations of 8 elements, which is 40320 permutations in total.
time taken for next_permutation: 1 milliseconds
time taken for CFindPermByIdx: 3915 milliseconds
The source code for the benchmark is included for download.
Code Examples for Finding Index from Permutation
#include <iostream>
#include "Factorial.h"
#include "FindPermByIdx.h"
void main()
{
const int SIZE = 4;
CFactorial<BigInteger>::Init();
CFindPermByIdx<BigInteger> findperm;
BigInteger biResults;
vector<unsigned int> vec(SIZE);
vec[0]=0;
vec[1]=1;
vec[2]=2;
vec[3]=3;
do
{
findperm.FindIndex( 4, biResults, vec );
cout<<biResults<<")";
for( int j=0; j<SIZE; ++j )
{
cout<< vec[j] << ",";
}
cout<<endl;
}
while( next_permutation( vec.begin(), vec.end() ) );
CFactorial<BigInteger>::Destroy();
}
This is the output of the above example code:
0.)0,1,2,3,
1.)0,1,3,2,
2.)0,2,1,3,
3.)0,2,3,1,
4.)0,3,1,2,
5.)0,3,2,1,
6.)1,0,2,3,
7.)1,0,3,2,
8.)1,2,0,3,
9.)1,2,3,0,
10.)1,3,0,2,
11.)1,3,2,0,
12.)2,0,1,3,
13.)2,0,3,1,
14.)2,1,0,3,
15.)2,1,3,0,
16.)2,3,0,1,
17.)2,3,1,0,
18.)3,0,1,2,
19.)3,0,2,1,
20.)3,1,0,2,
21.)3,1,2,0,
22.)3,2,0,1,
23.)3,2,1,0,
A Benchmark Program for Your PC
I have included a demo program called TimePermutations
for you to benchmark your PC. It is nice to know how much time has been reduced if you have a dual-core or quad-core PC. The program does not store the permutations anywhere and discards them after every computation. If you have a uniprocessor PC, then you do not have much use for this benchmark program. Below are the screenshots of the program.

Conclusion
In this article, I introduced an algorithm called "finding permutations by index" or "factoradic" to let us find the nth permutation. It continues from there with next_permutation()
to split and speed up the work on many processors. I look forward to your feedback and hope you like my article!
References
History
- 7 April, 2009 -- Updated benchmark source and application
- 12 March 2009 - Changed the source code to use C++ Big Integer Library written by Matt McCutchen, so as to remove the need to download and build Crypto++, and also to reduce download sizes
- 19 November, 2007 -- Original version posted