Add your own alternative version
Stats
651.3K views 17.2K downloads 195 bookmarked
Posted
13 May 2008

Comments and Discussions




Simple, fast and worth reading as well.





This question is perhaps misplaced. If so, please feel free to ignore.
Your library works a treat, but I'm using it instead of a calculation. Thing is, I don't know how the calculation should go. Here is what I need:
Data: a set of elements 1..n, each with a number of endpoint 1..q
COUNT all possible variations of all the elements, where the elements together can be ordered with their end point.
Example: 2 element, 1 with 2 end points, and 1 with 3 end points (AB and DEF)
These can be ordered as follows
AB
BA
DEF
DFE
...
AB DEF
AB DFE
AB EDF
....
BA DEF
BA DFE
BA EDF
....
DEF AB
DEF BA
DFE AB
etc.
I use your library to first count all variations, ignoring the end points. For example, with 3 elements, 1 with 3 enpoint, and 2 with 2 endpoints:
2
3
2, 2
2, 3
3, 2
2, 3, 2
2, 2, 3
3, 2, 2
etc....
I then, for every combination above, calculate the product of all factorials and add that to the total.
I can optimize this somewhat by using combinations instead of variations, but with a large number of elements, iterating becomes an issue. Parallelism could also help a bit, but still.....
Hence I would like to calculate the total number of possibilities, without having to cycle through the elements. But I can't figure it out. Anybody cares to try and answer this question?
Many thanks.
modified 23Apr13 7:46am.





Ok, if I understand the problem. You are looking for variations of all sizes. Since variations need to have a lower index (k), you want to sum up the permutation sets for each variations of 0 < k <= n.
Let's call the set of all sets X. Then the following pseudocode would provide your result:
BigInteger sumAll = 0;
for(int k = 1; k <= n; ++k) {
BigInteger productWithinVariation = 1;
foreach(var variation in Variations(X choose k)) {
foreach(var element in variation) {
productWithinVariation *= Permutations(element).Count;
}
}
sumAll += productWithinVariation;
}
The case where k = 1 and k = n can be expressed fairly easily, however when 1 < k < n, I don't see any way but to execute the variations.
I think there may be some optimizations that could be made at the lower levels, but none that would remove the need to iterate through all the variations.
Let me know if my pseudocode was not correct to your problem.





This article has been useful in my project involving genetic. Thks!





For example, assume I want to find out all the ways in which I can use a set of, say 5 different characters, to pad a string with 8 characters.
This is a case of choosing any character 8 times from the pool of 5 to pad the other string with. Obviously, I can choose to use only on particular character for all the padding so replacement is allowed.






I understand the concept of generating large differentiated sets of numbers/strings and the different ways of doing so. Combinations vs Permutations etc. What I don't understand is what the test program does.
How does entering two strings and getting a third as an answer result from permutations? I'm completely lost here. What does adding two string versions of numbers have to do with permutations. Please someone with patience, explain it to me very slowly. In the test program what are the numbers on the right supposed to represent? And the number in the drop down supposedly shows the number of "solutions" to the problem. What is the bloody problem Sorry I'm new to all this, so please take the time to explain it to me...Great Kudos will be given...





The sample is designed to solve simple numerical substitution problems. Similar to a Ceaser Cipher, each letter is a placeholder for a digit. The goal of these problems is to find the digits that, when substituted, will satisfy the equation. For example, A + A = B has four solutions: {A = 1, B = 2}, {A = 2, B = 4}, {A = 3, B = 6} and {A = 4, B = 8}. One method for solving these types of problems is to brute force through every possible variation of each letter and then check the solution.





Good article
I realize this code was developed with .NET 3.5. However, in .NET 4.0 the System.Numerics namespace was added with a type of BigInteger. This value has no upper or lower bounds. However, the type is immutable so you could in theory run into an OutOfMemoryException if the values are not in a small enough scope to be released from memory quickly.
Here is a quote from MSDN on this type  http://msdn.microsoft.com/enus/library/system.numerics.biginteger.aspx[^]:
Quote: The BigInteger type is an immutable type that represents an arbitrarily large integer whose value in theory has no upper or lower bounds. The members of the BigInteger type closely parallel those of other integral types (the Byte, Int16, Int32, Int64, SByte, UInt16, UInt32, and UInt64 types). This type differs from the other integral types in the .NET Framework, which have a range indicated by their MinValue and MaxValue properties.
However, there is also a possible impact on performance which could certainly be a reason to not use this implementation:
Quote: The other numeric types in the .NET Framework are also immutable. However, because the BigInteger type has no upper or lower bounds, its values can grow extremely large and have a measurable impact on performance.
Just thought I'd throw this out there though from an academic perspective
Dave Black, MCPD, MCTS
.NET Solutions Architect/Developer
Black Box Solutions, Inc.





Hello i have the following example code and i want to have the option to stop foreach with some way but when the program begins again to continue foreach where i was save before.
Is it possible this ?
Variations<char> variations = new Variations<char>(inputSet, strsize, GenerateOption.WithRepetition);
foreach (var v in variations)
{
// DO SOME WORK
}
Regards
Michael





Michael,
Sure. All you need to know is that foreach in C# is just a shortcut around using the IEnumerable interface directly. That is, the following:
foreach(var v in variations) {
DoSomething(v);
}
Is just a shortcut for doing:
var variationEnumerator = variations.GetEnumerator();
while(variationEnumerator.MoveNext()) {
var v = variationEnumerator .Current;
DoSomething(v);
}
So, if you want to leave the loop and than continue from where you left off. Simply use GetEnumerator() once and then save the enumerator for use whenever desired.
Hope this helps,
Adrian





First thanks for your answer !!
Can i ask how can i save the enumerator.
My thought is to save to a file every x sec the current value of the foreach loop.
If the user stops the program i have the last value in the file.
Now lets say i run the program again the code searches for the saved file loads the value
and continues from there.
Is this possible ?
Regards
Michael





I had to wait to answer until I could open up the code.
Unfortunately, the enumerator does not implement any form of serialization. Also, looking at the implementation it would not serialize easily. It requires that it is attached to the permutations that it is enumerating and both would have to be serialized and deserialized together.
This was not a consideration during the design. It looks like it would take a bit of work to implement as you required.
Cheers,
Adrian





PEE
+POP
EWW
That's the funniest I've found with one single solution, hope you like it :3






Thanks for an excellent article, just what I needed to learn about Combinations





Good article,well written and after i read it, know exactly what i have to do. thx





Hi Adrian,
I have a very large combinatorics problem, 650_C_6, which yields approximately 103,000,000,000,000 combinations with no repetition. Your code is blindingly fast at producing the combinations.(0.03secs  brilliant, thanks!)
Unfortuately the next step I need to accomplish is a seach of the results to determine largest and smallest average values of a property of the objects in each IList<t> generated. I need to be able to complete the search somewhere in the order of 13 secs in order to feed the result into downstream processes which only have 30 secs to complete.
The IEnumerable nature of the output seems to preclude the use of fast searching such as might be found in a Dictionary or a HashSet. This may just be my lack of knowledge BTW. I have tried to solve this but I keep coming up against the need to iterate over the list in order to find what I am looking for.
My question to you is how I might change your code so as to output ideally a Hashset to enable the use of the O(1) nature of this object.
Many thanks in advance for your help
Regards
Ian Carson





Ian,
Sounds like you've got an interesting problem. The library is designed to have a very small memory footprint and doesn't create a list or hash table of anything. Instead, after returning the enumerator, it uses a small amount of state information to get the next element in the sequence. The combinatorial nature of these sets makes them too large to fit into memory, or even on a disk. So, to answer your specific question there is no way to modify the existing code.
Based on my cursory understanding of your problem, it seems like you'll need to reduce the search space before taking the combinations. For example, if we can identify that element e in the set of size 650 will be in the combination that has the largest average then we can reduce the problem to 649_C_5. Altnately, if we could determine that the highest average will be from object in a subset, say of size 60, of the total set (perhaps the largest individuals) then we can reduce the space to 60_C_6 for the largest.
Sorry I couldn't be of more help.
Cheers, Adrian





Thanks Adrian. I think I can reduce the set by ranking the 650 elements by their average value (actually weighted average) and then take the highest and lowest n elements into combinations.
Regards
Ian





Thanks a lot for this nice package!
On my 3.0 GHz PC, I've measured the following for your Permutation generator:
n loops/minute
4 2,97E+05
5 1,22E+06
6 5,66E+06
7 1,78E+07
8 9,98E+07
9 2,61E+08
10 3,31E+08
11 4,46E+08
12 6,32E+08
Permutations<int>pp = new Permutations<int>(inputSet);
long f = 0;
foreach (IList<int> il in pp)
{
f++;
}
The increase in speed with growing values for n is probably due to the relative decrease in object creations per loop.
I've tried to switch code optimization on and off in my VS C# 10 Express.
But the effect was not clear.
Greetings!
Axel Kemper





Axel,
I made sure to optimize this algorithm as I intended it to be general purpose. My focus was on the GetNext method of the enumerator. I did not focus on the initial setup of the collection or the enumerator. So, I think your conclusion would be correct. In fact, if your outer loop (not shown in your example) wraps the creations of the Permutations object then you will (from memory) be paying for a list copy each time.
Cheers, Adrian





Congrats! Your permutation code outperformed all others I came across. Thank you!





My Anagram app for Windows Phone 7 is using this code (packed it in a Silverlight library)
http://www.windowsphone.com/enUS/apps/69d9ac0207fe48e08086cd9ee1884dd4[^]
Thanks a lot!
I packed the files in a Silverlight for Windows Phone class library (named Combinatorics) and modification I had to do was to
1) make the IMetaCollection interface public so that I could abstract the use of any combinatorics operation based on user selection
2) add CombinatoricsOp class to help with using any combinatorics operation via one entry point
As a bonus there is also a Combinatorics.cd there (class diagram)
You can download the library to use with your WP7 Silverlight apps (or repack source into other library project for other targets) from:
https://skydrive.live.com/redir.aspx?cid=2335bbef59b92c54&resid=2335BBEF59B92C54!1705&parid=2335BBEF59B92C54!386[^]
Please give due credit to original author of Combinatorics code (and myself if you use my additions too).
The code for CombinatoricsOp class I added is below. Just call GetCombinatoricsOp to get a Permutations, Combinations or Variations class instance (all implement IMetaCollection interface) and then use the enumerator (with extra method to get count of items without counting them onebyone) as described in this CodeProject article.
using System;
using System.Collections.Generic;
namespace Facet.Combinatorics
{
public static class CombinatoricsOp {
public enum CombinatoricsOpType
{
Permutations = 0,
PermutationsWithRepetition = 1,
Combinations = 2,
CombinationsWithRepetition = 3,
Variations = 4,
VariationsWithRepetition = 5
}
public static IMetaCollection<string> GetCombinatoricsOp(IList<string> values, CombinatoricsOpType opType, int take){
switch (opType)
{
case CombinatoricsOpType.Permutations:
return new Permutations<string>(values);
case CombinatoricsOpType.PermutationsWithRepetition:
return new Permutations<string>(values,GenerateOption.WithRepetition);
case CombinatoricsOpType.Combinations:
return new Combinations<string>(values, take);
case CombinatoricsOpType.CombinationsWithRepetition:
return new Combinations<string>(values, take, GenerateOption.WithRepetition);
case CombinatoricsOpType.Variations:
return new Variations<string>(values, take);
case CombinatoricsOpType.VariationsWithRepetition:
return new Variations<string>(values, take, GenerateOption.WithRepetition);
default:
throw new NotImplementedException();
}
}
}
}
Computer & Informatics Engineer
Microsoft MVP J# 20042010
Borland "Spirit of Delphi" 2001
QuickTime, QTVR, Delphi VCL,
ActiveX, COM, .NET, Robotics
http://zoomicon.com
http://zoomicon.wordpress.com
modified 2Dec11 11:16am.





George,
Thanks for letting me know that you've used this in a project. I'm glad it came in handy, I'll have to look up the app on my phone.
Cheers,
Adrian





Hi again Adrian,
since I'm building an update for the Anagram app (to make it available to more platforms like Win8, Android, iOS), I've separated the Combinatorics part (your code + my small additions) as a PCL (Portable Class Library) and a Windows Phone 8 library (not really needed because of the PCL, but will look into adding a WP7.x lib too with Visual Studio 2012, since at Visual Studio 2013 you can't create PCLs targeting WP7 at the moment, only WP8)
Microsoft didn't have option there for custom license or CPOL, so I selected MSPL (I sometimes also use MIT license which is very simple) which should be similar. Hope you're ok with it (see License tab at the link below)
the library is at
https://combinatorics.codeplex.com/[^]
I used Mercurial repository option at CodePlex (prefer using TortoiseHg in Windows Explorer and VisualHg plugin for Visual Studio than TortoiseGit and VS2013's Git support which I don't like much)
if you have (or want to make) a Codeplex account I can add you to project admins too if you tell me the user id you use there
Computer & Informatics Engineer
Microsoft MVP J# 20042010
Borland Spirit of Delphi 2001
QuickTime, QTVR, Delphi VCL,
ActiveX, COM, .NET, Robotics
http://zoomicon.com
http://zoomicon.wordpress.com





George,
A PCL version sounds like a good idea. I'm glad this is being put to good use.
My CodePlex account is aakison, I'd be happy to join and maybe even make some enhancements I've been meaning to do.
Cheers, Adrian





Hi again Adrian, glad you approve
added you to the project team on codeplex as 2nd coordinator
Computer & Informatics Engineer
Microsoft MVP J# 20042010
Borland Spirit of Delphi 2001
QuickTime, QTVR, Delphi VCL,
ActiveX, COM, .NET, Robotics
http://zoomicon.com
http://zoomicon.wordpress.com





very thorough treatment, both in math and in code





Hoping I can get a free probability lesson here and hoping I can use this code.
Here is what I need the code to do. I have any number of sets and each set has any number of values. I want to get every "combination" of these sets. For example:
Set A: {A.S1, A.S2}
Set B: {B.S1}
Set C: {C.S1, C.S2, C.S3};
I want to get every combination of Set A with Set B with Set C. The calculation would be 2 * 1 * 3 so I would end up with 6 possible outcomes and the result would be:
1. A.S1 B.S1 C.S1
2. A.S1 B.S1 C.S2
3. A.S1 B.S1 C.S3
4. A.S2 B.S1 C.S1
5. A.S3 B.S1 C.S2
6. A.S3 B.S1 C.S3
Can I do this with this code base?





The specific example that you provide doesn't really amount to a permutation problem. It seems that your solution would just be:
foreach(var a in setA) {
foreach(var b in setB) {
foreach(var c in setC) {
... {a, b, c};
}
}
}
I feel, however, that you may have oversimplified the question and that this answer will fall short.
Cheers,
Adrian





+5 Very useful, and really appreciate the full explanations, and background, and souces included.
thanks, Bill
"In the River of Delights, Panic has not failed me." Jorge Luis Borges





Hi,
I have used this excellent library to generate all combinations of a given set (of all sizes)
input set: a, b, c
output: a, b, c, ab, ac, bc, abc
This is not a problem.
However when the input set is big (say 40 items), then generating everything takes a rather long time.
Is it possible to let this run on more cores (everything now runs one 1 core and I have 8 in my machine)?
Thanks,
Ike





Interesting question, I hadn't thought about this before. When I wrote it I implemented 3 different algorithms and checked each for performance on a single core processor. So, I considered performance important but back then didn't even have a multicore processor.
Unfortunately, the algorithms that are well known for this use the content of the current permutation to assist with the generation of the next permutation. As such, they need to be calculated sequentially. Further, I implemented combinations and variations using the permutation, so this issue applies to all of the items.
If it were to be ported to a multithreaded implementation, there would be issues with getting information back. It would be expensive in terms of space to store all of the calculated values before returning them. So, the question is whether we want all the permutations in order, or just all the permutations. While having the permutations in a prescribed order is nice (and sometimes necessary), I'll assume that the need is not for them to be in order.
So, an idea springs to mind. I'll use permutations against {a, b, c, d, e, f, g, h} as my sample set which would have 8! = 40320 permutations.
First, If I were to take each value out as the first entry and then do the permutations of the remaining 7 I'd have:
a {b, c, d, e, f, g, h}
b {a, c, d, e, f, g, h}
...
h {a, b, c, d, e, f, g}
I could then run each of these 8 permutation problems simultaneously on 8 cores. Overhead should be small so we'd get close to 8X performance.
As a quick double check, I'm getting 8 results of 7! each out, which is 8 * 7! = 8 * 5040 = 40320.
Second, for combinations, there is an underlying permutation of values 1 and 0 to determine membership in a combination. For example a 8 choose 3 combination of {a, b, c, d, e, f, g, h} creates an underlying permutation without repetition on {1, 1, 1, 0, 0, 0, 0, 0}. For this we first compute 8 choose 1 (a smaller set) on the master thread and then spawn off smaller subsets to worker threads. The first subset would combine {a} with {b, c, d, e, f, g, h}, the second would combine {b} with {c, d, e, f, g, h} and so forth. Here, the worker threads would have varying amounts of work so the heavier laden threads would need to spawn off smaller computations again to get some form of load balancing.
The benefit of these mechanisms is that they could be overlaid on top of the current library instead of reimplementing the current library.
Hope this helps...





Hi Adrian,
Thanks for the info.
I ran a quick testprogram and just wanted to return back with the results.
Below you'll find the testcode.
"Normal": is the normal code I wrote (using only 1 core)
"Attempt1": is actually slower than "Normal"
"Attempt2": ran about 2 times faster than "Normal"
"Attempt3": ran about 3 times faster than "Normal"
Best regards,
Ike
var chars = "abcdefghijklmnopqrstuvwx".ToCharArray();
var count = chars.Length + 1;
// normal
for (int i = 0; i < count; ++i)
{
var combinations = new Combinatorics.Combinations<char>(chars, i);
foreach (var combination in combinations)
combination.ToString();
}
// attempt1
for (int j = 0; j < count; ++j)
{
var combinations = new Combinatorics.Combinations<char>(chars, j);
Parallel.ForEach(combinations, combination => combination.ToString());
}
// attempt2
Parallel.For(1, count, delegate(int k)
{
var combinations = new Combinatorics.Combinations<char>(chars, k);
Parallel.ForEach(combinations, combination => combination.ToString());
});
// attempt3
Parallel.For(1, count, delegate(int l)
{
var combinations = new Combinatorics.Combinations<char>(chars, l);
foreach (var combination in combinations)
{
combination.ToString();
}
});





Thanks for this article. I am currently implementing a solution for the Traveling Salesman Problem and while creating the code for the standard algorithm that needs all permutations of all cities a faced that this is not a trivial problem. Your article saved me a lot of time.





For my application I would like the following behaviour, doesn't seem to be able to do it.
Lets say I have {1, 2, 2}
I want all unique variations of this set to be output,
1 2 2
2 1 2
2 2 1
Currently I'm using variations (order important) but each 2 is always treated as an individual token, so am getting
1 2 2
1 2 2 < the 2's above have been swapped, but same variation as above.
2 1 2
2 1 2
etc
Of course I can perform duplicate detection outside of the library call but inside would be much more ideal.
Am I right to think this behaviour isn't supported?
thanks.





What you are looking for is a Permutation, not a Variation. If you pass {1, 2, 2} into permutation with the WithoutRepetitions options, you will get what you want.
int[] ints = { 1, 2, 2 };
Permutations<int> permutations = new Permutations<int>(ints, GenerateOption.WithoutRepetition);
foreach(IList<int> permutation in permutations) {
foreach(int element in permutation) {
}
}
Note that if you're using .NET 4 and the var keyword, the enumerator will not work properly for permutations as there is a bug in the code. I've fixed it and submitted to Code Project, but they haven't posted the changes.





Thanks for the prompt reply and yes I can see that satisfies my example.
But, silly me I provided a bad example
For my set I am retrieving for all lengths 1 to n and so using a loop for each length.
For 1 to n
get all variations for length n
process...
next n
As permuations don't allow a length parameter I'm using variations. I just tested for the last length n and it does give the required resultset as you said it would.
I'm trying to think of a way that uses a double call, one to get the < n subsets and then to permutate those. But variations are a form of that, so not sure it's so easy.
Adrian, is there a special reason permutations don't allow length to be specified as the others do? I suppose the algorithm is simplified without it, but formally speaking should a permutation always be of set length n?
(BTW did get the var error, IList<t> works)
Update:
This is the offending line here I believe in the class description of the Variations class
"The equality of multiple inputs is not considered when generating variations."
So the following
var variations = new Facet.Combinatorics.Variations<int>(new List<int> {1, 2, 2}, 2, Facet.Combinatorics.GenerateOption.WithoutRepetition);
generates
1 2 < 1 with the first 2
1 2 < 1 with the second 2
2 1
2 1
2 2
2 2





Ok, I get your general problem. I think what you're looking for is to first get the combinations of your set with the appropriate lower index. This will get all the possible subsets that you're looking for. In your example, you'd want the combinations without repetition. Then for each of those subsets, generate all of the permutations, again without repetitions.
var combinations = new Combinations<int>(new List<int> {1, 2, 2}, 2, GenerateOption.WithoutRepetition);
foreach(var combination in combinations) {
var permutations = new Permutations<int>(combination, GenerateOption.WithoutRepetition);
foreach(var permutation in permutations) {
}
}
I don't have Visual Studio handy, so the above is just a guess. (Also, please remember the problem with permutations that the CodeProject is not updating. Make sure that the generic enumerator is public and the nongeneric is explicit, and it will work.)
Hope this helps,
Adrian





Thanks for replying Adrian.
Unfortunately that still doesn't work.
The line,
var combinations = new Combinations<int>(new List<int> {1, 2, 2}, 2, GenerateOption.WithoutRepetition);
will generate (1, 2) and (2, 1) twice each, one for each 2 in the set.
The WithRepetition option will allow the 1 to be used twice (1, 1), and WithoutRepetition is suppressing that behaviour but not suppressing the duplicate combinations resulting from duplicate members in the original set.
I think what I want which can be summed as "unique variations where duplicate members exist" just isn't supported. It's not such a big deal and I'm handling the duplicates outside.
I spent a bit of time looking for that particular algorithm and didn't come up with much, I imagine that's because it's not easy to do efficiently (without needing to match against previous results e.g. hash tracking).
modified on Sunday, April 3, 2011 9:53 AM





Very useful and elegantly written.






Hi,
it is possible to let Combinations (WithoutRepetitions) to begin from a known combination set?
So if I have 0,1,2,4,5,6,7,8,9,10 with a lower index of 4
to begin from 0,3,6,9 and continue generation from this point?
thank you so much in advance.
very good job.





This is truly excellent it's just what I have been looking for, I had almost given up on finding a good solution for Variations. Thanks for posting this.





I have an old program with old code that I need to update to C# and here is the answer on how to replace the section of code written long ago by the more mathmatical partner I had then.
Thanks






Very useful algorithm, it's much appreciated!





Excellent article Adrian, this is exactly what I needed!
If I may just make a small suggestion: in Permutations<T> class, instead of...
public virtual IEnumerator GetEnumerator() {
return new Enumerator(this);
}
IEnumerator<IList<T>> IEnumerable<IList<T>>.GetEnumerator() {
return new Enumerator(this);
}
...perhaps you should do something like this...
IEnumerator IEnumerable.GetEnumerator() {
return GetEnumerator();
}
public IEnumerator<IList<T>> GetEnumerator() {
return new Enumerator(this);
}
This would enable the client to, instead of writing...
foreach (IList<T> p in new Permutations<T>(inputSet)) {
}
...simply write...
foreach (var p in new Permutations<T>(inputSet)) {
}
...and would also (slightly) improve the performance (no more casting between object and IList<T> ).







General News Suggestion Question Bug Answer Joke Praise Rant Admin Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

