14,240,732 members

Generating Subsets

Rate this:
23 Sep 2018CPOL
How to generate subsets

Every once in a while, I am being challenged with problems involving optimal subsets from a given set of objects. It happened again a few weeks ago. The problem was of a knapsack style (finding the most profitable subset of items of bounded total size) and ... of course, I forgot how to tackle it using dynamic programming. But, many years ago, I learned a very simple (to memorise) algorithm which saved the day and I will describe it in this article.

The Algorithm Design Manual, page 453, defines it as binary counting. The idea behind it is the one to one mapping between the numbers 0 and 2n-1 (total number of numbers is 2n) and binary strings of length n (total number of strings is 2n) where bit 1 at the index i means "consider the object at the index i in the original set for the given subset". We also know that the total number of subsets of a set of size n is 2n. We have 2n everywhere, awesome!

Let's see it in action by looking at a simple example, a small set {A, B, C} with n=3. We are not interested in the empty set, thus we will ignore the 0-th iteration, leaving us with a total of 23-1=7 subsets:

 Iteration Binary string Indexes of the objects to consider Subset 1 001 1 {A} 2 010 2 {B} 3 011 1 and 2 {A, B} 4 100 3 {C} 5 101 1 and 3 {A, C} 6 110 2 and 3 {B, C} 7 111 1, 2 and 3 {A, B, C}

I need to stress the fact that it's an exponential (or O(2n)) complexity algorithm. It is very inefficient for big sets. Here is a code example:

```import java.util.LinkedHashSet;
import java.util.Set;

public class AllSubSets {
public static void main(String args[]) {
String[] setValues = new String[] { "A", "B", "C", "D", "E",
"F", "G", "H", "I", "J" };

long limit = 1 << setValues.length; // this is 2^length

for (long l = 1; l < limit; l++) {
for (int i = 0; i < setValues.length; i++) {
if ((l & (1 << i)) > 0) {
}
}
subSet.forEach(System.out::print);
System.out.println();
}
}
}```

and the result:

```A
B
AB
C
AC
BC
ABC
D
... (trimmed due to the size) ...
CDEFGHIJ
ACDEFGHIJ
BCDEFGHIJ
ABCDEFGHIJ```

A careful reader will almost immediately reply that this version is limited to sets with maximum 64 objects (with Java 8, long is limited to 64 bits and can be treated as unsigned). BigInteger, which supports bitwise operations, addresses this limitation. And yes, the extra subset construction loop makes it an O(n2n) algorithm, it's still exponential though.

However, as it was mentioned earlier, this is not an efficient algorithm. Anything bigger than 64 objects will produce more than 264 subsets and this is a big number. Even 232 is pretty decent in size, so don't use this algorithm for big sets.

Share

 Software Developer (Senior) Standard Chartered United Kingdom
My name is Ruslan Ciurca. Currently I am Software Engineer at Standard Chartered/.

 First Prev Next
 You can use duals to speed up your subset generation! losvce24-Sep-18 12:14 losvce 24-Sep-18 12:14
 Re: You can use duals to speed up your subset generation! rtybase26-Sep-18 0:13 rtybase 26-Sep-18 0:13
 Re: You can use duals to speed up your subset generation! losvce26-Sep-18 4:58 losvce 26-Sep-18 4:58
 Re: You can use duals to speed up your subset generation! rtybase26-Sep-18 21:12 rtybase 26-Sep-18 21:12
 Last Visit: 19-Jul-19 15:00     Last Update: 19-Jul-19 15:00 Refresh 1