Memory efficient power set algorithm

2019-01-12 01:12发布

问题:

Trying to calculate all the subsets (power set) of the 9-letter string 'ABCDEFGHI'.

Using standard recursive methods, my machine hits out of memory (1GB) error before completing. I have no more physical memory.

How can this be done better? Language is no issue and results sent to the standard output is fine as well - it does not need to be held all in memory before outputting.

回答1:

There is a trivial bijective mapping from the power set of X = {A,B,C,D,E,F,G,H,I} to the set of numbers between 0 and 2^|X| = 2^9:

Ø maps to 000000000 (base 2)

{A} maps to 100000000 (base 2)

{B} maps to 010000000 (base 2)

{C} maps to 001000000 (base 2)

...

{I} maps to 000000001 (base 2)

{A,B} maps to 110000000 (base 2)

{A,C} maps to 101000000 (base 2)

...

{A,B,C,D,E,F,G,H,I} maps to 111111111 (base 2)

You can use this observation to create the power set like this (pseudo-code):

Set powerset = new Set();
for(int i between 0 and 2^9)
{
  Set subset = new Set();
  for each enabled bit in i add the corresponding letter to subset
  add subset to powerset
}

In this way you avoid any recursion (and, depending on what you need the powerset for, you may even be able to "generate" the powerset without allocating many data structures - for example, if you just need to print out the power set).



回答2:

I would use divide and conquer for this:

Set powerSet(Set set) {
  return merge(powerSet(Set leftHalf), powerSet(Set rightHalf));
}

merge(Set leftHalf, Set rightHalf) {
  return union(leftHalf, rightHalf, allPairwiseCombinations(leftHalf, rightHalf));
}

That way, you immediately see that the number of solutions is 2^|originalSet| - that's why it is called the "power set". In your case, this would be 2^9, so there should not be an out of memory error on a 1GB machine. I guess you have some error in your algorithm.



回答3:

a little scheme solution

(define (power_set_iter set)
  (let loop ((res '(())) 
             (s    set ))
    (if (empty? s)
        res
        (loop (append (map (lambda (i) 
                             (cons (car s) i)) 
                           res) 
                      res) 
              (cdr s)))))

Or in R5RS Scheme, more space efficient version

(define (pset s)
  (do ((r '(()))
       (s s (cdr s)))
      ((null? s) r)
    (for-each 
      (lambda (i) 
        (set! r (cons (cons (car s) i) 
                      r)))
      (reverse r))))


回答4:

I verified that this worked well:

#include <iostream>

void print_combination(char* str, char* buffer, int len, int num, int pos)
{
  if(num == 0)
  {
    std::cout << buffer << std::endl;
    return;
  }

  for(int i = pos; i < len - num + 1; ++i)
  {
    buffer[num - 1] = str[i];
    print_combination(str, buffer, len, num - 1, i + 1);
  }
}

int main()
{
  char str[] = "ABCDEFGHI";
  char buffer[10];
  for(auto i = 1u; i <= sizeof(str); ++i)
  {
    buffer[i] = '\0';
    print_combination(str, buffer, 9, i, 0);
  }
}


回答5:

How about this elegant solution? Extend the code which generates nCr to iterate for r=1 till n!

#include<iostream>
using namespace std;

void printArray(int arr[],int s,int n)
{
    cout<<endl;
    for(int i=s ; i<=n-1 ; i++)
        cout<<arr[i]<<" ";
}

void combinateUtil(int arr[],int n,int i,int temp[],int r,int index)
{
    // What if n=5 and r=5, then we have to just print it and "return"!
    // Thus, have this base case first!
    if(index==r)
    {
        printArray(temp,0,r);
        return;
    }  

    // If i exceeds n, then there is no poin in recurring! Thus, return
    if(i>=n)
        return;


    temp[index]=arr[i];
    combinateUtil(arr,n,i+1,temp,r,index+1);
    combinateUtil(arr,n,i+1,temp,r,index);

}

void printCombinations(int arr[],int n)
{
    for(int r=1 ; r<=n ; r++)
    {
        int *temp = new int[r];
        combinateUtil(arr,n,0,temp,r,0);
    }
}



int main()
{
    int arr[] = {1,2,3,4,5};
    int n = sizeof(arr)/sizeof(arr[0]);

    printCombinations(arr,n);

    cin.get();
    cin.get();
    return 0;
}

The Output :

1 
2 
3 
4 
5 
1 2 
1 3 
1 4 
1 5 
2 3 
2 4 
2 5 
3 4 
3 5 
4 5 
1 2 3 
1 2 4 
1 2 5 
1 3 4 
1 3 5 
1 4 5 
2 3 4 
2 3 5 
2 4 5 
3 4 5 
1 2 3 4 
1 2 3 5 
1 2 4 5 
1 3 4 5 
2 3 4 5 
1 2 3 4 5