I am trying to solve a classic Knapsack problem with a huge capacity of 30,000,000. It works well up to 20,000,000, but then it runs out of memory:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
I have tried dividing all values and the capacity by 1,000,000, but that generates floats and I don't think that is the correct approach. I have also tried making the arrays and the matrix of type long, but that does not help. Perhaps another data structure? Any pointers welcome...
Code:
public class Knapsack {
    public static void main(String[] args) {
        int N = Integer.parseInt(args[0]); // number of items
        int W = Integer.parseInt(args[1]); // maximum weight of knapsack
        int[] profit = new int[N+1];
        int[] weight = new int[N+1];
        // generate random instance, items 1..N
        for (int n = 1; n <= N; n++) {
            profit[n] = (int) (Math.random() * 1000000);
            weight[n] = (int) (Math.random() * W);
        }
        // opt[n][w] = max profit of packing items 1..n with weight limit w
        // sol[n][w] = does opt solution to pack items 1..n with weight limit w include item n?
        int[][] opt = new int[N+1][W+1];
        boolean[][] sol = new boolean[N+1][W+1];
        for (int n = 1; n <= N; n++) {
            for (int w = 1; w <= W; w++) {
                // don't take item n
                int option1 = opt[n-1][w];
                // take item n
                int option2 = Integer.MIN_VALUE;
                if (weight[n] <= w) option2 = profit[n] + opt[n-1][w-weight[n]];
                // select better of two options
                opt[n][w] = Math.max(option1, option2);
                sol[n][w] = (option2 > option1);
            }
        }
        // determine which items to take
        boolean[] take = new boolean[N+1];
        for (int n = N, w = W; n > 0; n--) {
            if (sol[n][w]) { take[n] = true; w = w - weight[n]; }
            else { take[n] = false; }
        }
        // print results
        System.out.println("item" + "\t" + "profit" + "\t" + "weight" + "\t" + "take");
        for (int n = 1; n <= N; n++) {
            System.out.println(n + "\t" + profit[n] + "\t" + weight[n] + "\t" + take[n]);
        }
        // Copyright © 2000–2011, Robert Sedgewick and Kevin Wayne. Last updated: Wed Feb 9 09:20:16 EST 2011.
    }
}
Here are a couple of tricks I've used for things like that.
First, a variant of a sparse matrix. It's not really sparse, but instead of assuming that "non-stored entries" are zero, you assume they're the same as the entry before. This can work in either direction (in the direction of the capacity or in the direction of the items), afaik not (easily) in both directions at the same time. Good trick, but doesn't defeat instances that are huge in both directions.
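One possible reading of that trick, in the direction of the capacity: a DP row never decreases as w grows, so it is enough to store the points where the value actually changes. The sketch below is my own illustration, not the answerer's code; the class and method names are made up, items are 0-indexed, and it only returns the optimal value (no item list). The number of stored points can still grow to min(2^N, W+1) in the worst case, so this helps mostly when there are few items relative to the capacity.

```java
import java.util.ArrayList;
import java.util.List;

class SparseKnapsack {
    // Returns the best profit for capacity W. A row is stored as breakpoints
    // {weight, profit}: every capacity between two breakpoints has the same
    // value as the breakpoint before it.
    static long maxProfit(int[] weight, int[] profit, int W) {
        List<long[]> row = new ArrayList<>();
        row.add(new long[] {0, 0}); // no items: profit 0 for every capacity
        for (int n = 0; n < weight.length; n++) {
            List<long[]> next = new ArrayList<>();
            int i = 0;      // walks the old row (item n not taken)
            int j = 0;      // walks the old row shifted by item n (item n taken)
            long best = -1; // profit of the last breakpoint kept
            while (true) {
                boolean hasA = i < row.size();
                boolean hasB = j < row.size() && row.get(j)[0] + weight[n] <= W;
                if (!hasA && !hasB) break;
                long aw = hasA ? row.get(i)[0] : 0;
                long ap = hasA ? row.get(i)[1] : 0;
                long bw = hasB ? row.get(j)[0] + weight[n] : 0;
                long bp = hasB ? row.get(j)[1] + profit[n] : 0;
                // Merge by weight; on equal weight, look at the higher profit first.
                boolean takeA = hasA && (!hasB || aw < bw || (aw == bw && ap >= bp));
                long cw = takeA ? aw : bw;
                long cp = takeA ? ap : bp;
                if (takeA) i++; else j++;
                // Keep a point only if it improves on everything lighter.
                if (cp > best) {
                    next.add(new long[] {cw, cp});
                    best = cp;
                }
            }
            row = next;
        }
        return row.get(row.size() - 1)[1]; // last breakpoint holds the optimum
    }

    public static void main(String[] args) {
        int[] weight = {10_000_000, 20_000_000, 15_000_000};
        int[] profit = {5, 9, 7};
        System.out.println(maxProfit(weight, profit, 30_000_000));
    }
}
```

Memory here depends on the number of breakpoints rather than on W, which is why a capacity of 30,000,000 is not a problem by itself.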
Secondly, a combination of Dynamic Programming and Branch & Bound. First, use DP with only the "last two rows". That gives you the value of the optimal solution. Then use Branch & Bound to find the subset of items that corresponds to the optimal solution. Sort by value/weight, apply the relaxation value[next_item] * (capacity_left / weight[next_item]) to bound with. Knowing the optimal value ahead of time makes pruning very effective.
The "last two rows" refers to the "previous row" (a slice of the tableau that has the solutions for all items up to i) and the "current row" (that you're filling right now). It could look something like this, for example (this is C# btw, but should be easy to port):
Break your for loops down into method calls.
This will have the effect of making the local variables GC'able once the method itself has completed.
So instead of nesting the for loops within the same main method, call a method with the same functionality, which then calls a second method. You are effectively breaking the code up into small packets of local variables, which can be collected when they go out of scope.
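For reference, the "last two rows" DP plus Branch & Bound combination from the first answer could be sketched in Java roughly as follows. This is not the answerer's C# code; the class and method names are made up, items are 0-indexed, and it assumes every weight is positive (the bound divides by the weight). The two int rows need about 8 * (W+1) bytes instead of N+1 full rows, and the bound is the relaxation value[next_item] * (capacity_left / weight[next_item]), rounded down because profits are integers.

```java
import java.util.Arrays;

class TwoRowKnapsack {
    // Step 1: DP that keeps only the previous and the current row.
    static int optimalValue(int[] weight, int[] profit, int W) {
        int[] prev = new int[W + 1]; // solutions for the items seen so far
        int[] cur = new int[W + 1];  // row being filled right now
        for (int n = 0; n < weight.length; n++) {
            for (int w = 0; w <= W; w++) {
                cur[w] = prev[w]; // don't take item n
                if (weight[n] <= w) { // take item n
                    cur[w] = Math.max(cur[w], profit[n] + prev[w - weight[n]]);
                }
            }
            int[] tmp = prev; prev = cur; cur = tmp; // swap the two rows
        }
        return prev[W];
    }

    // Step 2: depth-first Branch & Bound that looks for a subset whose
    // profit equals the value already known from step 1.
    static boolean[] solve(int[] weight, int[] profit, int W) {
        int target = optimalValue(weight, profit, W);
        Integer[] order = new Integer[weight.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // Sort by profit/weight, best ratio first (cross-multiplied to stay in integers).
        Arrays.sort(order, (a, b) ->
            Long.compare((long) profit[b] * weight[a], (long) profit[a] * weight[b]));
        boolean[] take = new boolean[weight.length];
        if (!search(0, W, target, order, weight, profit, take)) {
            throw new IllegalStateException("no subset reaches the DP value");
        }
        return take;
    }

    // need = profit still missing to reach the optimal value.
    private static boolean search(int k, int capLeft, int need, Integer[] order,
                                  int[] weight, int[] profit, boolean[] take) {
        if (need <= 0) return true;           // optimal value reached
        if (k == order.length) return false;  // out of items
        int item = order[k];
        // Relaxation: fill all remaining capacity at the best remaining ratio.
        long bound = (long) profit[item] * capLeft / weight[item];
        if (bound < need) return false;       // cannot reach the optimum from here
        if (weight[item] <= capLeft) {
            take[item] = true;
            if (search(k + 1, capLeft - weight[item], need - profit[item],
                       order, weight, profit, take)) return true;
            take[item] = false;
        }
        return search(k + 1, capLeft, need, order, weight, profit, take);
    }
}
```

Calling TwoRowKnapsack.solve(weight, profit, W) returns a take array like the one in the question. The search is still exponential in the worst case; it relies on the known target to prune early.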
Use a recursive method to solve the problem. See http://penguin.ewu.edu/~trolfe/Knapsack01/Knapsack01.html for further information.
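A minimal sketch of what a recursive solution might look like is below. It is not taken from the linked page; the class name, the 0-indexed items and the HashMap memo are my own assumptions. The memo is added so that each (item, remaining capacity) pair is computed only once, and only pairs that are actually reachable get stored, so memory no longer depends directly on W.

```java
import java.util.HashMap;
import java.util.Map;

class RecursiveKnapsack {
    private final int[] weight;
    private final int[] profit;
    // Cache of already solved subproblems, keyed by (item index, capacity left).
    private final Map<Long, Long> memo = new HashMap<>();

    RecursiveKnapsack(int[] weight, int[] profit) {
        this.weight = weight;
        this.profit = profit;
    }

    // Best profit obtainable from items n..end with capacity w left.
    long best(int n, int w) {
        if (n == weight.length) return 0; // no items left
        long key = ((long) n << 32) | w;  // pack both ints into one long
        Long cached = memo.get(key);
        if (cached != null) return cached;
        long result = best(n + 1, w);     // don't take item n
        if (weight[n] <= w) {             // take item n
            result = Math.max(result, profit[n] + best(n + 1, w - weight[n]));
        }
        memo.put(key, result);
        return result;
    }

    public static void main(String[] args) {
        int[] weight = {10_000_000, 20_000_000, 15_000_000};
        int[] profit = {5, 9, 7};
        System.out.println(new RecursiveKnapsack(weight, profit).best(0, 30_000_000));
    }
}
```

The recursion depth equals the number of items, so a very large N may need a bigger thread stack or an iterative rewrite.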
Hope it will be of help.