Generate and handle 3 billion elements matrix

2019-07-30 04:33发布

问题:

I need to generate the complete set of combinations obtained when choosing any 8 numbers from a vector of 25 elements. This can be done by using the Combinator function but is very slow for very large matrix and is becoming RAM consuming. Can anyone suggest a clever way to generate this?

Example: using "sample 3 from vector of 4" would yield the following result (H=3 and L=4):

1 1 1
1 1 2
1 1 3
1 1 4
1 2 2
1 2 3
1 2 4
1 3 3
1 3 4
1 4 4
2 2 2
2 2 3
2 2 4
2 3 3
2 3 4
2 4 4
3 3 3
3 3 4
3 4 4
4 4 4

The following code is what I am using now to generate the matrix.

livello = 10:10:250;
H = 5;
L = length(livello);
a = combnk(1:H+L-1, H);
b = cumsum([a(:,1)  diff(a,[],2) - 1],2);
PianoSperimentale = livello(b);

Unfortunately, with more complex cases such as L=74, H=8 and livello = 10:10:750 the elements number of the PianoSperimentale matrix rise quickly following the expression (L+H-1)!/(H!(L-1)!). As said before, the elements of the PianoSperimentale are roughly 3*10^10. Is there any clever way to generate and handle this huge matrix other than buy a computer with 256Gb of Ram :)?

EDIT: Complete Code

clear; close all; clc;

s = RandStream('mt19937ar','Seed',1); RandStream.setGlobalStream(s);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                       VARIABLES DECLARATION                           %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

H = 5;

lover_bound = 10; upper_bound = 250; steps = 20;
livello = lover_bound:steps:upper_bound;

PesoTarget = 250;
LimiteInferiore = PesoTarget - 9;

IncertezzaCella = 0.5 * 10^(-6);

%SIMULATIONS VARIABLES
Alfa = 0.123;
NumeroProve = 3000;

% OBJECTIVE FUNCTION COST COEFFICIENTS
C_u = 0.03; c_p = 0.6; c_f = 734; c_l = 3.2;


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                         GENERATE SOLUTIONS                            %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

L = length(livello); livello = sort(livello,'descend');

a = combnk(1:H+L-1, H);
b = cumsum([a(:,1)  diff(a,[],2) - 1],2);
PianoSperimentale = livello(b);
NumeroEsperimenti = size(PianoSperimentale,1);

MatriceCombinazioni = combn([0 1],H);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                          ADD UNCERTAINTY                              %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

PesoCestelli = randn(NumeroEsperimenti,H)*Alfa;
PesoCestelli = (1+PesoCestelli).*PianoSperimentale;
idx = PesoCestelli<0;
random = randn(sum(idx(:)), 1);
PesoCestelli(idx) = random.*((1+Alfa).*PianoSperimentale(idx));

%ADD ERROR (ERROR = IncertezzaCella)
Incertezza = randn(NumeroEsperimenti,H)*IncertezzaCella;
PesoIncertezza = abs((1+Incertezza).*PesoCestelli);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%                             SIMULATION                                %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


for i = 1:NumeroProve
    W = PesoCestelli * MatriceCombinazioni';
    W_Inc = PesoIncertezza * MatriceCombinazioni';

    [~,comb] = min(abs(W_Inc-PesoTarget),[],2);

    W_migl = W(sub2ind(size(W),1:size(W,1),comb'))';

    W_tot_migl(:,i) = W_migl;


    PesoCestelli_2 = randn(NumeroEsperimenti,H)*Alfa;
    PesoCestelli_2 = (1+PesoCestelli_2).*PianoSperimentale;
    idx = PesoCestelli_2<0;
    random = randn(sum(idx(:)), 1);
    PesoCestelli_2(idx) = random.*((1+Alfa).*PianoSperimentale(idx));

    %ADD ERROR (ERROR = IncertezzaCella)
    Incertezza = randn(NumeroEsperimenti,H)*IncertezzaCella;
    PesoIncertezza_2 = abs((1+Incertezza).*PesoCestelli_2);


    PesoCestelli(MatriceCombinazioni(comb,:)~=0) = ...
        PesoCestelli_2(MatriceCombinazioni(comb,:)~=0); 
    PesoIncertezza(MatriceCombinazioni(comb,:)~=0) = ...
        PesoIncertezza_2(MatriceCombinazioni(comb,:)~=0); 

    clear PesoIncertezza2
end

n_b = sum(logical(W_tot_migl>=LimiteInferiore),2);

Media_b = sum(W_tot_migl.*(W_tot_migl>LimiteInferiore),2)./...
    sum(W_tot_migl>LimiteInferiore,2);
ValoreFunzioneObiettivo = C_u .* Media_b + (NumeroProve./n_b) .* c_p + ...
    (c_f./n_b) + ((NumeroProve - n_b)./n_b) .* c_l;

The output is ObjectiveFunctionValue, mean and stddev of W_tot_migl and the PianoSperimentale row that gave the minimum of ObjectiveFunctionValue.