Improving MATLAB Matrix Construction Code : Or, co

2020-02-04 21:27发布

问题:

I wrote a program in order to construct a portion of a 3-Band Wavelet Transform Matrix. However, given that the size of the matrix is 3^9 X 3^10, it takes a while for MATLAB to finish constructing it. Therefore, I was wondering if there was a way to improve the code I am using to make it run faster. I am using n=10 when running the code.

B=zeros(3^(n-1),3^n);
v=[-0.117377016134830 0.54433105395181 -0.0187057473531300 -0.699119564792890 -0.136082763487960 0.426954037816980 ];

for j=1:3^(n-1)-1 
    for k=1:3^n;
        if k>6+3*(j-1) || k<=3*(j-1)
            B(j,k)=0;
        else 
            B(j,k)=v(k-3*(j-1));
        end                
    end
end
j=3^(n-1);
    for k=1:3^n
        if k<=3
            B(j,k)=v(k+3);
        elseif k<=3^n-3
            B(j,k)=0;
        else 
            B(j,k)=v(k-3*(j-1));
        end
    end

W=B;

回答1:

How to vectorize without knowing how to vectorize:

First, I'll only discuss vectorizing the first double loop, you can follow the same logic for the second loop.

I tried to show a thought process from scratch, so though the final answer is just 2 lines long, it is worthwhile to see how a beginner can try to obtain it.

First, I recommend "massaging" the code in simple cases, to get a feel for it. For example, I've used n=3 and v=1:6 and run the first loop once, this is how B looks like:

[N M]=size(B)
N =
     9
M =
    27

imagesc(B); 

So you can see we get a staircase like matrix, which is pretty regular! All we need is to assign the right matrix index to the right value of v and that's it.

There are many ways to achieve that, some more elegant than others. One of the simplest ways is to use the function find:

pos=[find(B==v(1)),find(B==v(2)),find(B==v(3)),...
     find(B==v(4)),find(B==v(5)),find(B==v(6))]

pos =
     1    10    19    28    37    46
    29    38    47    56    65    74
    57    66    75    84    93   102
    85    94   103   112   121   130
   113   122   131   140   149   158
   141   150   159   168   177   186
   169   178   187   196   205   214
   197   206   215   224   233   242

The values above are the linear indices of matrix B where values of v were found. Each column represents the linear index of a specific value of v in B. For example, indices [1 29 57 ...] all contain the value v(1), etc... Each row contains v entirely, so, indices [29 38 47 56 65 74] contain v=[v(1) v(2) ... v(6)]. You can notice that for each row the difference between indices is 9, or, each index is separated with a step size N, and there are 6 of them, which is just the length of vector v (also obtained by numel(v)). For the columns, the difference between adjacent elements is 28, or, the step size is M+1.

We shall simply need to assign the values of v in the proper indices according to this logic. one way is to write each "row":

B([1:N:numel(v)*N]+(M+1)*0)=v;
B([1:N:numel(v)*N]+(M+1)*1)=v;
...
B([1:N:numel(v)*N]+(M+1)*(N-2))=v;

But that is impractical for big N-2, so you can do it in a for loop if you really want:

for kk=0:N-2;
     B([1:N:numel(v)*N]+(M+1)*kk)=v;
end

Matlab offers a more efficient way to get all the indices at once using bsxfun (this replaces the for loop), for example:

ind=bsxfun(@plus,1:N:N*numel(v),[0:(M+1):M*(N-1)+1]')

so now we can use ind to assign v into the matrix N-1 times. For that we need to "flatten" ind into a row vector:

ind=reshape(ind.',1,[]);

and concatenate v to itself N-1 times (or make N-1 more copies of v):

vec=repmat(v,[1 N-1]);

finally we get the answer:

B(ind)=vec;

Making a long story short, and writing it all compactly, we get a 2 line solution (given size B is already known: [N M]=size(B)) :


ind=bsxfun(@plus,1:N:N*numel(v),[0:(M+1):M*(N-1)+1]');
B(reshape(ind.',1,[]))=repmat(v,[1 N-1]);

For n=9 the vectorized code is ~850 faster in my machine. (small n will be less significant)

Since the matrix obtained is mostly made up from zeros, you don't need to assign these to a full matrix, instead use a sparse matrix, Here's the full code for that (pretty similar):

N=3^(n-1);
M=3^n;
S=sparse([],[],[],N,M);
ind=bsxfun(@plus,1:N:N*numel(v),[0:(M+1):M*(N-1)+1]');
S(reshape(ind.',1,[]))=repmat(v,[1 N-1]);

For n=10 I could only run the sparse matrix code (out of memory otherwise), and in my machine it took ~6 seconds.

Now try to apply this to the second loop...



回答2:

While your matrix has huge dimensions it's also very "sparse", which means that most of it's elements are zeros. To improve performance you can make use of MATLAB's sparse matrix support by ensuring that you only operate on the non-zero parts of your matrix.

Sparse matrices in MATLAB can be built efficiently by constructing the coordinate form of the sparse matrix. This means that three arrays, defining the row, column and value for each non-zero entry in the matrix must be defined. This means that rather than assigning values via the conventional A(i,j) = x syntax, we would instead append that non-zero entry onto our sparse indexing structure:

row(pos+1) = i;
col(pos+1) = j;
val(pos+1) = x;
% pos is the current position within the sparse indexing arrays!

Once we have the full set of non-zeros in our sparse indexing arrays, we can use the sparse command to build the matrix.

For this problem, we add a maximum of six non-zero entries for each row, allowing us to allocate the sparse indexing arrays in advance. The variable pos keeps track of our current position in the indexing arrays.

rows = 3^(n-1);
cols = 3^(n+0);

% setup the sparse indexing arrays for non-
% zero elements of matrix B
row = zeros(rows*6,1);
col = zeros(rows*6,1);
val = zeros(rows*6,1);
pos = +0;

We can now build the matrix by adding any non-zero entries to the sparse indexing arrays. Since we only care about non-zero entries, we only loop over the non-zero parts of the matrix as well.

I've left the logic for the last row for you to fill in!

for j = 1 : 3^(n-1)
    if (j < 3^(n-1))

% add entries for a general row
    for k = max(1,3*(j-1)+1) : min(3^n,3*(j-1)+6)             
        pos = pos+1;
        row(pos) = j;
        col(pos) = k;
        val(pos) = v(k-3*(j-1));                
    end

    else

% add entries for final row - todo!!

    end
end

Since we don't add six non-zeros for every row, we may have over-allocated the sparse indexing arrays, so we trim them down to the size that's actually used.

% only keep the sparse indexing that we've used
row = row(1:pos);
col = col(1:pos);
val = val(1:pos);

The final matrix can now be built using the sparse command.

% build the actual sparse matrix
B = sparse(row,col,val,rows,cols);

The code can be run by collating the snippets above. For n = 9 we get the following results (for comparison, I've also included results for the bsxfun based approach suggested by natan):

Elapsed time is 2.770617 seconds. (original)
Elapsed time is 0.006305 seconds. (sparse indexing)
Elapsed time is 0.261078 seconds. (bsxfun)

The original code runs out of memory for n = 10, though both sparse approaches are still usable:

Elapsed time is 0.019846 seconds. (sparse indexing)
Elapsed time is 2.133946 seconds. (bsxfun)


回答3:

You can use a cunning way to create a block diagonal matrix like this:

>> v=[-0.117377016134830 0.54433105395181 -0.0187057473531300 ...
          -0.699119564792890 -0.136082763487960 0.426954037816980];
>> lendiff=length(v)-3;
>> B=repmat([v zeros(1,3^n-lendiff)],3^(n-1),1);
>> B=reshape(B',3^(n),3^(n-1)+1);
>> B(:,end-1)=B(:,end-1)+B(:,end);
>> B=B(:,1:end-1)';

Here, lendiff is used to create 3^{n-1} copies of a line with v followed by zeros, that have length 3^n+3, so a matrix of size [3^{n-1} 3^n+3].

That matrix is reshaped into size [3^n 3^{n-1}+1] to create the shifts. The extra column needs to be added to the last and B needs to be transposed.

Should be much faster though.

EDIT

Seeing Darren's solution and realising that reshape works on sparse matrices too, got me to come up with this -- without for loops (un-coded the original solution).

First the values to start with:

>> v=[-0.117377016134830  ...
       0.54433105395181   ...
      -0.0187057473531300 ...
      -0.699119564792890  ...
      -0.136082763487960  ...
       0.426954037816980];    
>> rows = 3^(n-1);                  % same number of rows
>> cols = 3^(n)+3;                  % add 3 cols to implement the shifts    

Then make the matrix with 3 extra columns per row

>> row=(1:rows)'*ones(1,length(v)); % row number where each copy of v is stored'
>> col=ones(rows,1)*(1:length(v));  % place v at the start columns of each row
>> val=ones(rows,1)*v;              % fill in the values of v at those positions
>> B=sparse(row,col,val,rows,cols); % make the matrix B[rows cols+3], but now sparse

Then reshape to implement the shifts (extra row, right number of columns)

>> B=reshape(B',3^(n),rows+1);      % reshape into B[3^n rows+1], shifted v per row'
>> B(1:3,end-1)=B(1:3,end);         % the extra column contains last 3 values of v
>> B=B(:,1:end-1)';                 % delete extra column after copying, transpose

For n=4,5,6,7 this results in cpu times in s:

n    original    new version
4    0.033       0.000
5    0.206       0.000
6    1.906       0.000
7    16.311      0.000

measured by the profiler. For the original version I cannot run n>7 but the new version gives

n    new version
8    0.002
9    0.009
10   0.022
11   0.062
12   0.187
13   0.540
14   1.529
15   4.210

and that is how far my RAM goes :)