MATLAB: Block matrix multiplying without loops

2019-05-20 04:59发布

问题:

I have a block matrix [A B C...] and a matrix D (all 2-dimensional). D has dimensions y-by-y, and A, B, C, etc are each z-by-y. Basically, what I want to compute is the matrix [D*(A'); D*(B'); D*(C');...], where X' refers to the transpose of X. However, I want to accomplish this without loops for speed considerations.

I have been playing with the reshape command for several hours now, and I know how to use it in other cases, but this use case is different from the other ones and I cannot figure it out. I also would like to avoid using multi-dimensional matrices if at all possible.

回答1:

Honestly, a loop is probably the best way to do it. In my image-processing work I found a well-written loop that takes advantage of Matlab's JIT compiler is often faster than all the extra overhead of manipulating the data to be able to use a vectorised operation. A loop like this:

[m n] = size(A);
T = zeros(m, n);
AT = A';
for ii=1:m:n
    T(:, ii:ii+m-1) = D * AT(ii:ii+m-1, :);
end

contains only built-in operators and the bare minimum of copying, and given the JIT is going to be hard to beat. Even if you want to factor in interpreter overhead it's still only a single statement with no functions to consider.

The "loop-free" version with extra faffing around and memory copying, is to split the matrix and iterate over the blocks with a hidden loop:

blksize = size(D, 1);
blkcnt = size(A, 2) / blksize;
blocks = mat2cell(A, blksize, repmat(blksize,1,blkcnt));
blocks = cellfun(@(x) D*x', blocks, 'UniformOutput', false);
T = cell2mat(blocks);

Of course, if you have access to the Image Processing Toolbox, you can also cheat horribly:

T = blockproc(A, size(D), @(x) D*x.data');


回答2:

Prospective approach & Solution Code

Given:

  • M is the block matrix [A B C...], where each A, B, C etc. are of size z x y. Let the number of such matrices be num_mat for easy reference later on. If those matrices are concatenated along the columns, then M would be of size z x num_mat*y.
  • D is the matrix to be multiplied with each of those matrices A, B, C etc. and is of size y x y.

Now, as stated in the problem, the output you are after is [D*(A'); D*(B'); D*(C');...], i.e. the multiplication results being concatenated along the rows. If you are okay with those multiplication results to be concatenated along the columns instead i.e. [D*(A') D*(B') D*(C') ...], you can achieve the same with some reshaping and then performing the matrix multiplications for the entire M with D and thus have a vectorized no-loop approach. Thus, to get such a matrix multiplication result, you can do -

mults = D*reshape(permute(reshape(M,z,y,[]),[2 1 3]),y,[]);

But, if you HAVE to get an output with the multiplication results being concatenated along the rows, you need to do some more reshaping like so -

out = reshape(permute(reshape(mults,y,z,[]),[1 3 2]),[],z);

Benchmarking

This section covers benchmarking codes comparing the proposed vectorized approach against a naive JIT powered loopy approach to get the desired output. As discussed earlier, depending on how the output array must hold the multiplication results, you can have two cases.

Case I: Multiplication results concatenated along the columns

%// Define size paramters and then define random inputs with those
z = 500; y = 500; num_mat = 500;
M = rand(z,num_mat*y);
D = rand(y,y);

%// Warm up tic/toc.
for k = 1:100000
    tic(); elapsed = toc();
end

disp('---------------------------- With loopy approach')
tic
out1 = zeros(z,y*num_mat);
for k1 = 1:y:y*num_mat
    out1(:,k1:k1+y-1) = D*M(:,k1:k1+y-1).'; %//'
end
toc, clear out1 k1

disp('---------------------------- With proposed approach')
tic
mults = D*reshape(permute(reshape(M,z,y,[]),[2 1 3]),y,[]);
toc

Case II: Multiplication results concatenated along the rows

%// Define size paramters and then define random inputs with those
z = 500; y = 500; num_mat = 500;
M = rand(z,num_mat*y);
D = rand(y,y);

%// Warm up tic/toc.
for k = 1:100000
    tic(); elapsed = toc();
end

disp('---------------------------- With loopy approach')
tic
out1 = zeros(y*num_mat,z);
for k1 = 1:y:y*num_mat
    out1(k1:k1+y-1,:) = D*M(:,k1:k1+y-1).'; %//'
end
toc, clear out1 k1

disp('---------------------------- With proposed approach')
tic
mults = D*reshape(permute(reshape(M,z,y,[]),[2 1 3]),y,[]);
out2 = reshape(permute(reshape(mults,y,z,[]),[1 3 2]),[],z);
toc

Runtimes

Case I:

---------------------------- With loopy approach
Elapsed time is 3.889852 seconds.
---------------------------- With proposed approach
Elapsed time is 3.051376 seconds.

Case II:

---------------------------- With loopy approach
Elapsed time is 3.798058 seconds.
---------------------------- With proposed approach
Elapsed time is 3.292559 seconds.

Conclusions

The runtimes suggest about a good 25% speedup with the proposed vectorized approach! So, hopefully this works out for you!



回答3:

If you want to get A, B, and C from a bigger matrix you can do this, assuming the bigger matrix is called X:

A = X(:,1:y)
B = X(:,y+1:2*y)
C = X(:,2*y+1:3*y)

If there are N such matrices, the best way is to use reshape like:

F = reshape(X, x,y,N)

Then use a loop to generate a new matrix I call it F1 as:

F1=[];
for n=1:N
   F1 = [F1 F(:,:,n)'];
end

Then compute F2 as:

F2 = D*F1;

and finally get your result as:

R = reshape(F2,N*y,x)

Note: this for loop does not slow you down as it is just to reformat the matrix and the multiplication is done in matrix form.