Let's say I have two 3D arrays, A and B:
A = rand(4,5,3);
B = rand(4,5,6);
I want to apply the function 'corr2' to calculate the correlation coefficient between every 2D slice of A and every 2D slice of B:
corr2(A(:,:,1),B(:,:,1))
corr2(A(:,:,1),B(:,:,2))
corr2(A(:,:,1),B(:,:,3))
...
corr2(A(:,:,1),B(:,:,6))
...
corr2(A(:,:,2),B(:,:,1))
corr2(A(:,:,2),B(:,:,2))
...
corr2(A(:,:,3),B(:,:,6))
How can I vectorize this computation and avoid the loops?
I hacked into the m-file for corr2 to create a customized, vectorized version that works with 3D arrays. Proposed here are two approaches with bsxfun (of course!).
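For reference, this is the quantity that has to be reproduced for every pair of slices. It is the 2-D correlation coefficient as given in the corr2 documentation, where the bars denote the mean over all elements of one slice:

```latex
r(A,B) = \frac{\sum_{m}\sum_{n}\,(A_{mn}-\bar{A})\,(B_{mn}-\bar{B})}
              {\sqrt{\Big(\sum_{m}\sum_{n}(A_{mn}-\bar{A})^{2}\Big)
                     \Big(\sum_{m}\sum_{n}(B_{mn}-\bar{B})^{2}\Big)}}
```

In both approaches, v1 holds the numerator and v2 the denominator for all slice pairs at once.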
Approach #1
szA = size(A);
szB = size(B);
a1 = bsxfun(@minus,A,mean(mean(A)));
b1 = bsxfun(@minus,B,mean(mean(B)));
sa1 = sum(sum(a1.*a1));
sb1 = sum(sum(b1.*b1));
v1 = reshape(b1,[],szB(3)).'*reshape(a1,[],szA(3));
v2 = sqrt(sb1(:)*sa1(:).');
corr3_out = v1./v2; %// desired output
corr3_out stores the corr2 results between all 2D slices of A and B (taken along the third dimension). Thus, for A = rand(4,5,3), B = rand(4,5,6), we would have corr3_out as a 6x3 array, where corr3_out(j,i) equals corr2(A(:,:,i),B(:,:,j)).
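On MATLAB R2016b or newer, implicit expansion makes the bsxfun calls optional. The following is a minimal sketch of the same computation under that assumption. The variable name corr3_ie is mine, not from the original code.

```matlab
A = rand(4,5,3);
B = rand(4,5,6);

% Center each slice. mean(mean(X)) is 1x1xK and expands implicitly
% across rows and columns (assumes R2016b or newer).
a1 = A - mean(mean(A));
b1 = B - mean(mean(B));

% Per-slice sums of squares, flattened to column vectors
sa1 = reshape(sum(sum(a1.^2)),[],1);
sb1 = reshape(sum(sum(b1.^2)),[],1);

% Cross terms for every (B slice, A slice) pair via one matrix product
v1 = reshape(b1,[],size(B,3)).' * reshape(a1,[],size(A,3));

% Normalize by the outer product of the per-slice norms
corr3_ie = v1 ./ sqrt(sb1 * sa1.');
disp(size(corr3_ie))
```

The layout matches Approach #1: rows index slices of B and columns index slices of A.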
Approach #2
A slightly different approach that saves a few calls to sum and mean by using reshape instead -
szA = size(A);
szB = size(B);
dim12 = szA(1)*szA(2);
a1 = bsxfun(@minus,A,mean(reshape(A,dim12,1,[])));
b1 = bsxfun(@minus,B,mean(reshape(B,dim12,1,[])));
v1 = reshape(b1,[],szB(3)).'*reshape(a1,[],szA(3));
v2 = sqrt(sum(reshape(b1.*b1,dim12,[])).'*sum(reshape(a1.*a1,dim12,[])));
corr3_out = v1./v2; %// desired output
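A quick sanity check is to compare the vectorized result against one corr2 call per slice pair on small inputs. This sketch assumes corr2 is available (it ships with the Image Processing Toolbox). The names ref and max_abs_diff are illustrative.

```matlab
% Small inputs so the loop-based reference runs quickly
A = rand(4,5,3);
B = rand(4,5,6);

% Approach #2, as posted above
szA = size(A);
szB = size(B);
dim12 = szA(1)*szA(2);
a1 = bsxfun(@minus,A,mean(reshape(A,dim12,1,[])));
b1 = bsxfun(@minus,B,mean(reshape(B,dim12,1,[])));
v1 = reshape(b1,[],szB(3)).'*reshape(a1,[],szA(3));
v2 = sqrt(sum(reshape(b1.*b1,dim12,[])).'*sum(reshape(a1.*a1,dim12,[])));
corr3_out = v1./v2;

% Reference: one corr2 call per pair of slices
ref = zeros(szB(3),szA(3));
for ii = 1:szA(3)
    for jj = 1:szB(3)
        ref(jj,ii) = corr2(A(:,:,ii),B(:,:,jj));
    end
end

% The two results should agree up to floating-point round-off
max_abs_diff = max(abs(corr3_out(:) - ref(:)));
fprintf('max abs difference: %g\n', max_abs_diff);
```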
Benchmarking
Benchmark code -
%// Create random input arrays
N = 55; %// datasize scaling factor
A = rand(4*N,5*N,3*N);
B = rand(4*N,5*N,6*N);
%// Warm up tic/toc
for k = 1:50000
    tic(); elapsed = toc();
end
%// Run vectorized and loopy approach codes on the input arrays
%// 1. Vectorized approach
%//... solution code (Approach #2) posted earlier
%// clear variables used
%// 2. Loopy approach
tic
s_A=size(A,3);
s_B=size(B,3);
out1 = zeros(s_B,s_A);
for ii=1:s_A
    for jj=1:s_B
        out1(jj,ii)=corr2(A(:,:,ii),B(:,:,jj));
    end
end
toc
Results -
-------------------------- With BSXFUN vectorized solution
Elapsed time is 1.231230 seconds.
-------------------------- With loopy approach
Elapsed time is 139.934719 seconds.
MATLAB-JIT lovers show some love here! :)
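As an alternative to warming up tic/toc by hand, timeit (available since R2013b) calls the function several times and returns a typical run time. Below is a rough sketch that wraps Approach #2 in anonymous functions so it can be passed to timeit. The helper names center, flat, corr3_pair and corr3 are hypothetical, and N is reduced here so the snippet runs quickly. It is not the benchmark that produced the numbers above.

```matlab
N = 10;                       % smaller scaling factor than in the benchmark above
A = rand(4*N,5*N,3*N);
B = rand(4*N,5*N,6*N);

% Subtract each slice's mean (same idea as Approach #2)
center = @(X) bsxfun(@minus, X, mean(reshape(X, [], 1, size(X,3))));
% Flatten each slice into one column
flat = @(X) reshape(X, [], size(X,3));
% Correlation matrix from centered, flattened slices a (for A) and b (for B)
corr3_pair = @(a,b) (b.'*a) ./ sqrt(sum(b.^2).' * sum(a.^2));
corr3 = @(A,B) corr3_pair(flat(center(A)), flat(center(B)));

% timeit expects a function handle that takes no inputs
t_vec = timeit(@() corr3(A,B));
fprintf('Vectorized approach: %g seconds\n', t_vec);
```

The anonymous-function wrappers add a small call overhead, so the absolute numbers will differ slightly from timing the inline code.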
Here are some examples, though none of them is faster than plain loops. As Divakar says in a comment below, these are not vectorized solutions.
CODE:
A = rand(4,5,1000);
B = rand(4,5,200);
s_A=size(A,3);
s_B=size(B,3);
%%% option 1
tic
corr_AB=cell2mat(arrayfun(@(indx1) arrayfun(@(indx2) corr2(A(:,:,indx1),B(:,:,indx2)),1:s_B),1:s_A,'UniformOutput',false));
toc
%%% option 2
tic
indx1=repmat(1:s_A,s_B,1);
indx1=indx1(:);
indx2=repmat(1:s_B,1,s_A);
indx2=indx2(:);
indx=[indx1,indx2];
corr_AB=arrayfun(@(i) corr2(A(:,:,indx(i,1)),B(:,:,indx(i,2))),1:size(indx,1));
toc
%%% option 3
tic
a=1;
for i=1:s_A
    for j=1:s_B
        corr_AB(a)=corr2(A(:,:,i),B(:,:,j));
        a=a+1;
    end
end
toc
OUTPUT:
Elapsed time is 9.655696 seconds.
Elapsed time is 9.398979 seconds.
Elapsed time is 8.489744 seconds.