I want to multiply an n-deep stack of m×m matrices by an n-deep stack of length-m vectors, so that the resulting m×n array contains the dot product of the nth matrix with the nth vector in its nth column:
vec1=np.array([0,0.5,1,0.5]); vec2=np.array([2,0.5,1,0.5])
vec=np.transpose(n.stack((vec1,vec2)))
mat = np.moveaxis(n.array([[[0,1,2,3],[0,1,2,3],[0,1,2,3],[0,1,2,3]],[[-1,2.,0,1.],[0,0,-1,2.],[0,1,-1,2.],[1,0.1,1,1]]]),0,2)
outvec=np.zeros((4,2))
for i in range(2):
    outvec[:,i]=np.dot(mat[:,:,i],vec[:,i])
Inspired by this post, Element wise dot product of matrices and vectors, I have tried various permutations of index combinations in einsum, and have found that
np.einsum('ijk,jk->ik',mat,vec)
gives the correct result.
Unfortunately I really do not understand this - I assumed that repeating the index k in the 'ijk,jk' part means that I multiply AND sum over k. I've tried to read the documentation https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.einsum.html, but I still don't understand.
(My previous attempts included,
np.einsum('ijk,il->ik', mat, vec)
I'm not even sure what this means. What happens to the index l when I drop it?)
Thanks in advance!
Read up on Einstein summation notation.
Basically, the rules are:
Without a ->, any index letter that is repeated is multiplied along that axis and then summed away, and the output keeps the letters that appear only once, in alphabetical order.
With a ->, you spell out the output yourself: repeated letters are still multiplied together, and any letter that does not appear after the arrow is summed over.
So, for example, with matrices A and B with the same shape:
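A quick sketch of the sort of combinations meant here (A and B are just illustrative 2×2 arrays):

import numpy as np

A = np.arange(4).reshape(2, 2)
B = np.arange(4, 8).reshape(2, 2)

np.einsum('ij,ij', A, B)        # no '->': i and j are repeated, so multiply and sum them away -> a scalar
np.einsum('ij,ij->ij', A, B)    # '->' keeps both indices: the element-wise product
np.einsum('ij,jk->ik', A, B)    # '->' keeps i and k, sums over j: the matrix product A @ B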
einsum is easy (once you have played with permutations of indices for a while, that is...). Let's work with something simple: a triple stack of 2×2 matrices and a triple stack of 2-element vectors.
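A minimal sketch of such stacks (the names a and b are just for illustration):

import numpy as np

a = np.arange(3*2*2).reshape(3, 2, 2)   # a triple stack of 2x2 matrices
b = np.arange(3*2).reshape(3, 2)        # a triple stack of 2-element vectors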
We need to know what we are going to compute using einsum.
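Spelled out as a plain loop over the stack (with the illustrative a and b from above):

for i in range(3):
    print(a[i] @ b[i])   # one matrix-vector dot product per stacked pair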
What have we done? We have an index i that is fixed while we perform a dot product between one of the stacked matrices and one of the stacked vectors (both indexed by i), and each individual output line implies a summation over the last index of the stacked matrix and the lone index of the stacked vector.
This is easily encoded in an einsum directive: the same i index labels the matrix, the vector and also the output; the indices that get summed - the last index of the matrix and the lone index of the vector - can share the letter k; and the remaining matrix index, which labels the output entries, can be called j.
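Hence, with the illustrative a and b from above, the directive comes out as something like:

np.einsum('ijk,ik->ij', a, b)   # for every i: a[i] @ b[i]; shape (3, 2), matching the loop above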
I hope that my discussion of how I got the directive right is clear, correct and useful.
A nice thing about the stack function is we can specify an axis, skipping the transpose:
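For example, with the question's vec1 and vec2:

vec = np.stack((vec1, vec2), axis=1)   # shape (4, 2), same as transposing the default stack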
Why the mix of np. and n.? Running your code as posted gives NameError: name 'n' is not defined. That kind of thing almost sends me away.
From your loop, the location of the i axis (size 2) is clear - last in all 3 arrays. That leaves one axis for vec; let's call that j. It pairs with the second-to-last axis of mat (the one next to i). k carries over from mat to outvec.
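Spelled out with those letters (and assuming the n. typos are corrected to np.), the string would be:

np.einsum('kji,ji->ki', mat, vec)   # same computation as your 'ijk,jk->ik' and as the loop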
Often the einsum string writes itself. For example, if mat is described as (m,n,k) and vec as (n,k), with the result being (m,k),
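then the string can be read straight off the shapes, something like:

np.einsum('mnk,nk->mk', mat, vec)   # sum over n; m and k carry through to the (m,k) result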
In this case only the j dimension is summed - it appears on the left, but not on the right. The last dimension, i in my notation, is not summed because it appears on both sides, just as it does in your iteration. I think of that as 'going along for the ride'; it isn't actively part of the dot product.
You are, in effect, stacking on the last dimension, the size-2 one. Usually we stack on the first, but you transpose both to put that last.
Your 'failed' attempt runs, and can be reproduced as:
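A sketch of that equivalence, using the question's mat and vec:

np.einsum('ijk,il->ik', mat, vec)            # the 'failed' attempt
mat.sum(axis=1) * vec.sum(axis=1)[:, None]   # same values: sum away j and l first, then multiply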
The j and l dimensions don't appear on the right, so they are summed. They can be summed before multiplying because they appear in only one term each. I added the None to enable broadcasting (multiplying an ik array with an i one).
If you'd stacked on the first axis instead, and added a trailing dimension to vec to make it (2,4,1), it would matmul with a (2,4,4) mat: mat @ vec[...,None].
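A short sketch of that variant, reusing the question's arrays (matA, vecA and outA are just illustrative names):

vecA = np.stack((vec1, vec2))            # shape (2, 4): stacked on the first axis
matA = np.moveaxis(mat, 2, 0)            # shape (2, 4, 4): stack axis moved to the front
outA = (matA @ vecA[..., None])[..., 0]  # shape (2, 4); outA.T matches outvec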