I would like to evaluate
E = np.einsum('ij,jk,kl->ijkl',A,A,A)
F = np.einsum('ijki->ijk',E)
where A is a matrix (no more than 1000 by 1000 in size). Computing E is slow. I would like to speed this up by only computing the "diagonal" elements, which I store in F. Is it possible to combine these two expressions? Are there any better ways to speed up this computation?
I'm not sure if there is an automatic way, but you can always do the maths yourself and give einsum the final expression:

F = np.einsum('ij,jk,ki->ijk', A, A, A)
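To see why this is the same computation, write out the two original calls element by element (this just restates the definitions above):

E[i,j,k,l] = A[i,j] * A[j,k] * A[k,l]                 # 'ij,jk,kl->ijkl'
F[i,j,k]   = E[i,j,k,i] = A[i,j] * A[j,k] * A[k,i]    # the 'ijki->ijk' diagonal, i.e. 'ij,jk,ki->ijk'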
In [86]: A=np.random.randint(0,100,(100,100))
In [88]: E1=np.einsum('ijki->ijk',np.einsum('ij,jk,kl->ijkl',A,A,A))
In [89]: E2=np.einsum('ij,jk,ki->ijk',A,A,A)
In [90]: np.allclose(E1,E2)
Out[90]: True
Good time improvement - 100x, corresponding to the saved dimension l (which has length 100 here):
In [91]: timeit np.einsum('ijki->ijk',np.einsum('ij,jk,kl->ijkl',A,A,A))
1 loops, best of 3: 1.1 s per loop
In [92]: timeit np.einsum('ij,jk,ki->ijk',A,A,A)
100 loops, best of 3: 10.9 ms per loop
einsum performs a combined iteration over all the indices, albeit in Cython code, so reducing the number of indices can give a significant time saving. It looks like doing that i...i combination in the initial calculation works.
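As a rough picture of what that single call computes, here is a broadcasting sketch of the same element-wise product (illustrative only, not the loop einsum actually runs; F3 is just a throwaway name):

import numpy as np

A = np.random.randint(0, 100, (100, 100))
# F[i,j,k] = A[i,j] * A[j,k] * A[k,i], built from three broadcast views of A
F3 = A[:, :, None] * A[None, :, :] * A.T[:, None, :]
print(np.allclose(F3, np.einsum('ij,jk,ki->ijk', A, A, A)))   # True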
With only 2 GB of memory, the (1000, 1000) case is too large: I get an 'iterator too large' error in the E1 case and a memory error in the E2 case.
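A back-of-the-envelope estimate (assuming 8-byte elements, e.g. int64 or float64) shows why:

n = 1000
print(n**3 * 8 / 1e9)   # (n, n, n) result of 'ij,jk,ki->ijk': ~8 GB
print(n**4 * 8 / 1e9)   # (n, n, n, n) intermediate from 'ij,jk,kl->ijkl': ~8000 GB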