I'm trying to compute the cross-products of many 3x1 vector pairs as fast as possible. This
n = 10000
a = np.random.rand(n, 3)
b = np.random.rand(n, 3)
numpy.cross(a, b)
gives the correct answer, but motivated by this answer to a similar question, I thought that einsum
would get me somewhere. I found that both
eijk = np.zeros((3, 3, 3))
eijk[0, 1, 2] = eijk[1, 2, 0] = eijk[2, 0, 1] = 1
eijk[0, 2, 1] = eijk[2, 1, 0] = eijk[1, 0, 2] = -1
np.einsum('ijk,aj,ak->ai', eijk, a, b)
np.einsum('iak,ak->ai', np.einsum('ijk,aj->iak', eijk, a), b)
compute the cross product, but their performance is disappointing: both methods perform much worse than np.cross:
%timeit np.cross(a, b)
1000 loops, best of 3: 628 µs per loop
%timeit np.einsum('ijk,aj,ak->ai', eijk, a, b)
100 loops, best of 3: 9.02 ms per loop
%timeit np.einsum('iak,ak->ai', np.einsum('ijk,aj->iak', eijk, a), b)
100 loops, best of 3: 10.6 ms per loop
Any ideas of how to improve the einsums?
einsum() does more multiply operations than cross(), and in the newest NumPy version, cross() doesn't create many temporary arrays. So einsum() can't be faster than cross().

Here is the old code of cross:
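(The snippet is missing here; what follows is a paraphrase of the essential 3-component branch of the old implementation, not the verbatim NumPy source. The vector components are assumed to lie along axis 0 after the axis swapping the old function did.)

x = a[1]*b[2] - a[2]*b[1]   # each line allocates two product temporaries and a difference
y = a[2]*b[0] - a[0]*b[2]
z = a[0]*b[1] - a[1]*b[0]
cp = array([x, y, z])       # stacking allocates yet another array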
Here is the new code of cross:
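(Again a paraphrase of the relevant 3x3 branch rather than a verbatim quote: the new version writes each component directly into a preallocated output via NumPy's multiply with an out= argument, and reuses a single scratch buffer.)

cp = empty(shape, dtype)                        # preallocated result
a0, a1, a2 = a[..., 0], a[..., 1], a[..., 2]
b0, b1, b2 = b[..., 0], b[..., 1], b[..., 2]
cp0, cp1, cp2 = cp[..., 0], cp[..., 1], cp[..., 2]
multiply(a1, b2, out=cp0)                       # cp0 = a1*b2
tmp = array(a2 * b1)                            # single scratch buffer, reused below
cp0 -= tmp                                      # cp0 = a1*b2 - a2*b1
multiply(a2, b0, out=cp1)
multiply(a0, b2, out=tmp)
cp1 -= tmp                                      # cp1 = a2*b0 - a0*b2
multiply(a0, b1, out=cp2)
multiply(a1, b0, out=tmp)
cp2 -= tmp                                      # cp2 = a0*b1 - a1*b0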
To speed it up, you need Cython or Numba.
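For example, a minimal Numba sketch (the function name cross3 is mine, not part of any library):

import numba
import numpy as np

@numba.njit
def cross3(a, b):
    # Explicit loop over the n pairs: no temporaries beyond scalars,
    # a single pass over the data.
    out = np.empty_like(a)
    for i in range(a.shape[0]):
        out[i, 0] = a[i, 1]*b[i, 2] - a[i, 2]*b[i, 1]
        out[i, 1] = a[i, 2]*b[i, 0] - a[i, 0]*b[i, 2]
        out[i, 2] = a[i, 0]*b[i, 1] - a[i, 1]*b[i, 0]
    return out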
You can bring in matrix-multiplication using np.tensordot to lose one of the dimensions at the first level and then use np.einsum to lose the other dimension, like so -
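(The snippet is missing here; this is one reconstruction consistent with the description, reusing the eijk tensor from the question: tensordot contracts the j index against a, einsum then contracts k against b.)

# tensordot output has axes (a, i, k); einsum sums away k
np.einsum('aik,ak->ai', np.tensordot(a, eijk, axes=([1], [1])), b)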
Alternatively, we can perform broadcasted elementwise multiplications between a and b using np.einsum and then lose the two dimensions in one go with np.tensordot, like so -
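(Again a reconstruction matching the description: einsum builds the (n, 3, 3) array of outer products a_j*b_k, and tensordot sums both of those axes against eijk at once.)

# outer products have axes (a, j, k); contract j and k with eijk, leaving (a, i)
np.tensordot(np.einsum('aj,ak->ajk', a, b), eijk, axes=([1, 2], [1, 2]))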
We could have performed the elementwise multiplications by introducing new axes too, with something like a[...,None]*b[:,None], but it seems to slow it down.

Though these show good improvement over the proposed np.einsum-only based approaches, they fail to beat np.cross.

Runtime test -
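(The original timing numbers are not reproduced here; below is a self-contained script to rerun the comparison. Exact results will vary by machine and NumPy version.)

import timeit
import numpy as np

n = 10000
a = np.random.rand(n, 3)
b = np.random.rand(n, 3)

eijk = np.zeros((3, 3, 3))
eijk[0, 1, 2] = eijk[1, 2, 0] = eijk[2, 0, 1] = 1
eijk[0, 2, 1] = eijk[2, 1, 0] = eijk[1, 0, 2] = -1

candidates = {
    'np.cross': lambda: np.cross(a, b),
    'pure einsum': lambda: np.einsum('ijk,aj,ak->ai', eijk, a, b),
    'tensordot + einsum': lambda: np.einsum(
        'aik,ak->ai', np.tensordot(a, eijk, axes=([1], [1])), b),
    'einsum + tensordot': lambda: np.tensordot(
        np.einsum('aj,ak->ajk', a, b), eijk, axes=([1, 2], [1, 2])),
}

for name, fn in candidates.items():
    t = min(timeit.repeat(fn, number=100, repeat=3)) / 100
    print('%-20s %.3f ms per loop' % (name, t * 1e3))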