With the recent update to NumPy (1.14), my entire codebase breaks. The cause is the change of numpy.einsum's default optimize argument from False to True.
As a result, the following basic operation now fails:
import numpy as np

a = np.random.random([50, 2, 2])
b = np.random.random([50, 2])
np.einsum('bdc, ac -> ab', a, b, optimize=True)
with the following error trace:
ValueError                                Traceback (most recent call last)
<ipython-input-71-b0f9ce3c71a3> in <module>()
----> 1 np.einsum('bdc, ac -> ab', a, b, optimize=True)

C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\einsumfunc.py in einsum(*operands, **kwargs)
   1118
   1119         # Contract!
-> 1120         new_view = tensordot(*tmp_operands, axes=(tuple(left_pos), tuple(right_pos)))
   1121
   1122         # Build a new view if needed

C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\numeric.py in tensordot(a, b, axes)
   1301     oldb = [bs[axis] for axis in notin]
   1302
-> 1303     at = a.transpose(newaxes_a).reshape(newshape_a)
   1304     bt = b.transpose(newaxes_b).reshape(newshape_b)
   1305     res = dot(at, bt)

ValueError: axes don't match array
The operation I'm requesting from einsum seems very simple... so why does it fail? If I set "optimize=False", it works just fine.
I tried exploring with einsum_path but the resulting path info was the same with and without optimization.
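For reference, the path inspection I mean is just einsum_path on the same call (a minimal sketch; as noted, the printed path comes out the same either way):

import numpy as np

a = np.random.random([50, 2, 2])
b = np.random.random([50, 2])

# einsum_path only builds the contraction plan; it does not run the contraction
path, info = np.einsum_path('bdc, ac -> ab', a, b, optimize=True)
print(path)   # the planned contraction order
print(info)   # human-readable summary of the planned contraction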
In [40]: a=np.ones((50,2,2),int); b=np.ones((50,2),int)
In [41]: np.einsum('bdc,ac->ab', a, b)
...
ValueError: axes don't match array
I don't see what optimization has to do with this error.
For the first argument, b,d,c are 50,2,2. For the second, a,c are 50,2. The result should be 50,50. But what happened to d?
In [43]: np.einsum('bdc,ac->abd', a, b).shape
Out[43]: (50, 50, 2)
Oops:
In [49]: np.einsum('bdc,ac->ab', a, b, optimize=False).shape
Out[49]: (50, 50)
So it's summing on d.
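A quick check (my sketch, using the same test arrays) confirms that reading: the 'ab' result with optimize=False equals the 'abd' result summed over its trailing d axis.

import numpy as np

a = np.ones((50, 2, 2), int)
b = np.ones((50, 2), int)

full = np.einsum('bdc,ac->abd', a, b)                    # shape (50, 50, 2)
summed = np.einsum('bdc,ac->ab', a, b, optimize=False)   # shape (50, 50)
assert np.allclose(summed, full.sum(axis=-1))            # d is summed out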
Note the error: with optimization, it uses tensordot (transpose plus dot), rather than the original einsum nditer approach.
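For comparison, tensordot can only express this contraction if the missing d is handled by hand, e.g. by summing a over d before the transpose-plus-dot step (my own sketch, not the call the optimizer actually builds):

import numpy as np

a = np.ones((50, 2, 2), int)
b = np.ones((50, 2), int)

ad = a.sum(axis=1)                                 # 'bdc' -> 'bc', shape (50, 2)
via_tensordot = np.tensordot(b, ad, axes=(1, 1))   # contract the shared c; result is 'ab'
ref = np.einsum('bdc,ac->ab', a, b, optimize=False)
assert np.allclose(via_tensordot, ref)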
c_einsum is the one that can handle that missing d:
# If no optimization, run pure einsum
if optimize_arg is False:
    return c_einsum(*operands, **kwargs)
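As a plain-Python reference for what that c_einsum call computes here (just a sketch to spell out the semantics): the repeated c is contracted, and d, which appears in neither the output nor the other operand, is summed out.

import numpy as np

a = np.ones((50, 2, 2), int)
b = np.ones((50, 2), int)

out = np.zeros((b.shape[0], a.shape[0]))   # output 'ab' -> (50, 50)
for ai in range(b.shape[0]):
    for bi in range(a.shape[0]):
        # sum over both the shared c and the unlisted d
        out[ai, bi] = sum(a[bi, d, c] * b[ai, c]
                          for d in range(a.shape[1])
                          for c in range(a.shape[2]))

assert np.allclose(out, np.einsum('bdc,ac->ab', a, b, optimize=False))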
Tried some timings:
Two-step calculation with the default optimize:
In [64]: timeit np.einsum('abd->ab', np.einsum('bdc,ac->abd', a, b))
288 µs ± 518 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
The desired c_einsum is faster:
In [65]: timeit np.einsum('bdc,ac->ab', a, b, optimize=False)
170 µs ± 83.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In fact, c_einsum is faster even when the tensordot version works:
In [67]: timeit np.einsum('bdc,ac->abd', a, b,optimize=False)
73.1 µs ± 46.6 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [68]: timeit np.einsum('bdc,ac->abd', a, b,optimize=True)
207 µs ± 6.97 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
This example might be too small to show off the advantages of tensordot/BLAS.
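A rough way to check that at sizes where BLAS should matter (my sketch; only the 'abd' spec works under optimize=True here, and the numbers will depend on the machine and BLAS build):

import timeit
import numpy as np

a = np.random.random((50, 100, 100))
b = np.random.random((50, 100))

# compare the optimized (tensordot/BLAS) path against the pure c_einsum path
for opt in (False, True):
    t = timeit.timeit(lambda: np.einsum('bdc,ac->abd', a, b, optimize=opt),
                      number=10)
    print('optimize=%s: %.2f ms per loop' % (opt, t / 10 * 1e3))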
It looks like this has been raised on GitHub - both the failure and the slower 'optimize' speed: https://github.com/numpy/numpy/issues/10343 "einsum broadcast regression (with optimize=True)".