Why does `numpy.einsum` work faster with `float32`

Posted 2019-02-20 05:30

This question already has an answer here:

Python Numpy Data Types Performance 2 answers

In my benchmark using numpy 1.12.0, calculating dot products with float32 ndarrays is much faster than the other data types:

In [3]: f16 = np.random.random((500000, 128)).astype('float16')
In [4]: f32 = np.random.random((500000, 128)).astype('float32')
In [5]: uint = np.random.randint(1, 60000, (500000, 128)).astype('uint16')

In [7]: %timeit np.einsum('ij,ij->i', f16, f16)
1 loop, best of 3: 320 ms per loop

In [8]: %timeit np.einsum('ij,ij->i', f32, f32)
The slowest run took 4.88 times longer than the fastest. This could mean that an intermediate result is being cached.
10 loops, best of 3: 19 ms per loop

In [9]: %timeit np.einsum('ij,ij->i', uint, uint)
10 loops, best of 3: 43.5 ms per loop

I've tried profiling einsum, but it just delegates all the computation to a C function, so I don't know what the main reason for this performance difference is.

Tags: python numpy numpy-einsum

1 answer

骚年 ilove

#2 · 2019-02-20 06:02

My tests with your f16 and f32 arrays show that f16 is 5-10x slower for all calculations. Only in byte-level operations such as an array copy does the more compact float16 show any speed advantage.
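A comparison along these lines can be reproduced with a small script. This is only a sketch: the helper names `time_einsum` and `time_copy` are made up for illustration, the arrays are smaller than the ones in the question, and the timings will depend on the machine and the numpy build.

```python
import timeit

import numpy as np


def time_einsum(arr, repeat=3):
    """Best-of-`repeat` wall time in seconds for a row-wise dot product."""
    return min(timeit.repeat(lambda: np.einsum('ij,ij->i', arr, arr),
                             number=1, repeat=repeat))


def time_copy(arr, repeat=3):
    """Best-of-`repeat` wall time in seconds for a plain array copy."""
    return min(timeit.repeat(arr.copy, number=1, repeat=repeat))


np.random.seed(0)
# Smaller than the (500000, 128) arrays in the question, to keep the run short.
base = np.random.random((50000, 128))

results = {}
for name in ('float16', 'float32', 'float64'):
    arr = base.astype(name)
    results[name] = (time_einsum(arr), time_copy(arr), arr.nbytes)

for name, (t_einsum, t_copy, nbytes) in results.items():
    print('%-8s einsum %8.2f ms   copy %8.2f ms   %9d bytes'
          % (name, t_einsum * 1e3, t_copy * 1e3, nbytes))
```

The byte counts show why a copy can favor float16 (half the memory traffic of float32), while the einsum column measures the arithmetic itself.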

https://gcc.gnu.org/onlinedocs/gcc/Half-Precision.html

is the section in the gcc docs about half floats (fp16). With the right processor and the right compiler switches, it may be possible to install numpy in a way that speeds up these calculations. We'd also have to check whether the numpy .h files have any provision for special handling of half floats.
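Short of rebuilding numpy, one possible workaround is to keep the data stored as float16 and convert it to float32 in chunks just before the arithmetic. This is not something I have benchmarked here, so treat it as a sketch: `rowwise_dot_upcast` and its `chunk` parameter are hypothetical names, and whether it actually beats the direct float16 `einsum` depends on the machine and the array sizes.

```python
import numpy as np


def rowwise_dot_upcast(a, b, chunk=65536):
    """Row-wise dot products of two float16 arrays, computed in float32.

    The inputs are converted chunk by chunk so that only a small
    float32 temporary exists at any time.
    """
    out = np.empty(a.shape[0], dtype=np.float32)
    for start in range(0, a.shape[0], chunk):
        stop = start + chunk
        a32 = a[start:stop].astype(np.float32)
        b32 = b[start:stop].astype(np.float32)
        out[start:stop] = np.einsum('ij,ij->i', a32, b32)
    return out


np.random.seed(0)
f16 = np.random.random((10000, 128)).astype('float16')

res = rowwise_dot_upcast(f16, f16, chunk=4096)

# Float64 reference computed from the same float16 values, for a sanity check.
f64 = f16.astype(np.float64)
ref = np.einsum('ij,ij->i', f64, f64)
print(np.allclose(res, ref, rtol=1e-4))
```

Note that the result is float32 rather than float16, so it is not bit-for-bit the same as what `np.einsum` returns for float16 inputs.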

Earlier questions that may be good enough to serve as duplicate references:

Python Numpy Data Types Performance

Python numpy float16 datatype operations, and float8?
