When is broadcasting a bad idea? (numpy)

Posted 2019-04-09 10:47

Question:

The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations.

Example 1:

from numpy import array
a = array([1.0, 2.0, 3.0])
b = array([2.0, 2.0, 2.0])
a * b  # multiply element-by-element
>> array([ 2.,  4.,  6.])

Example 2:

from numpy import array
a = array([1.0, 2.0, 3.0])
b = 2.0  # scalar, broadcast across every element of a
a * b
>> array([ 2.,  4.,  6.])

We can think of the scalar b as being stretched during the arithmetic operation into an array with the same shape as a. NumPy is smart enough to use the original scalar value without actually making copies, so broadcasting operations are as memory- and computationally efficient as possible (b is a scalar, not an array).
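One way to see the "no copies" behaviour for yourself is a minimal sketch using np.broadcast_to, which returns a read-only view rather than a new array. The variable name b_view is only illustrative; the stride of 0 shows that every element of the stretched array points at the same single value in memory.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 2.0

# broadcast_to gives a read-only view with the target shape; no data is copied.
b_view = np.broadcast_to(b, a.shape)

print(b_view.shape)            # shape now matches a
print(b_view.strides)          # a stride of 0 means the same value is reused
print(b_view.flags.writeable)  # the view cannot be written to

# Multiplying by the view gives the same result as multiplying by the scalar.
result = a * b_view
print(result)
```

Note that this only describes the broadcast operand itself. The result of a * b is still a newly allocated array of the full broadcast shape.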

A small benchmark by @Eric Duminil in another question on memory performance shows that broadcasting makes a difference in terms of speed and memory.

However, to quote the same article linked above:

There are cases where broadcasting is a bad idea because it leads to inefficient use of memory that slows computation.

The question is: when does broadcasting use unnecessarily large amounts of memory and result in sluggish performance? In other words, when should we use a hybrid broadcasting/Python looping algorithm over the pure broadcasting approach?
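To make the question concrete, here is a rough sketch of the two styles being compared, assuming a pairwise squared-distance computation as the workload. The function names, array sizes, and random data are all hypothetical and not taken from the linked article. The pure broadcasting version builds a 3-D temporary of shape (len(x), len(y), d), while the hybrid version loops in Python over the last axis and only ever allocates 2-D temporaries.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((200, 3))  # hypothetical data: 200 points with 3 features
y = rng.random((300, 3))  # hypothetical data: 300 points with 3 features

def pairwise_sq_dist_broadcast(x, y):
    # Pure broadcasting: the difference is a temporary of shape (len(x), len(y), d).
    diff = x[:, np.newaxis, :] - y[np.newaxis, :, :]
    return (diff ** 2).sum(axis=-1)

def pairwise_sq_dist_hybrid(x, y):
    # Hybrid: Python loop over the feature axis, broadcasting only in 2-D.
    out = np.zeros((x.shape[0], y.shape[0]))
    for k in range(x.shape[1]):
        out += (x[:, k, np.newaxis] - y[np.newaxis, :, k]) ** 2
    return out

full = pairwise_sq_dist_broadcast(x, y)
hybrid = pairwise_sq_dist_hybrid(x, y)
print(np.allclose(full, hybrid))
```

Which version is faster is not settled by this sketch; it presumably depends on the array sizes and on how much memory the 3-D temporary needs, which is exactly what the question asks about.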