Explanation on Numpy Broadcasting Answer

Posted 2019-02-18 19:32

I recently posted a question here which was answered exactly as I asked. However, I think I overestimated my ability to adapt the answer further. I read the broadcasting docs and followed a few links that led me all the way back to 2002 material on NumPy broadcasting.

I've used the second method of array creation using broadcasting:

import numpy as np

N = 10
out = np.zeros((N**3,4),dtype=int)
out[:,:3] = (np.arange(N**3)[:,None]/[N**2,N,1])%N

which outputs:

[[0,0,0,0]
 [0,0,1,0]
 ...
 [0,1,0,0]
 [0,1,1,0]
 ...
 [9,9,8,0]
 [9,9,9,0]]

but I do not understand from the docs how to adapt it. Ideally, I would like to set the increment by which each individual column changes.

For example: column A changes by 0.5 up to 2, column B changes by 0.2 up to 1, and column C changes by 1 up to 10.

[[0,0,0,0]
 [0,0,1,0]
 ...
 [0,0,9,0]
 [0,0.2,0,0]
 ...
 [0,0.8,9,0]
 [0.5,0,0,0]
 ...
 [1.5,0.8,9,0]]

Thanks for any help.

2 Answers
ゆ 、 Hurt°
Reply #2 · 2019-02-18 20:18

You can adjust your current code just a little bit to make it work.

>>> out = np.zeros((4*5*10,4))
>>> out[:,:3] = (np.arange(4*5*10)[:,None]//(5*10, 10, 1)*(0.5, 0.2, 1)%(2, 1, 10))
>>> out
array([[ 0. ,  0. ,  0. ,  0. ],
       [ 0. ,  0. ,  1. ,  0. ],
       [ 0. ,  0. ,  2. ,  0. ],
       ...
       [ 0. ,  0. ,  8. ,  0. ],
       [ 0. ,  0. ,  9. ,  0. ],
       [ 0. ,  0.2,  0. ,  0. ],
       ...
       [ 0. ,  0.8,  9. ,  0. ],
       [ 0.5,  0. ,  0. ,  0. ],
       ...
       [ 1.5,  0.8,  9. ,  0. ]])

The changes are:

  1. No int dtype on the array, since we need it to hold floats in some columns. You could specify a float dtype if you want (or even something more complicated that only allows floats in the first two columns).
  2. Rather than N**3 total values, figure out the number of distinct values for each column, and multiply them together to get our total size. This is used for both zeros and arange.
  3. Use the floor division // operator in the first broadcast operation because we want integers at this point, but later we'll want floats.
  4. The values to divide by are again based on the number of values for the later columns (e.g. for A,B,C numbers of values, divide by B*C, C, 1).
  5. Add a new broadcast operation to multiply by the per-column scale factors (the step size of each column).
  6. Change the values in the broadcast mod % operation to match the bounds on each column.
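The steps above can be wrapped into a small function. This is only a sketch: the name stepped_grid and its steps/stops parameters are made up for illustration, and it assumes every stop is a whole multiple of its step. Unlike the one-liner above, it takes the modulo on the integer counts before scaling, so the remainder is never computed on floats.

```python
import numpy as np

def stepped_grid(steps, stops):
    """Hypothetical helper: every combination of per-column values.

    Column k takes the values 0, steps[k], 2*steps[k], ... below stops[k];
    the last column varies fastest.
    """
    steps = np.asarray(steps, dtype=float)
    # Number of distinct values per column, e.g. 2 / 0.5 -> 4.
    counts = np.rint(np.asarray(stops) / steps).astype(int)
    # Divisor for each column: product of the counts of all later columns.
    divisors = np.append(np.cumprod(counts[::-1])[::-1][1:], 1)
    # Integer "digit" of each row index per column, then scale to floats.
    digits = np.arange(counts.prod())[:, None] // divisors % counts
    return digits * steps

grid = stepped_grid(steps=(0.5, 0.2, 1), stops=(2, 1, 10))
print(grid.shape)
print(grid[10])   # the row where the second column first advances
```

The result has only the three value columns; copy it into out[:,:3] if the trailing zero column is needed.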
在下西门庆
Reply #3 · 2019-02-18 20:26

This small example helps me understand what is going on:

In [123]: N=2    
In [124]: np.arange(N**3)[:,None]/[N**2, N, 1]
Out[124]: 
array([[ 0.  ,  0.  ,  0.  ],
       [ 0.25,  0.5 ,  1.  ],
       [ 0.5 ,  1.  ,  2.  ],
       [ 0.75,  1.5 ,  3.  ],
       [ 1.  ,  2.  ,  4.  ],
       [ 1.25,  2.5 ,  5.  ],
       [ 1.5 ,  3.  ,  6.  ],
       [ 1.75,  3.5 ,  7.  ]])

So we generate a column of numbers (0 to 7) and divide it by 4, 2, and 1, one divisor per output column.
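To make the shapes explicit, here is a minimal sketch of that one broadcast step. The variable names are mine, and np.tile is used only to spell out the copies that broadcasting avoids making.

```python
import numpy as np

N = 2
column = np.arange(N**3)[:, None]     # shape (8, 1): one row index per row
divisors = np.array([N**2, N, 1])     # shape (3,): one divisor per output column
result = column / divisors            # broadcast together to shape (8, 3)

print(column.shape, divisors.shape, result.shape)

# Spelling the broadcast out with explicit copies gives the same array.
explicit = np.tile(column, (1, 3)) / np.tile(divisors, (8, 1))
print(np.array_equal(result, explicit))
```

The (8, 1) column is stretched across three columns and the (3,) divisors are stretched down eight rows, which is why every row index ends up divided by each divisor.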

The rest of the calculation just changes each value in place; no new array shapes are involved.

Apply %N to each element:

In [126]: np.arange(N**3)[:,None]/[N**2, N, 1]%N
Out[126]: 
array([[ 0.  ,  0.  ,  0.  ],
       [ 0.25,  0.5 ,  1.  ],
       [ 0.5 ,  1.  ,  0.  ],
       [ 0.75,  1.5 ,  1.  ],
       [ 1.  ,  0.  ,  0.  ],
       [ 1.25,  0.5 ,  1.  ],
       [ 1.5 ,  1.  ,  0.  ],
       [ 1.75,  1.5 ,  1.  ]])

Assigning to an int array is the same as converting the floats to integers:

In [127]: (np.arange(N**3)[:,None]/[N**2, N, 1]%N).astype(int)
Out[127]: 
array([[0, 0, 0],
       [0, 0, 1],
       [0, 1, 0],
       [0, 1, 1],
       [1, 0, 0],
       [1, 0, 1],
       [1, 1, 0],
       [1, 1, 1]])
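As a quick check of that last claim, the sketch below assigns the float result into an int array and compares it with astype(int). It also compares against a variant that uses floor division (//) so the values stay integers throughout; that variant is my own addition, and it matches here only because all the values are non-negative.

```python
import numpy as np

N = 2
floats = np.arange(N**3)[:, None] / [N**2, N, 1] % N

out = np.zeros((N**3, 3), dtype=int)
out[:] = floats      # assignment casts to int, dropping the fractional part

# Floor division keeps everything as integers from the start.
floored = np.arange(N**3)[:, None] // [N**2, N, 1] % N

print(np.array_equal(out, floats.astype(int)))
print(np.array_equal(out, floored))
```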