What does -1 in numpy reshape mean? [duplicate]

Posted 2020-02-10 06:44

Question:

 I have a NumPy array (A) of shape (100000, 28, 28)
 I reshape it using A.reshape(-1, 28*28)

This is very common in machine learning pipelines. How does it work? I have never understood the meaning of '-1' in reshape.

This exact question has been asked before, but without a solid explanation. Any answers, please?

Answer 1:

It means that the size of the dimension for which you passed -1 is inferred. Thus,

A.reshape(-1, 28*28)

means "reshape A so that its second dimension has a size of 28*28, and calculate the correct size of the first dimension".
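As a minimal sketch of that inference (using a hypothetical stand-in for A with 100 images instead of 100000, just to keep the example small), the first dimension comes out as the total item count divided by 28*28:

```python
import numpy as np

# Hypothetical stand-in for A: 100 images rather than 100000.
A = np.zeros((100, 28, 28))

# The second dimension is fixed at 784; NumPy infers the first one.
flat = A.reshape(-1, 28 * 28)

print(flat.shape)  # (100, 784)
```

With the original (100000, 28, 28) array the same call would give (100000, 784), i.e. one flattened row per image.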

See documentation of reshape.



Answer 2:

In NumPy, creating a 100x100 matrix looks like this:

import numpy as np
x = np.ndarray((100, 100))
x.shape  # outputs: (100, 100)

NumPy internally stores all 10000 items in one flat buffer of 10000 items, regardless of the object's shape. This allows us to change the array's shape to any dimensions, as long as the total number of items does not change.
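A small sketch of the flat storage idea, assuming a freshly created (contiguous) array: flattening it gives a 10000-item view onto the same memory rather than a copy. For non-contiguous arrays NumPy may have to copy instead, so this is an illustration, not a guarantee.

```python
import numpy as np

x = np.zeros((100, 100))

# ravel() exposes the underlying items as one flat sequence.
flat = x.ravel()

print(x.size)                     # 10000
print(flat.shape)                 # (10000,)
print(np.shares_memory(x, flat))  # True
```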

For example, reshaping our object to 10x1000 is fine, since we keep the 10000 items:

x = x.reshape(10, 1000)

Reshaping to 10x2000 won't work, because that shape needs 20000 items and we only have 10000:

x.reshape(10, 2000)
ValueError: total size of new array must be unchanged
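To see this failure without stopping a script, the call can be wrapped in a try/except, as in the sketch below. The exact wording of the error message depends on the NumPy version, so the text printed may differ from the one quoted above.

```python
import numpy as np

x = np.zeros((100, 100))  # 10000 items

try:
    x.reshape(10, 2000)   # would need 20000 items
    failed = False
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```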

So, back to the -1 question: it is the notation for an unknown dimension, meaning: let NumPy fill in the missing dimension with the correct value so that my array keeps the same number of items.

so this:

x = x.reshape(10, 1000)

is equivalent to this:

x = x.reshape(10, -1) 

Internally, NumPy just calculates 10000 / 10 to get the missing dimension.
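The same arithmetic can be written out by hand, as in this sketch (the names `known` and `inferred` are only for illustration). It also shows that the division has to come out exact: if the known dimensions do not divide the item count, the reshape fails.

```python
import numpy as np

x = np.zeros((100, 100))

known = 10
inferred = x.size // known  # 10000 // 10 = 1000

print(x.reshape(known, -1).shape == (known, inferred))  # True

# 10000 is not divisible by 3, so no size fits the unknown dimension.
try:
    x.reshape(3, -1)
    divisible = True
except ValueError:
    divisible = False

print(divisible)  # False
```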

The -1 can also be at the start of the shape or in the middle.

The two examples above are equivalent to this:

x = x.reshape(-1, 1000)
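For the "in the middle" case, a three-dimensional target shape is needed. A short sketch, with the sizes 10 and 50 chosen arbitrarily for illustration:

```python
import numpy as np

x = np.zeros((100, 100))  # 10000 items

start = x.reshape(-1, 1000)     # unknown dimension first
middle = x.reshape(10, -1, 50)  # unknown dimension in the middle: 10000 / (10 * 50)

print(start.shape)   # (10, 1000)
print(middle.shape)  # (10, 20, 50)
```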

If we try to mark two dimensions as unknown, NumPy raises an exception, because it cannot know what we mean: there is more than one way to reshape the array.

x = x.reshape(-1, -1)
ValueError: can only specify one unknown dimension
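A sketch of why this is ambiguous: several pairs of sizes multiply to 10000, so NumPy has no single answer to pick. The pairs listed here are just a few examples, and the exception is caught so the snippet runs to completion.

```python
import numpy as np

x = np.zeros((100, 100))

# A few of the many shapes that (-1, -1) could stand for.
candidates = [(10, 1000), (100, 100), (2500, 4)]
print(all(rows * cols == x.size for rows, cols in candidates))  # True

try:
    x.reshape(-1, -1)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```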