I am having trouble understanding how the numpy dot function and broadcasting work. Below is the snippet I am trying to understand:
a=np.array([[1,2],[3,5]])
If we check the shape of a
a.shape
it will be (2,2)
b=np.array([3,6])
and b.shape is (2,)
Question 1: Is b a column vector or a row vector? From the way it is entered, b looks like a row vector, but its shape seems to show it as a column vector with 2 rows. What is wrong with my understanding?
Now if I do
a.dot(b)
it results in
array([15,39])
Question 2: For matrix multiplication, if a is m*n then b must be n*k. Since a is 2*2, b must be 2*1. Does this confirm that b is a column vector? If it were a row vector, the matrix multiplication should not be possible, yet the output of the dot product matches matrix multiplication with b treated as a column vector and broadcast.
Now b.dot(a)
is also possible and results in
array([21,36])
and this blew my mind. How is the compatibility of the vector for matrix multiplication checked, and how is the result calculated?
In at least one of the two cases, I would expect an error for incompatible dimensions. But no error is raised, and a result is computed in both cases.
Because of the way numpy is designed, a 1D array, shape=(n,), is treated as neither a column nor a row vector, but can act like either one depending on its position in a dot product. To illustrate this, compare the behaviour with a non-square array and with a square array:
>>> import numpy
>>> a = numpy.arange(3)
>>> a.shape = (1, 3)
>>> a
array([[0, 1, 2]])
>>> b = numpy.arange(9)
>>> b.shape = (3, 3)
>>> b
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
Then define a (3,) vector:
>>> c = numpy.arange(3)
>>> c
array([0, 1, 2])
>>> c.shape
(3,)
In ordinary linear algebra, if c were a column vector, we would expect a.c to produce a scalar (a 1x3 matrix times a 3x1 column vector) and c.a to produce a 3x3 matrix (a 3x1 column times a 1x3 row). In numpy, you will find that a.dot(c)
will produce a (1,) array (holding the scalar we expect), but c.dot(a)
will raise an error:
>>> d = a.dot(c)
>>> d.shape
(1,)
>>> e = c.dot(a)
ValueError: shapes (3,) and (1,3) not aligned: 3 (dim 0) != 1 (dim 0)
What has gone wrong is that numpy has checked the only dimension of c against the first dimension of a; c has no second dimension of length 1 that could be matched against a, as a true 3x1 column would. According to numpy, a 1D array has only one dimension, and all checks are done against that dimension. Because of this, 1D arrays don't act strictly as either a column or a row vector. E.g. b.dot(c)
checks the second dimension of b against the one dimension of c (c acts like a column vector) and c.dot(b)
checks the one dimension of c against the first dimension of b (c acts like a row vector). Therefore, they both work:
>>> f = b.dot(c)
>>> f
array([ 5, 14, 23])
>>> g = c.dot(b)
>>> g
array([15, 18, 21])
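The dimension check described above can be sketched as a small helper. This is only an illustration of the rule for 1D and 2D inputs: `dot_compatible` is a hypothetical name, not a numpy function, and it is not how numpy implements the check internally.

```python
import numpy as np

def dot_compatible(x, y):
    """Hypothetical helper: mimic the shape check of x.dot(y) for 1D/2D arrays."""
    x, y = np.asarray(x), np.asarray(y)
    # The last axis of x is matched against the second-to-last axis of y,
    # or against the only axis of y when y is 1D.
    y_axis = 0 if y.ndim == 1 else -2
    return x.shape[-1] == y.shape[y_axis]

a = np.arange(3).reshape(1, 3)   # shape (1, 3)
b = np.arange(9).reshape(3, 3)   # shape (3, 3)
c = np.arange(3)                 # shape (3,)

print(dot_compatible(a, c))      # (1, 3) with (3,): compares 3 and 3
print(dot_compatible(c, a))      # (3,) with (1, 3): compares 3 and 1
print(dot_compatible(b, c), dot_compatible(c, b))
```

Under this rule the 1D array c passes on either side of the square array b, which is why neither order raises an error there.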
To avoid this ambiguity, give your array an explicit second dimension so that it is a true row or column vector. In this example, you would set c.shape=(3,1)
for a column vector or c.shape=(1,3)
for a row vector.
>>> c.shape = (3, 1)
>>> c.dot(a)
array([[0, 0, 0],
       [0, 1, 2],
       [0, 2, 4]])
>>> h = c.dot(b)
ValueError: shapes (3,1) and (3,3) not aligned: 1 (dim 1) != 3 (dim 0)
>>> c.shape = (1, 3)
>>> i = c.dot(b)
>>> i
array([[15, 18, 21]])
The point to take from this is that:
According to numpy, row and column vectors have two dimensions
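Applied to the arrays from the question, a minimal sketch of making the orientation explicit with reshape might look like this (the names `b_col` and `b_row` are only for illustration):

```python
import numpy as np

a = np.array([[1, 2], [3, 5]])
b = np.array([3, 6])          # shape (2,): neither row nor column

b_col = b.reshape(2, 1)       # explicit column vector, shape (2, 1)
b_row = b.reshape(1, 2)       # explicit row vector, shape (1, 2)

print(a.dot(b_col).shape)     # (2, 2) times (2, 1)
print(b_row.dot(a).shape)     # (1, 2) times (2, 2)

# With explicit 2D shapes, the mismatched order is rejected
# instead of silently working as it does for the 1D array.
try:
    b_col.dot(a)              # (2, 1) times (2, 2): 1 != 2
except ValueError as err:
    print("ValueError:", err)
```

The results keep their second dimension, so the column product has shape (2, 1) and the row product has shape (1, 2).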
Note: the question originally had a=np.array([[1,2],[3,5])
which I changed to a=np.array([[1,2],[3,5]])
so that it runs.
A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
To answer your question: b.shape is (2,), which means b is a one-dimensional array with 2 elements. It is neither a row vector nor a column vector.
a = np.array([1, 2, 3])
a.shape
(3,)  # a one-dimensional array with 3 elements
Dot operator:
numpy.dot
Example:
np.dot(2, 4)
8
Another example with 2D array:
>>> a = [[1, 0], [0, 1]]
>>> b = [[4, 1], [2, 2]]
>>> np.dot(a, b)
array([[4, 1],
       [2, 2]])
The dot function is used to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices.
Dot is available both as a function in the numpy module and as an instance method of array objects.
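For instance, a quick sketch with the arrays from the question (the names `as_function` and `as_method` are only for illustration); both spellings should give the same result:

```python
import numpy as np

a = np.array([[1, 2], [3, 5]])
b = np.array([3, 6])

as_function = np.dot(a, b)   # module-level function
as_method = a.dot(b)         # method on the array object

print(np.array_equal(as_function, as_method))
```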
For 2-D arrays it is equivalent to matrix multiplication, and for 1-D
arrays to inner product of vectors (without complex conjugation). For
N dimensions it is a sum product over the last axis of a and the
second-to-last of b:
How do they calculate?
b.dot(a) is also possible and results in array([21,36]) and this blew my mind. How are they checking the compatibility of the vector for matrix multiplication and how do they calculate?
This is the basic matrix product, with the 1D array b acting as a row vector on the left.
>>> a
array([[1, 2],   # 2D array
       [3, 5]])
>>> b
array([3, 6])    # 1D array
b.dot(a) = [3*1 + 6*3, 3*2 + 6*5] = [21, 36]
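Both results from the question can be reproduced by writing the sums out by hand. This is a sketch for checking the arithmetic, not how numpy computes it internally; `a_dot_b` and `b_dot_a` are illustrative names.

```python
import numpy as np

a = np.array([[1, 2], [3, 5]])
b = np.array([3, 6])

# a.dot(b): each row of a is multiplied element-wise with b and summed.
a_dot_b = [int(sum(a[i, k] * b[k] for k in range(2))) for i in range(2)]

# b.dot(a): b is multiplied element-wise with each column of a and summed.
b_dot_a = [int(sum(b[k] * a[k, j] for k in range(2))) for j in range(2)]

print(a_dot_b, a.dot(b).tolist())
print(b_dot_a, b.dot(a).tolist())
```

In the first case b is paired with the rows of a, in the second with its columns, which is why the two orders give different answers.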