I have three numpy arrays:
X
: a 3073 x 49000 matrix
W
: a 10 x 3073 matrix
y
: a 49000 x 1 vector
y
contains values between 0 and 9, each value represents a row in W
.
I would like to add the first column of X
to the row in W
given by the first element in y
. I.e. if the first element in y
is 3, add the first column of X
to the fourth row of W
. And then add the second column of X
to the row in W
given by the second element in y
and so on, until all columns of X
has been aded to the row in W
specified by y
, which means a total of 49000 added rows.
W[y] += X.T
does not work for me, because this will not add more than one vector to a row in W
.
Please note: I'm only looking for vectorized solutions. I.e. no for-loops.
EDIT: To clarify I'll add an example with small matrix sizes adapted from Salvador Dali's example below.
In [1]: import numpy as np
In [2]: a, b, c = 3, 4, 5
In [3]: np.random.seed(0)
In [4]: X = np.random.randint(10, size=(b,c))
In [5]: W = np.random.randint(10, size=(a,b))
In [6]: y = np.random.randint(a, size=(c,1))
In [7]: X
Out[7]:
array([[5, 0, 3, 3, 7],
[9, 3, 5, 2, 4],
[7, 6, 8, 8, 1],
[6, 7, 7, 8, 1]])
In [8]: W
Out[8]:
array([[5, 9, 8, 9],
[4, 3, 0, 3],
[5, 0, 2, 3]])
In [9]: y
Out[9]:
array([[0],
[1],
[1],
[2],
[0]])
In [10]: W[y.ravel()] + X.T
Out[10]:
array([[10, 18, 15, 15],
[ 4, 6, 6, 10],
[ 7, 8, 8, 10],
[ 8, 2, 10, 11],
[12, 13, 9, 10]])
In [11]: W[y.ravel()] = W[y.ravel()] + X.T
In [12]: W
Out[12]:
array([[12, 13, 9, 10],
[ 7, 8, 8, 10],
[ 8, 2, 10, 11]])
The problem is to get BOTH column 0 and column 4 in X added to row 0 in W, as well as both column 1 and 2 in X added to row 1 in W.
The desired outcome is thus:
W = [[17, 22, 16, 16],
[ 7, 11, 14, 17],
[ 8, 2, 10, 11]]
Vectorized approaches
Approach #1
Based on
this answer
, here's a vectorized solution usingnp.bincount
-Approach #2
You can make good usage of
boolean indexing
andnp.einsum
to get the job done in a concise vectorized manner -Loopy approaches
Approach #3
Since you are selecting and adding up a huge number of columns from
X
per uniquey
, it might be better in terms of performance to run a loop withcomplexity
equal to the number of such uniquey's
, which seems to be atmax
equal to the number of rows inW
and that in your case is just10
. Thus, the loop has just 10 iterations, not bad! Here's the implementation to fulfill those aspirations -Approach #4
You can bring in
np.einsum
to do the columnwise summations and have the final output like so -First the straight forward loop solution as reference:
An
add.at
solution:add.at
does an unbuffered calculation, getting around the buffering that preventsW[y.ravel()] += X.T
from working. It is still iterative, but the loop has been moved to compiled code. It isn't true vectorization because the order of application matters. The addition for one row ofX.T
depends on the results from the previous rows.https://stackoverflow.com/a/20811014/901925 is the answer I gave a couple of years ago to a similar question (for 1d arrays).
But when dealing with your large arrays:
this can run into speed issues. Note that
W[y.ravel()]
is the same size asX.T
(why did you pick these sizes that require transpose?). And it's a copy, not a view. So there's already a time penalty.bincount
has been suggested in previous questions, and I think it is faster. Making for loop with index arrays faster (both bincount and add.at solutions)Iterating over the small 3073 dimension could also have speed advantages. Or better yet on the size 10 dimension as
Divakar
demonstrates.For the small test case,
a,b,c=3,4,5
, theadd.at
solution is fastest, withDivakar's
bincount
andeinseum
next. For a largera,b,c=10,1000,20000
,add.at
gets very slow, withbincount
being the fastest.Related SO answers
https://stackoverflow.com/a/28205888/901925 (notes that
bincount
requires complete coverage fory
).https://stackoverflow.com/a/30041823/901925 (where
Divakar
again shows thatbincount
rules!)While I can't say that this is very pythonic, it is a solution (I think):
This will achieve what you want:
X + W[y.ravel()].T
To see that this really does the work, here is a reproducible example:
Now your matrices are:
And
W[y.ravel()]
gives you " W given by the first element in y". By transposing it, you will get a matrix ready to be added to X: