V is (n,p) numpy array typically dimensions are n~10, p~20000
The code I have now looks like
A = np.zeros(p)
for i in xrange(n):
for j in xrange(i+1):
A += F[i,j] * V[i,:] * V[j,:]
How would I go about rewriting this to avoid the double python for loop?
While Isaac's answer seems promising, as it removes those two nested for loops, you are having to create an intermediate array
M
which isn
times the size of your originalV
array. Python for loops are not cheap, but memory access ain't free either:If you now take both for a test ride:
So removing the for loops is costing you an 8x slowdown. Not a very good thing... At this point you may even want to consider whether a function taking about 3ms to evaluate requires any further optimization. IN case you do, there's a small improvement which can be had by using
np.einsum
:And now:
So that's roughly a 20% speed up that introduces code that may very well not be as readable as your for loops. I would say not worth it...
The expression can be written as
and thus you can sum it like this using the
np.newaxis
-construct:Here a 3D-array
X[i,j,p]
is constructed, and then the 2 first axes are summed, which results in a 1D arrayA[p]
. AdditionalyF
was multiplied with a triangular matrix to restrict the summation according to the problem.The difficult part about this is that you only want to take the sum of the elements with
j <= i
. If not for that then you could do the following:If
F
is symmetric (ifF[i,j] == F[j,i]
) then you can exploit the symmetry ofM
above as follows:That said, this is really not a great candidate for vectorization, as you have
n << p
and so yourfor
-loops are not going to have much effect on the speed of this computation.Edit: As Bill said below, you can just make sure that the elements of
F
that you don't want to use are set to zero first, and then theM.sum(0).sum(0)
result will be what you want.