I'm trying to calculate a summed-area table of a feature count matrix using Python and NumPy. Currently I'm using the following code:
import numpy as np

def summed_area_table(img):
    # Each entry holds the sum of all img values above and to the left of it, inclusive.
    table = np.zeros_like(img).astype(int)
    for row in range(img.shape[0]):
        for col in range(img.shape[1]):
            if (row > 0) and (col > 0):
                table[row, col] = (img[row, col] +
                                   table[row, col - 1] +
                                   table[row - 1, col] -
                                   table[row - 1, col - 1])
            elif row > 0:
                table[row, col] = img[row, col] + table[row - 1, col]
            elif col > 0:
                table[row, col] = img[row, col] + table[row, col - 1]
            else:
                table[row, col] = img[row, col]
    return table
The above code takes about 35 seconds to perform the calculation on a 3200 x 1400 array. Is there a NumPy trick to speed up the computation? I realize the fundamental speed problem lies in the nested Python loops, but I don't know how to avoid them.
There's a NumPy function, cumsum, for cumulative sums. Applying it twice, once along each axis, yields the desired table, e.g. img.cumsum(axis=0).cumsum(axis=1).
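A minimal sketch of the double cumsum follows. The helper name summed_area_table_cumsum and the small test array are illustrative assumptions, not taken from the original post; the cast to int mirrors the table dtype used in the question.

```python
import numpy as np

def summed_area_table_cumsum(img):
    # Hypothetical helper name: cumulative sum down the rows (axis=0),
    # then across the columns (axis=1).
    return np.asarray(img).astype(int).cumsum(axis=0).cumsum(axis=1)

# Small illustrative input (assumed, not from the original post).
img = np.arange(12).reshape(3, 4)
table = summed_area_table_cumsum(img)
print(table)
# [[ 0  1  3  6]
#  [ 4 10 18 28]
#  [12 27 45 66]]
```

The bottom-right entry equals the sum of the whole array (66 here), which is a quick sanity check for any summed-area table.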
Performance analysis: (https://stackoverflow.com/a/25351344/3419103)
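One way such a comparison could be set up is sketched below. This is an assumed harness, not the benchmark from the linked answer: the function names sat_loop and sat_cumsum, the 200 x 150 test size, and the repeat count are all illustrative choices. No timings are quoted because they depend on the machine.

```python
import timeit
import numpy as np

def sat_loop(img):
    # Loop-based version, equivalent to the function in the question.
    table = np.zeros_like(img).astype(int)
    for row in range(img.shape[0]):
        for col in range(img.shape[1]):
            total = img[row, col]
            if row > 0:
                total += table[row - 1, col]
            if col > 0:
                total += table[row, col - 1]
            if row > 0 and col > 0:
                total -= table[row - 1, col - 1]
            table[row, col] = total
    return table

def sat_cumsum(img):
    # Vectorized version: cumulative sum along each axis in turn.
    return img.astype(int).cumsum(axis=0).cumsum(axis=1)

# Assumed test data: a small random matrix so the loop version finishes quickly.
rng = np.random.default_rng(0)
img = rng.integers(0, 10, size=(200, 150))

# Check that both versions agree before timing them.
same = np.array_equal(sat_loop(img), sat_cumsum(img))

repeats = 3
t_loop = timeit.timeit(lambda: sat_loop(img), number=repeats) / repeats
t_fast = timeit.timeit(lambda: sat_cumsum(img), number=repeats) / repeats

print(f"results match: {same}")
print(f"loop:   {t_loop:.6f} s per call")
print(f"cumsum: {t_fast:.6f} s per call")
```

To reproduce the case from the question, change size to (3200, 1400); expect the loop version to take a long time at that size.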