Consider the following code using NumPy arrays, which is very slow:
import numpy as np

# Intersection of an octree and a trajectory
def intersection(octree, trajectory):
    # Initialize numpy arrays
    ox = octree.get("x")
    oy = octree.get("y")
    oz = octree.get("z")
    oe = octree.get("extent") / 2
    tx = trajectory.get("x")
    ty = trajectory.get("y")
    tz = trajectory.get("z")
    result = np.zeros(np.size(ox))
    # Loop over every (trajectory point, octree cell) pair
    for i in range(0, np.size(tx)):
        for j in range(0, np.size(ox)):
            # Count the point if it lies strictly inside the cell's cube
            if (tx[i] > ox[j] - oe[j] and
                    tx[i] < ox[j] + oe[j] and
                    ty[i] > oy[j] - oe[j] and
                    ty[i] < oy[j] + oe[j] and
                    tz[i] > oz[j] - oe[j] and
                    tz[i] < oz[j] + oe[j]):
                result[j] += 1
    # Finalize
    return result
How can I rewrite the function to speed up the calculation? (Here np.size(tx) == 10000 and np.size(ox) == 100000.)
You are allocating 10000 lists of size 100000. The first thing to do would be to stop using range for the nested j loop and use the generator version xrange instead (this applies to Python 2; in Python 3, range is already lazy). This will save you the time and space of allocating all those lists. The next step would be to use vectorized operations:
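A minimal sketch of what that vectorized version could look like, assuming the octree and trajectory objects behave as in the question; the answer did not include code, so the function name intersection_vectorized, the block size, and the chunking over trajectory points are illustrative choices, not the answerer's implementation:

import numpy as np

def intersection_vectorized(octree, trajectory, block=500):
    # Same array setup as in the question
    ox = octree.get("x")
    oy = octree.get("y")
    oz = octree.get("z")
    oe = octree.get("extent") / 2
    tx = trajectory.get("x")
    ty = trajectory.get("y")
    tz = trajectory.get("z")
    result = np.zeros(ox.size)
    # Process the trajectory in blocks; memory use scales with block * ox.size
    for start in range(0, tx.size, block):
        sl = slice(start, start + block)
        # Boolean mask of shape (block, n_cells): True where point i is inside cell j
        inside = ((np.abs(tx[sl, None] - ox[None, :]) < oe) &
                  (np.abs(ty[sl, None] - oy[None, :]) < oe) &
                  (np.abs(tz[sl, None] - oz[None, :]) < oe))
        # Count, per cell, how many points of this block fall inside it
        result += inside.sum(axis=0)
    return result

Broadcasting the full 10000 x 100000 comparison in one shot would need temporaries on the order of gigabytes, hence the blocking; the strict < comparisons match the question's > and < tests.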
I think the following rewrite will give the same result as the double loop and be faster (see the sketch after the steps below). To get it: 1) reorder the loops (i.e., swap them), which is valid since nothing changes within the loops; 2) pull result[j] outside the i loop; 3) convert all the t > ox - oe and t < ox + oe tests to abs(t - ox) < oe (though this may not be a huge speedup, it's easier to read). Since you don't have runnable code, and I didn't want to build a test for this, I'm not 100% sure this is correct.
You're likely to get good results by running this code under PyPy: http://pypy.org/ (instructions for our NumPy integration at https://bitbucket.org/pypy/numpy)