While working on the simulation of particle interactions, I stumbled across grid indexing in Morton-order (Z-order)(Wikipedia link) which is regarded to provide an efficient nearest neighbor cell search. The main reason that I've read is the almost sequential ordering of spatially close cells in memory.
Being in the middle of a first implementation, I can not wrap my head around how to efficiently implement the algorithm for the nearest neighbors, especially in comparison to a basic uniform grid.
Given a cell (x,y) it is trivial to obtain the 8 neighbor cell indices and compute the respective z-index. Although this provides constant access time to the elements, the z-index has either to be calculated or looked up in predefined tables (separate for each axis and OR'ing). How can this possibly be more efficient? Is it true, that accessing elements in an array A in an order say A[0] -> A1 -> A[3] -> A[4] -> ... is more efficient than in an order A[1023] -> A[12] -> A[456] -> A[56] -> ...?
I've expected that there exists a simpler algorithm to find the nearest neighbors in z-order. Something along the lines: find first cell of neighbors, iterate. But this can't be true, as this works nicely only within 2^4 sized blocks. There are two problems however: When the cell is not on the boundary, one can easily determine the first cell of the block and iterate through the cells in the block, but one has to check whether the cell is a nearest neighbor. Worse is the case when the cell lies on the boundary, than one has to take into account 2^5 cells. What am I missing here? Is there a comparatively simple and efficient algorithm that will do what I need?
The question in point 1. is easily testable, but I'm not very familiar with the underlying instructions that the described access pattern generates and would really like to understand what is going on behind the scenes.
Thanks in advance for any help, references, etc...
EDIT:
Thank you for clarifying point 1! So, with Z-ordering, the cache hit rate is increased on average for neighbor cells, interesting. Is there a way to profile cache hit/miss rates?
Regarding point 2: I should add that I understand how to build the Morton-ordered array for a point cloud in R^d where the index i = f(x1, x2, ..., xd) is obtained from bitwise interlacing etc. What I try to understand is whether there is a better way than the following naive ansatz to get the nearest neighbors (here in d=2, "pseudo code"):
// Get the z-indices of cells adjacent to the cell containing (x, y)
// Accessing the contents of the cells is irrelevant here
(x, y) \elem R^2
point = (x, y)
zindex = f(x, y)
(zx, zy) = f^(-1)(zindex) // grid coordinates
nc = [(zx - 1, zy - 1), (zx - 1, zy), (zx - 1, zy + 1), // neighbor grid
(zx , zy - 1), (zx, zy + 1), // coordinates
(zx + 1, zy - 1), (zx + 1, zy), (zx + 1, zy + 1)]
ni= [f(x[0], x[1]) for x in nc] // neighbor indices