I'm looking for a good ray-octree intersection algorithm, which gives me the leafs the ray passes through in an iterative way. I'm planning on implementing it on the CPU, since I do not want to dive into CUDA just yet :)
At the moment, my Voxel raycaster just does 3D DDA (Amanatides/Woo version) on a non-hierarchic array of XxYxZ voxels. You can imagine that this is pretty costly when there's a lot of empty space, as demonstrated in the following picture (brighter red = more work :) ):
I've already figured out that there are two kinds of algorithms for this task: bottom-up, which works from the leafs back upwards, and top-down, which is basicly depth-first search.
I've already found Revelles' algorithm from 2000, called An efficient parametric algorithm for octree traversal, which looks interesting, but is quite old. This is a top-down algorithm.
The most popular bottom-up approach seems to be K. Sung, A DDA Octree Traversal Algorithm for Ray Tracing, Eurographics'91, North Holland-Elsevier, ISBN 0444 89096 3, p. 73-85. Problem is that most DDA Octree traversal algorithms expect the octree to be of equal depth, which I do not want - empty subtrees should just be a null pointer or something similar.
In the more recent literature about Sparse Voxel Octrees I've managed to read through, (most notably Laine's work on SVO's, they all seem to be based on some kind of GPU-implemented version of DDA (Amanatides/Woo style).
Now, here's my question: does anybody have any experience implementing a basic, no-frills ray-octree intersection algorithm? What would you recommend?
For the record, this is my implementation of the Revelles paper I ended up using:
#include "octree_traversal.h"
using namespace std;
unsigned char a; // because an unsigned char is 8 bits
int first_node(double tx0, double ty0, double tz0, double txm, double tym, double tzm){
unsigned char answer = 0; // initialize to 00000000
// select the entry plane and set bits
if(tx0 > ty0){
if(tx0 > tz0){ // PLANE YZ
if(tym < tx0) answer|=2; // set bit at position 1
if(tzm < tx0) answer|=1; // set bit at position 0
return (int) answer;
}
}
else {
if(ty0 > tz0){ // PLANE XZ
if(txm < ty0) answer|=4; // set bit at position 2
if(tzm < ty0) answer|=1; // set bit at position 0
return (int) answer;
}
}
// PLANE XY
if(txm < tz0) answer|=4; // set bit at position 2
if(tym < tz0) answer|=2; // set bit at position 1
return (int) answer;
}
int new_node(double txm, int x, double tym, int y, double tzm, int z){
if(txm < tym){
if(txm < tzm){return x;} // YZ plane
}
else{
if(tym < tzm){return y;} // XZ plane
}
return z; // XY plane;
}
void proc_subtree (double tx0, double ty0, double tz0, double tx1, double ty1, double tz1, Node* node){
float txm, tym, tzm;
int currNode;
if(tx1 < 0 || ty1 < 0 || tz1 < 0) return;
if(node->terminal){
cout << "Reached leaf node " << node->debug_ID << endl;
return;
}
else{ cout << "Reached node " << node->debug_ID << endl;}
txm = 0.5*(tx0 + tx1);
tym = 0.5*(ty0 + ty1);
tzm = 0.5*(tz0 + tz1);
currNode = first_node(tx0,ty0,tz0,txm,tym,tzm);
do{
switch (currNode)
{
case 0: {
proc_subtree(tx0,ty0,tz0,txm,tym,tzm,node->children[a]);
currNode = new_node(txm,4,tym,2,tzm,1);
break;}
case 1: {
proc_subtree(tx0,ty0,tzm,txm,tym,tz1,node->children[1^a]);
currNode = new_node(txm,5,tym,3,tz1,8);
break;}
case 2: {
proc_subtree(tx0,tym,tz0,txm,ty1,tzm,node->children[2^a]);
currNode = new_node(txm,6,ty1,8,tzm,3);
break;}
case 3: {
proc_subtree(tx0,tym,tzm,txm,ty1,tz1,node->children[3^a]);
currNode = new_node(txm,7,ty1,8,tz1,8);
break;}
case 4: {
proc_subtree(txm,ty0,tz0,tx1,tym,tzm,node->children[4^a]);
currNode = new_node(tx1,8,tym,6,tzm,5);
break;}
case 5: {
proc_subtree(txm,ty0,tzm,tx1,tym,tz1,node->children[5^a]);
currNode = new_node(tx1,8,tym,7,tz1,8);
break;}
case 6: {
proc_subtree(txm,tym,tz0,tx1,ty1,tzm,node->children[6^a]);
currNode = new_node(tx1,8,ty1,8,tzm,7);
break;}
case 7: {
proc_subtree(txm,tym,tzm,tx1,ty1,tz1,node->children[7^a]);
currNode = 8;
break;}
}
} while (currNode<8);
}
void ray_octree_traversal(Octree* octree, Ray ray){
a = 0;
// fixes for rays with negative direction
if(ray.direction[0] < 0){
ray.origin[0] = octree->size[0] - ray.origin[0];
ray.direction[0] = - ray.direction[0];
a |= 4 ; //bitwise OR (latest bits are XYZ)
}
if(ray.direction[1] < 0){
ray.origin[1] = octree->size[1] - ray.origin[1];
ray.direction[1] = - ray.direction[1];
a |= 2 ;
}
if(ray.direction[2] < 0){
ray.origin[2] = octree->size[2] - ray.origin[2];
ray.direction[2] = - ray.direction[2];
a |= 1 ;
}
double divx = 1 / ray.direction[0]; // IEEE stability fix
double divy = 1 / ray.direction[1];
double divz = 1 / ray.direction[2];
double tx0 = (octree->min[0] - ray.origin[0]) * divx;
double tx1 = (octree->max[0] - ray.origin[0]) * divx;
double ty0 = (octree->min[1] - ray.origin[1]) * divy;
double ty1 = (octree->max[1] - ray.origin[1]) * divy;
double tz0 = (octree->min[2] - ray.origin[2]) * divz;
double tz1 = (octree->max[2] - ray.origin[2]) * divz;
if( max(max(tx0,ty0),tz0) < min(min(tx1,ty1),tz1) ){
proc_subtree(tx0,ty0,tz0,tx1,ty1,tz1,octree->root);
}
}
The top-down works very well for me; the upper part of octree may be pointer based so big empty sub-volumes do not take memory; the lower part is more efficient to implement pointer-free... The time complexity to hit the wall is log2(N) (it's apparently the best case). The recursive implementation is quite simple so it is easier to optimize the code. All math can be effectively implemented via integer SSE operations - it takes around x30 CPU cycles to compute the new XYZ coordinates for every sub-volume jump. BTW, the public versions of octree traversals are good only for education - to master truly effective implementation may easily take several months...
Stefan