Recursive method 10x slower than interative [close

2019-07-03 02:14发布

问题:

The code is cleaned as far as i could to show my problem. Basically its an octree search+intersect. the intersect function comes from an SDK (the whole project is a plugin for rhino). If i make a loop with intersection calls, its 10 times faster than the recursive search through the octree. Stranger even - i isolated the timing of the intersect call - and inside the recursive it was 8 times slower than in a loop. There could be probably 1000 reasons why it behaves like this, but i hope i made some obvious mistake someone can spot by looking over the code.

there is the slow recusive piece:

public void NewRayCast()
{            
    int runs = 500000;                          //how many rays we cast
    Point3d raypos = new Point3d(0, 0, 0);      //raystart
    Ray3d ray = new Ray3d();                

    Random r = new Random();                    //here we create targets to scatter the ray directions
    Vector3d[] targets = new Vector3d[runs];
    for (int i = 0; i < runs; i++)
    {
        targets[i] = new Vector3d(r.NextDouble() * 200 - 100, 500, r.NextDouble() * 200 - 100);
    }

    for (int i = 0; i < runs; i++)
    {
        boxes = new List<int>();            // this collects the octree leaves the rays hits
        rayline.From = ray.Position;
        rayline.To = ray.Position + ray.Direction;                
        LineBoxer(starter);                 // this starts the raycasting - starter is a array with 1 value (the scene bounding box)
    }            
}

public void LineBoxer(int[] check)    // Cast a ray against boxes
{            
    for (int i = 0; i < check.Length; i++)  // check only because the first call is with the scene box (1 index)
    {
        if (octmanB.node[check[i]].closed == false) // if node is a dead end > empty we skip it
        {                                       
            isect = Rhino.Geometry.Intersect.Intersection.LineBox(rayline, octmanB.node[check[i]].bbox, 0, out ival); // intersection function, changing to an arbitrary bounding box doesnt speed it up either                    
            if (isect == true)
            {                        
                if (octmanB.node[check[i]].leaf == false)   // continue with recursion
                {
                    LineBoxer(octmanB.node[check[i]].child);
                }
                else
                {
                    boxes.Add(check[i]);    // here we have a leaf
                }
            }
        }
    }
}

and here the fast one:

public void FasterTestRun()
{
    int runs = 500000;                       
    Line line = new Line(new Point3d(1, 1, 1), new Point3d(0, 1000, 0));
    BoundingBox box = new BoundingBox(new Point3d(-50, 50, -50), new Point3d(50, 150, 50));

    bool isect;
    Interval v = new Interval();

    Random r = new Random();
    Point3d[] targets = new Point3d[runs];
    for (int i = 0; i < runs; i++)
    {
        targets[i] = new Point3d(r.NextDouble() * 20 - 10, 1000, r.NextDouble() * 20 - 10);
    }

    for (int i = 0; i < runs; i++)
    {
        line.To = targets[i];                
        isect = Rhino.Geometry.Intersect.Intersection.LineBox(line, box, 0, out v);                
    }            
}

thanks!

回答1:

Now that I am at home I can finally answer your question, so let's begin...

Recursion

First of all: Recursion is always slower than iteration, if you are working with structured programming languages. You cannot generalize this, since function calls in functional programming languages are faster (functions are differently defined there). For more information Wikipedia is a good source.

In detail a recursive call pushes all local variables the function (or method) allocated onto the stack, wait's until the inner call returns (this includes the same procedure on and on and on...) and finally pops the values from the call-stack and continues working with them. This is not only heavy memory load, this is also pain for the garbage collector: the longer your function must wait, the longer your objects last in memory, aging and finally reaching gen1 or gen2. Which means it takes actually long until they get released.

Another problem I can see is the following:

public void LineBoxer(int[] check)
{
    // ...
    LineBoxer(octmanB.node[check[i]].child);
    // ...
}

Passing arrays recursively means that all of the values of the array reside on the stack for that long time. Even if most of the elements are ready to be released!

If you are doing the same thing iteratively there is no stressing for the stack. The variables allocated are pretty often temporary variables and can get released quite quickly. And memory-allocation is the bottleneck in here. This,(and because you asked for it in the comments) is why I will continue going a little bit more into detail.

Improving performance - in theory

However (in the comments) you are also asking on how to handle the raytracing more efficently. Basicly you are right with using an octree. The basic operation you want to perform is a simple search. And there's your problem. From as far as I understand your code, you are testing each and every octree leaf if it got hit or not:

public void LineBoxer(int[] check)    // Cast a ray against boxes
{            
    for (int i = 0; i < check.Length; i++)
    {
        // ...
    }
}

That's simply searching all for all boxes that intersect with your ray. But that's not the motivation of introducing trees. You can imagine an octree like an binary search tree that's extented to 3 dimensions. A binary search tree can search entries in 1 dimension (e.g. a list or array). To search for information in two-dimensional constructs you can use quadtrees. And now we need to add another dimension (because we are in 3D now), so we use octrees.

So far so good, but how can trees help us to perform search operations "better"?

That's because of the divide and conquer priciple. If we are searching for something specific in a larger set of information we divide the set into small pieces. We do this just as long until we found the specific thing we are looking for.

For our octree this means that we divide a cube into 8 smaller cubes. Now we test each box if our ray intersects with it. In the best case it intersects with exactly one box. That's the one to check further. But if each box contains 1000 boxes we simply get rid of 7000 checks by one additional check!

Now we do this again and again until we found one or more leafs. Doing this recursively should not be much slower than doing this iteratively. Imagine an octree with 100.000 nodes. The first cube can store 8 cubes, those 8 cubes can store 64 cubes (depth: 2!) and so on...

  • Depth = 3 : 512 cubes
  • Depth = 4 : 4.096 cubes
  • Depth = 5 : 32.768 cubes
  • Depth = 6 : 262.144 cubes
  • Depth = 7 : 2.097.152 cubes
  • ...

And our maximum number of checks is never more than 8 x Depth elements if we are searching for one specific box. So we increased our algorithm performance from O(n) to O(log(n)). 1

Fine, but how to apply this to your problem?

First of all: You are working with C# - an object orientated language. So use objects! It is pretty simple to apply the divide and conquer principle to your tree structure if you are encapsulating everything within a object structure.

In your specific case this means:

public class OctreeNode
{
    public bool IsLeaf { get; private set; }
    public OctreeNode[8] Children { get; private set; }

    public OctreeNode()
    {
        this.IsLeaf = true;
        this.Children = null;
    }

    // Don't forget to implement methods to build up the tree and split the node when required.
    // Splitting results in a tree structure. Inside the split-method 
    // you need to allocate the Children-Array and set the IsLeaf-Property to false.

    public bool Intersects(Line rayline, ref ICollection<OctreeNodes> Nodes)
    {
        Interval interval;

        // If the current node does not intersect the ray, then we do not need to
        // investigate further on from here.
        if (!Rhino.Geometry.Intersect.Intersection.LineBox(rayline, this.GetBoundingBox(), 0, out interval))
        {
            return false;
        }

        // If this is a leaf (in which we are interested in) we add it to 
        // the nodes-collection.
        if (this.IsLeaf)
        {
            Nodes.Add(this);
            return true;
        }

        // Not a leaf, but our box intersects with the ray, so we need to investigate further.

        for (int i = 0; i < 8; ++i)
        {
            // Recursive call here, but the result doesn't actually matter.
            this.Children[i].Intersects(rayline, Nodes)
        }

        // The ray intersects with our current node.
        return true;
    }
}

This will do all the magic for you! It tests the tree only until the depth where the test fails and continues until you have all leafs that the ray intersects with. You can also sort them in a way on "who got the logest intersection interval", to bring the objects inside into a higher priority when streaming them, but since I am totally failing the topic now I will continue:

You can even further speed up this algorithm by applying parallelism. This is pretty easy using TPL, which is pretty simple in here:

Parallel.For<int> (0; 8; i =>
{
    this.Children[i].Intersects(rayline, Nodes)
});

Okay, that's enought for the moment, I think. I hope this helps you and some more people around there. :)

Note: I am not very familiar with rhino3d. Maybe there is built-in functionality that can help you solving your problem even better!

Note 2: Forgive me, when I am not 100% correct on all points, I haven't done those theoretical considerations for a while...

1 In best case, where we are searching for one specific leaf in a completely balanced tree.