How do I avoid changing the Stack Size AND avoid g

Posted 2019-04-07 03:27

Question:

I've been trying to find an answer to this question for a few hours now on the web and on this site, and I'm not quite there.

I understand that .NET allocates a 1 MB stack per thread by default, and that it's better to avoid a stack overflow by recoding than by forcing a larger stack size.

I'm working on a "shortest path" app that works great up to about 3000 nodes, at which point it overflows. Here's the method that causes problems:

    public void findShortestPath(int current, int end, int currentCost)
    {
        if (!weight.ContainsKey(current))
        {
            weight.Add(current, currentCost);
        }
        Node currentNode = graph[current];
        var sortedEdges = (from entry in currentNode.edges orderby entry.Value ascending select entry);
        foreach (KeyValuePair<int, int> nextNode in sortedEdges)
        {
            if (!visited.ContainsKey(nextNode.Key) || !visited[nextNode.Key])
            {
                int nextNodeCost = currentCost + nextNode.Value;
                if (!weight.ContainsKey(nextNode.Key))
                {
                    weight.Add(nextNode.Key, nextNodeCost);
                }
                else if (weight[nextNode.Key] > nextNodeCost)
                {
                    weight[nextNode.Key] = nextNodeCost;
                }
            }
        }
        visited.Add(current, true);
        foreach (KeyValuePair<int, int> nextNode in sortedEdges)
        {
            if (!visited.ContainsKey(nextNode.Key) || !visited[nextNode.Key])
            {
                findShortestPath(nextNode.Key, end, weight[nextNode.Key]);
            }
        }
    }//findShortestPath

For reference, the Node class has one member:

    public Dictionary<int, int> edges = new Dictionary<int, int>();

graph[] is:

    private Dictionary<int, Node> graph = new Dictionary<int, Node>();

I've tried to optimize the code so that it isn't carrying any more baggage than needed from one iteration (recursion?) to the next, but with a 100K-node graph in which each node has between 1 and 9 edges, it's going to hit that 1 MB limit pretty quickly.

Anyway, I'm new to C# and code optimization. If anyone could give me some pointers (not like this), I would appreciate it.

Answer 1:

A while back I explored this problem in my blog. Or, rather, I explored a related problem: how do you find the depth of a binary tree without using recursion? A recursive tree depth solution is trivial, but blows the stack if the tree is highly imbalanced.

My recommendation is to study ways of solving this simpler problem, and then decide which of them, if any, could be adapted to your slightly more complex algorithm.

Note that in these articles the examples are given entirely in JScript. However, it should not be difficult to adapt them to C#.

Here we start by defining the problem.

http://blogs.msdn.com/ericlippert/archive/2005/07/27/recursion-part-one-recursive-data-structures-and-functions.aspx

The first attempt at a solution is the classic technique that you'll probably adopt: define an explicit stack; use it rather than relying upon the operating system and compiler implementing the stack for you. This is what most people do when faced with this problem.

http://blogs.msdn.com/ericlippert/archive/2005/08/01/recursion-part-two-unrolling-a-recursive-function-with-an-explicit-stack.aspx
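As a rough illustration of the explicit-stack idea in C# (not code from the linked article), here is a minimal sketch for the tree-depth problem. The `TreeNode` and `TreeDepth` names are made up for this example; the point is only that the pending work sits in a heap-allocated `Stack<T>` instead of on the call stack.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal binary tree node, for illustration only.
public sealed class TreeNode
{
    public TreeNode Left;
    public TreeNode Right;
}

public static class TreeDepth
{
    // Returns the number of nodes on the longest root-to-leaf path.
    // No recursion: each pending (node, depth) pair is kept in a Stack<T>
    // that lives on the heap, so a badly unbalanced tree cannot overflow
    // the call stack.
    public static int Depth(TreeNode root)
    {
        if (root == null) return 0;

        int maxDepth = 0;
        var pending = new Stack<KeyValuePair<TreeNode, int>>();
        pending.Push(new KeyValuePair<TreeNode, int>(root, 1));

        while (pending.Count > 0)
        {
            KeyValuePair<TreeNode, int> item = pending.Pop();
            TreeNode node = item.Key;
            int depth = item.Value;
            if (depth > maxDepth) maxDepth = depth;

            // Instead of "calling" Depth on each child, queue the child up.
            if (node.Left != null)
                pending.Push(new KeyValuePair<TreeNode, int>(node.Left, depth + 1));
            if (node.Right != null)
                pending.Push(new KeyValuePair<TreeNode, int>(node.Right, depth + 1));
        }
        return maxDepth;
    }
}
```

Calling `TreeDepth.Depth(root)` on a tree that is just one long chain of left children should work for any size that fits in memory, which is exactly the case where the recursive version fails.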

The problem with that solution is that it's a bit of a mess. We can go even farther than simply making our own stack. We can make our own little domain-specific virtual machine that has its own heap-allocated stack, and then solve the problem by writing a program that targets that machine! This is actually easier than it sounds; the operations of the machine can be extremely high level.

http://blogs.msdn.com/ericlippert/archive/2005/08/04/recursion-part-three-building-a-dispatch-engine.aspx

And finally, if you are really a glutton for punishment (or a compiler developer) you can rewrite your program in Continuation Passing Style, thereby eliminating the need for a stack at all:

http://blogs.msdn.com/ericlippert/archive/2005/08/08/recursion-part-four-continuation-passing-style.aspx

http://blogs.msdn.com/ericlippert/archive/2005/08/11/recursion-part-five-more-on-cps.aspx

http://blogs.msdn.com/ericlippert/archive/2005/08/15/recursion-part-six-making-cps-work.aspx

CPS is a particularly clever way of moving the implicit stack data structure off the system stack and onto the heap by encoding it in the relationships between a bunch of delegates.
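To make that a little more concrete, here is one possible C# sketch of tree depth in continuation passing style. It is not taken from the linked articles, and all names (`TreeNode`, `CpsDepth`, `Step`, `Run`) are invented for the example. Because C# does not guarantee tail-call elimination, plain CPS would still grow the call stack, so this sketch assumes a small "trampoline" loop: every function returns the next step as a delegate instead of calling it.

```csharp
using System;

// Hypothetical minimal binary tree node, for illustration only.
public sealed class TreeNode
{
    public TreeNode Left;
    public TreeNode Right;
}

public static class CpsDepth
{
    // One unit of deferred work; invoking it yields the next unit,
    // or null when the whole computation has finished.
    public delegate Step Step();

    // CPS: instead of returning the depth, pass it to the continuation k.
    // Nothing recurses here; each branch just builds and returns a delegate.
    private static Step Depth(TreeNode t, Func<int, Step> k)
    {
        if (t == null) return () => k(0);
        return () => Depth(t.Left, left =>
               () => Depth(t.Right, right =>
               () => k(1 + Math.Max(left, right))));
    }

    // The trampoline: runs steps one at a time in a flat loop, so the call
    // stack stays shallow while the chain of continuations lives on the heap.
    public static int Run(TreeNode root)
    {
        int result = 0;
        Step step = Depth(root, d => { result = d; return null; });
        while (step != null) step = step();
        return result;
    }
}
```

The "stack" has not disappeared. It is now the chain of closures that each continuation holds onto, which is the sense in which CPS moves the stack onto the heap.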

Here are all of my articles on recursion:

http://blogs.msdn.com/ericlippert/archive/tags/Recursion/default.aspx



Answer 2:

The classic technique to avoid deep recursive stack dives is to simply avoid recursion by writing the algorithm iteratively and managing your own "stack" with an appropriate list data structure. Most likely you will need this approach here given the sheer size of your input set.



Answer 3:

You could convert the code to use a 'work queue' rather than being recursive. Something along the following pseudocode:

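    Queue<Task> work = new Queue<Task>();
    work.Enqueue(firstTask);   // seed with the starting task (placeholder name)
    while (work.Count != 0)
    {
        Task t = work.Dequeue();
        // ... process t ...
        foreach (Task more in t.MoreTasks)
            work.Enqueue(more);
    }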

I know that is cryptic, but it's the basic concept of what you'll need to do. Since you're only getting to 3000 nodes with your current code, you will at best get to 12~15k without any parameters. So you need to kill the recursion completely.
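Applied to the graph in the question, the work-queue idea might look like the sketch below. This is an assumption-laden illustration, not a drop-in replacement: `WorkQueueShortestPath` and `Costs` are made-up names, the graph is flattened to `Dictionary<int, Dictionary<int, int>>` (node, then neighbour, then edge cost) instead of using the `Node` class, and edge costs are assumed to be non-negative.

```csharp
using System;
using System.Collections.Generic;

public static class WorkQueueShortestPath
{
    // graph[node][neighbour] = edge cost, mirroring the question's
    // graph of Node objects whose edges are a Dictionary<int, int>.
    // Assumes non-negative edge costs (an assumption of this sketch).
    // Returns the cheapest known cost from start to every reachable node.
    public static Dictionary<int, int> Costs(
        Dictionary<int, Dictionary<int, int>> graph, int start)
    {
        var weight = new Dictionary<int, int> { { start, 0 } };
        var work = new Queue<int>();
        work.Enqueue(start);

        while (work.Count != 0)
        {
            int current = work.Dequeue();
            int currentCost = weight[current];

            Dictionary<int, int> edges;
            if (!graph.TryGetValue(current, out edges)) continue;

            foreach (KeyValuePair<int, int> edge in edges)
            {
                int nextCost = currentCost + edge.Value;
                int known;
                // Queue a neighbour only when a cheaper route to it was found;
                // this replaces the recursive call in the original method.
                if (!weight.TryGetValue(edge.Key, out known) || nextCost < known)
                {
                    weight[edge.Key] = nextCost;
                    work.Enqueue(edge.Key);
                }
            }
        }
        return weight;
    }
}
```

A node can be queued more than once here when a cheaper route turns up later. Swapping the plain queue for a priority queue ordered by cost (Dijkstra's algorithm) is the usual refinement, but the loop keeps the same shape and the call stack stays flat either way.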



Answer 4:

Is your Node a struct or a class? If it's the former, make it a class so that it's allocated on the heap instead of on the stack.



Answer 5:

I would first verify that you are actually overflowing the stack, that is, that you really see a StackOverflowException being thrown by the runtime.

If this is indeed the case, you have a few options:

  1. Modify your recursive function so that the .NET runtime can convert it into a tail-recursive function.
  2. Modify your recursive function so that it is iterative and uses a custom data structure rather than the managed stack.

Option 1 is not always possible, and assumes that the rules the CLR uses to generate tail-recursive calls will remain stable in the future. The primary benefit is that, when possible, tail recursion is a convenient way of writing recursive algorithms without sacrificing clarity.

Option 2 is more work, but is not sensitive to the implementation of the CLR and can be implemented for any recursive algorithm (whereas tail recursion may not always be possible). Generally, you need to capture and pass state information between iterations of some loop, together with information on how to "unroll" the data structure that takes the place of the stack (typically a List<> or Stack<>). One way of unrolling recursion into iteration is through the continuation-passing pattern.
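As a hedged sketch of Option 2, the loop below keeps one "frame" per pending call in a `Stack<T>`, where each frame is simply an enumerator over a node's neighbours. That preserves the depth-first visiting order the recursive version would have. The names (`UnrolledTraversal`, `DepthFirstOrder`, `NeighboursOf`) and the adjacency-list input are assumptions made for the example, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;

public static class UnrolledTraversal
{
    private static readonly List<int> NoNeighbours = new List<int>();

    // Hypothetical helper: a node missing from the map has no neighbours.
    private static IEnumerator<int> NeighboursOf(
        Dictionary<int, List<int>> neighbours, int node)
    {
        List<int> list;
        if (!neighbours.TryGetValue(node, out list)) list = NoNeighbours;
        return list.GetEnumerator();
    }

    // Visits nodes in the order a recursive depth-first walk would, but the
    // state of each "call" (how far it has got through its neighbours) is an
    // enumerator stored in a heap-allocated Stack<T>.
    public static List<int> DepthFirstOrder(
        Dictionary<int, List<int>> neighbours, int start)
    {
        var order = new List<int>();
        var visited = new HashSet<int>();
        var frames = new Stack<IEnumerator<int>>();

        visited.Add(start);
        order.Add(start);
        frames.Push(NeighboursOf(neighbours, start));

        while (frames.Count > 0)
        {
            IEnumerator<int> frame = frames.Peek();
            if (!frame.MoveNext())
            {
                frames.Pop();   // this "call" is done; resume its caller
                continue;
            }

            int next = frame.Current;
            if (visited.Add(next))   // Add returns false if already visited
            {
                order.Add(next);
                frames.Push(NeighboursOf(neighbours, next));   // the "recursive call"
            }
        }
        return order;
    }
}
```

The captured state here is just the enumerator position. A method with more local state would need a small frame class holding those locals as well.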

More resources on C# tail recursion:

Why doesn't .NET/C# optimize for tail-call recursion?

http://geekswithblogs.net/jwhitehorn/archive/2007/06/06/113060.aspx



Answer 6:

I would first make sure I know why I'm getting a stack overflow. Is it actually because of the recursion? The recursive method isn't putting much onto the stack. Maybe it's because of the storage of the nodes?

Also, BTW, I don't see the end parameter ever changing. That suggests it doesn't need to be a parameter, carried on each stack frame.