segmentation fault for deep recursion upon going f

I have simple recursive program to find connected subgraphs. The code works by traversing, from each unvisited vertex in the graph, all the edges (recursively) and marking those which have been visited with 'ingraph=False' in the process. (the graphs are always undirected unweighted graphs).

The problem I have is that for large graphs (with a subgraph of ~100,000 vertices) python(-2.7) returns a segmentation fault. But this worked just fine in python-2.6 (and still does).

Can someone explain to me what has changed between two (or maybe it's something else entirely)? Is there a way to fix it with python-2.7 (which preferably also doesn't break upon migrating to python-3 at some point)? Or should I rewrite it in a non-recursive way?

Thanks!

here's the source update: see update 3 below for a nonrecursive solution

def floodfill(g):
  for v in g.vertices: v.ingraph = True
  sys.setrecursionlimit(g.size)
  subgraph = 1
  for v in g.vertices:
    if v.ingraph:
      v.ingraph = False
      v.subgraph = subgraph
      g.floodfill_search(v,subgraph)
      subgraph += 1

def floodfill_search(g,v,subgraph):
  for n in v.neighbors:
    if n.ingraph:
      n.ingraph = False
      n.subgraph = subgraph
      g.floodfill_search(n,subgraph)

------ UPDATE --------

I made a small recursion test which gives recursions limit of ~16,000, ~24,000 and ~28,000 for 3 different computers. Moreover the result is not even constant for one pc. Running the test twice gives slightly different limits. For example for the second I find results between 23800 and 23819.

#!/usr/bin/python
import sys
def callme(i):
  print(i)
  i+=1
  callme(i)

sys.setrecursionlimit(int(1e6))
callme(0)

I don't really get which 'C stack' is referred to, as far I can tell there is no default 'stack' implemented in C. In C++ there are stacks but it doesn't have the same limitations. The following C++ example runs fine (at least up to 1M pushes).

#include <iostream>   
#include <stack>
using namespace std;

int main () {
  stack<int> s;
  for(int i=0;i<1000000;i++) {
    cout << "push " << i << endl;
    s.push(i);
  }
}

The following C code also goes much deeper (about 10x, ~262,000)

#include "stdio.h"
void callme(i) { 
  printf("calling %d\n",i);
  callme(++i);
}

int main() {
  int i=0;
  callme(i);
}

---- UPDATE 2 -----

Ok, this is python's intention apparently. Forcing programmers to avoid deep recursion. http://neopythonic.blogspot.ch/2009/04/tail-recursion-elimination.html

In any case I now think it is better to rewrite it iteratively. But then I will probably start over in C++, using some graph theory library like the boost graph library. If I am going to have to rewrite it anyway, I might as well do it properly.

Nevertheless, I would still appreciate any comments to understand why this happens at these specific sizes.

---- UPDATE 3 -----

Here's at least a quick and dirty python rewrite. Dirty because it's O(N^2) because of the last line. There should be a nicer O(N) solution by keeping track a list of which vertices have not been visited, but didn't see it so quickly and this works for my applications.

def floodfill_nonrecursive(g):
    for v in g.vertices: v.ingraph = True
    start = g.vertices
    subg = 1
    while start:
      q = [start[0]]
      while q:
        v = q.pop()
        v.ingraph = False
        v.subgraph = subg
        for n in v.neighbors:
          if n.ingraph:
            n.ingraph = False
            q.append(n)
      subg += 1
      start = [v for v in g.vertices if v.ingraph]

回答1:

because your python use c stack, it's overflowed. setrecursionlimit can't extend cstack size. it just make limit to raise exception before cstack overflow. python's recursion has limited depth. success in 2.6 is just lucky case.

you should change your code from recursion to iterative style or use stackless python(or pypy) read docs.python.org/2/library/sys.html#sys.setrecursionlimit

回答2:

You probably overflow stack with deep recursion somewhere in Python implementation. You may try changing stack dept with sys.setrecursionlimit

Another possibility is that you exhaust dynamic memory. Recursion is normally more taxing.

You had more luck with Python 2.6. Previous version required less memory for your algorithm.

Python isn't a functional language and doesn't optimise tail recursion. Rewriting the algorithm iteratively may be a better approach.