As the title explains I have a very fundamental programming question which I have just not been able to grok yet. Filtering out all of the (extremely clever) "In order to understand recursion, you must first understand recursion." replies from various online threads I still am not quite getting it.
Understanding that when faced with not knowing what we don't know, we can tend to ask the wrong questions or ask the right questions incorrectly I will share what I "think" my question is in hopes that someone with a similar outlook can share some bit of knowledge that will help turn on the recursive light bulb for me!
Here is the function (the syntax is written in Swift):
func sumInts(a: Int, b: Int) -> Int {
if (a > b) {
return 0
} else {
return a + sumInts(a: a + 1, b: b)
}
}
We'll use 2 and 5 as our arguments:
println(sumInts(a: 2, b: 5))
Obviously the answer is 14. But I'm not clear on how that value is achieved.
These are my 2 hangups:
The function is called recursively until a condition is met. That condition is a > b. When this condition is met, return 0. At first glance, I would expect the return value to be 0 which is obviously incorrect.
Printing out the value of 'a' on each iteration yields a value which I would expect: 2, 3, 4, 5 (at which point 5+1 > b which meets the first condition: a > b) but I still don't see how the value of 14 is achieved.
My first thought is that something similar to the following is happening magically:
var answer = a;
answer += a+1 until a > b;
return answer;
So ruling out magic, I'm just not getting something. I would love to understand what's happening more than just implicitly.
If someone could kindly explain what technically happens during this kind of function and why the result isn't 0 and how, eventually, a + sumInts(a: a + 1, b: b) = 14
, I would be forever in your debt.
There are already a lot of good answers. Still I am giving a try.
When called, a function get a memory-space allotted, which is stacked upon the memory-space of the caller function. In this memory-space, the function keeps the parameters passed to it, the variables and their values. This memory-space vanishes along with the ending return call of the function. As the idea of stack goes, the memory-space of the caller function now becomes active.
For recursive calls, the same function gets multiple memory-space stacked one upon another. That's all. The simple idea of how stack works in memory of a computer should get you through the idea of how recursion happens in implementation.
Recursion is a tricky topic to understand and I don't think I can fully do it justice here. Instead, I'll try to focus on the particular piece of code you have here and try to describe both the intuition for why the solution works and the mechanics of how the code computes its result.
The code you've given here solves the following problem: you want to know the sum of all the integers from a to b, inclusive. For your example, you want the sum of the numbers from 2 to 5, inclusive, which is
When trying to solve a problem recursively, one of the first steps should be to figure out how to break the problem down into a smaller problem with the same structure. So suppose that you wanted to sum up the numbers from 2 to 5, inclusive. One way to simplify this is to notice that the above sum can be rewritten as
Here, (3 + 4 + 5) happens to be the sum of all the integers between 3 and 5, inclusive. In other words, if you want to know the sum of all the integers between 2 and 5, start by computing the sum of all the integers between 3 and 5, then add 2.
So how do you compute the sum of all the integers between 3 and 5, inclusive? Well, that sum is
which can be thought of instead as
Here, (4 + 5) is the sum of all the integers between 4 and 5, inclusive. So, if you wanted to compute the sum of all the numbers between 3 and 5, inclusive, you'd compute the sum of all the integers between 4 and 5, then add 3.
There's a pattern here! If you want to compute the sum of the integers between a and b, inclusive, you can do the following. First, compute the sum of the integers between a + 1 and b, inclusive. Next, add a to that total. You'll notice that "compute the sum of the integers between a + 1 and b, inclusive" happens to be pretty much the same sort of problem we're already trying to solve, but with slightly different parameters. Rather than computing from a to b, inclusive, we're computing from a + 1 to b, inclusive. That's the recursive step - to solve the bigger problem ("sum from a to b, inclusive"), we reduce the problem to a smaller version of itself ("sum from a + 1 to b, inclusive.").
If you take a look at the code you have above, you'll notice that there's this step in it:
This code is simply a translation of the above logic - if you want to sum from a to b, inclusive, start by summing a + 1 to b, inclusive (that's the recursive call to
sumInt
s), then adda
.Of course, by itself this approach won't actually work. For example, how would you compute the sum of all the integers between 5 and 5 inclusive? Well, using our current logic, you'd compute the sum of all the integers between 6 and 5, inclusive, then add 5. So how do you compute the sum of all the integers between 6 and 5, inclusive? Well, using our current logic, you'd compute the sum of all the integers between 7 and 5, inclusive, then add 6. You'll notice a problem here - this just keeps on going and going!
In recursive problem solving, there needs to be some way to stop simplifying the problem and instead just go solve it directly. Typically, you'd find a simple case where the answer can be determined immediately, then structure your solution to solve simple cases directly when they arise. This is typically called a base case or a recursive basis.
So what's the base case in this particular problem? When you're summing up integers from a to b, inclusive, if a happens to be bigger than b, then the answer is 0 - there aren't any numbers in the range! Therefore, we'll structure our solution as follows:
Now, compare this pseudocode to your actual code:
Notice that there's almost exactly a one-to-one map between the solution outlined in pseudocode and this actual code. The first step is the base case - in the event that you ask for the sum of an empty range of numbers, you get 0. Otherwise, compute the sum between a + 1 and b, then go add a.
So far, I've given just a high-level idea behind the code. But you had two other, very good questions. First, why doesn't this always return 0, given that the function says to return 0 if a > b? Second, where does the 14 actually come from? Let's look at these in turn.
Let's try a very, very simple case. What happens if you call
sumInts(6, 5)
? In this case, tracing through the code, you see that the function just returns 0. That's the right thing to do, to - there aren't any numbers in the range. Now, try something harder. What happens when you callsumInts(5, 5)
? Well, here's what happens:sumInts(5, 5)
. We fall into theelse
branch, which return the value of `a + sumInts(6, 5).sumInts(5, 5)
to determine whatsumInts(6, 5)
is, we need to pause what we're doing and make a call tosumInts(6, 5)
.sumInts(6, 5)
gets called. It enters theif
branch and returns0
. However, this instance ofsumInts
was called bysumInts(5, 5)
, so the return value is communicated back tosumInts(5, 5)
, not to the top-level caller.sumInts(5, 5)
now can compute5 + sumInts(6, 5)
to get back5
. It then returns it to the top-level caller.Notice how the value 5 was formed here. We started off with one active call to
sumInts
. That fired off another recursive call, and the value returned by that call communicated the information back tosumInts(5, 5)
. The call tosumInts(5, 5)
then in turn did some computation and returned a value back to the caller.If you try this with
sumInts(4, 5)
, here's what will happen:sumInts(4, 5)
tries to return4 + sumInts(5, 5)
. To do that, it callssumInts(5, 5)
.sumInts(5, 5)
tries to return5 + sumInts(6, 5)
. To do that, it callssumInts(6, 5)
.sumInts(6, 5)
returns 0 back tosumInts(5, 5).</li> <li>
sumInts(5, 5)now has a value for
sumInts(6, 5), namely 0. It then returns
5 + 0 = 5`.sumInts(4, 5)
now has a value forsumInts(5, 5)
, namely 5. It then returns4 + 5 = 9
.In other words, the value that's returned is formed by summing up values one at a time, each time taking one value returned by a particular recursive call to
sumInts
and adding on the current value ofa
. When the recursion bottoms out, the deepest call returns 0. However, that value doesn't immediately exit the recursive call chain; instead, it just hands the value back to the recursive call one layer above it. In that way, each recursive call just adds in one more number and returns it higher up in the chain, culminating with the overall summation. As an exercise, try tracing this out forsumInts(2, 5)
, which is what you wanted to begin with.Hope this helps!
Let me tell you with an example of Fibonacci series, Fibonacci is
so let see how recursion works, I just replace
n
int(n)
withn-1
and so on. it looks:we know if
t(0)=(n-k)
equals to1
thenn-k=0
son=k
we replacek
withn
:if we omit
n-n
then:so
3+2+1+(n-1)+n
is natural number. it calculates asΣ3+2+1+(n-1)+n = n(n+1)/2 => n²+n/2
the result for fib is :
O(1 + n²) = O(n²)
This the best way to understand recursive relation
You might be interested in Nisan and Schocken's implementation of functions. The linked pdf is part of a free online course. It describes the second part of a virtual machine implementation in which the student should write a virtual-machine-language-to-machine-language compiler. The function implementation they propose is capable of recursion because it is stack-based.
To introduce you to the function implementation: Consider the following virtual machine code:
If Swift compiled to this virtual machine language, then the following block of Swift code:
would compile down to
The virtual machine language is designed around a global stack.
push constant n
pushes an integer onto this global stack.After executing lines 1 and 2, the stack looks like:
256
and257
are memory addresses.call mult
pushes the return line number (3) onto the stack and allocates space for the function's local variables....and it goes-to the label
function mult
. The code insidemult
is executed. As a result of executing that code we compute the product of 2 and 3, which is stored in the function's 0th local variable.Just before
return
ing from mult, you will notice the line:We will push the product onto the stack.
When we return, the following happens:
After returning we are ready to execute line 4, and our stack looks like this:
Now we push 4 onto the stack.
sub
is a primitive function of the virtual machine language. It takes two arguments and returns its result in the usual address: that of the 0th argument.Now we have
Now that you know how a function call works, it is relatively simple to understand how recursion works. No magic, just a stack.
I have implemented your
sumInts
function in this virtual machine language:Now I will call it:
The code executes and we get all the way to the stopping point where
lte
returnsfalse
. This is what the stack looks like at this point:Now let's "unwind" our recursion.
return
0 and goto line 15 and advance.Line 16:
add
Line 17:
return
5 and goto line 15 and advance.Line 16:
add
Line 17:
return
9 and goto line 15 and advance.Line 16:
add
Line 17:
return
12 and goto line 15 and advance.Line 16:
add
Line 17:
return
14 and goto line 21 and advance.There you have it. Recursion: Glorified
goto
.Recursion. In Computer Science recursion is covered in depth under the topic of Finite Automata.
In its simplest form it is a self reference. For example, saying that "my car is a car" is a recursive statement. The problem is that the statement is an infinite recursion in that it will never end. The definition in the statement of a "car" is that it is a "car" so it may be substituted. However, there is no end because in the case of substitution, it still becomes "my car is a car".
This could be different if the statement were "my car is a bentley. my car is blue." In which case the substitution in the second situation for car could be "bentley" resulting in "my bentley is blue". These types of substitutions are mathematically explained in Computer Science through Context-Free Grammars.
The actual substitution is a production rule. Given that the statement is represented by S and that car is a variable which can be a "bentley" this statement can be recursively reconstructed.
This can be constructed in multiple ways, as each
|
means there is a choice.S
can be replaced by any one of those choices, and S always starts empty. Theε
means to terminate the production. Just asS
can be replaced, so can other variables (there is only one and it isC
which would represent "bentley").So starting with
S
being empty, and replacing it with the first choice"my"S
S
becomesS
can still be substituted as it represents a variable. We could choose "my" again, or ε to end it, but lets continue making our original statement. We choose the space which meansS
is replaced with" "S
Next lets choose C
And C only has one choice for replacement
And the space again for S
And so on
"my bentley is"S
,"my bentley is "S
,"my bentley is blue"S
,"my bentley is blue"
(replacing S for ε ends the production) and we have recursively built our statement "my bentley is blue".Think of recursion as these productions and replacements. Each step in the process replaces its predecessor in order to produce the end result. In the exact example of the recursive sum from 2 to 5, you end up with the production
This becomes
One really good tip I came across in learning and really understanding recursion is to spend some time learning a language that doesn't have any form of loop construct other than via recursion. That way you'll get a great feel for how to USE recursion via practice.
I followed http://www.htdp.org/ which, as well as being a Scheme tutorial, is also a great introduction on how to design programs in terms of the architecture and design.
But basically, you need to invest some time. Without a 'firm' grasp of recursion certain algorithms, such as backtracking, will always seem 'hard' or even 'magic' to you. So, persevere. :-D
I hope this helps and Good Luck!