Understanding a recursive IA32 assembly call

I'm practicing to get more comfortable with IA32 assembly, and I'm struggling to translate this recursive snippet of assembly code into understandable C code. We were given a hint that every function in the code takes only one argument, but my understanding of the IA32 stack is still a bit poor.

.globl bar
.type bar, @function
bar:
pushl %ebp
movl %esp, %ebp
movl 8(%ebp), %eax
addl $10, %eax
popl %ebp
ret

.globl foo
.type foo, @function
foo:
pushl %ebp
movl %esp, %ebp
subl $24, %esp
movl %ebx, -8(%ebp)
movl %esi, -4(%ebp)
movl 8(%ebp), %ebx
movl $1, %eax
cmpl $1, %ebx
jle .L5
movl %ebx, (%esp)
call bar
movl %eax, %esi
subl $1, %ebx
movl %ebx, (%esp)
call foo
imull %esi, %eax

.L5:
movl -8(%ebp), %ebx
movl -4(%ebp), %esi
movl %ebp, %esp
popl %ebp
ret

The bar function seems easy enough: it adds 10 to the argument and returns. The "foo" function is where I'm getting lost. I understand that subl $24, %esp reserves 24 bytes of space, but from there I start to lose myself. We seem to recursively call "foo" and decrement the argument (let's call it x) each time until it reaches 1, and then multiply the result of bar(x) by the result of bar(x-1), but I don't understand everything else that's going on.

Specifically, what do these two instructions do? They come right after subl $24, %esp and I think that's the key to understanding the whole thing.

movl %ebx, -8(%ebp)
movl %esi, -4(%ebp)

In x86 the stack grows downwards (towards smaller addresses). In IA32 the registers are 32 bits (4 bytes) wide, and so is the Instruction Pointer (EIP). The stack pointer register (ESP) points to the "top of stack", that is, the last item pushed on the stack. Because the stack grows towards lower addresses, this is the lowest address with valid data on the stack. The push instruction decrements ESP by 4 and then stores 32 bits (4 bytes) at that address. The call instruction has an implied push of 4 bytes (the return address, which is the address of the instruction immediately following the call).

So what is at the top of the stack on entry to a function? That would be the 4 bytes pointed to by ESP, which contain the return address. What is at offset +4 from ESP? That would be the second-to-last thing placed on the stack, which for a function with one parameter is that parameter:

|                |
+----------------+
|  parameter 1   |     ESP + 4
+----------------+
| return address |     <===== ESP (top of stack)
+----------------+
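The layout above can be checked with a small C model of a downward-growing stack. This is only an illustrative sketch: ToyStack, toy_init, toy_push, toy_peek and toy_call are hypothetical names invented here, the "stack" is just a byte array, and ESP is modelled as an offset into it rather than a real address. Note also that the caller in the question's code places the argument with movl %ebx, (%esp) rather than pushl; the layout seen by the callee on entry is the same.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model: a byte array stands in for stack memory. */
enum { TOY_STACK_BYTES = 64 };

typedef struct {
    uint8_t  mem[TOY_STACK_BYTES];
    uint32_t esp;                 /* offset into mem, stands in for ESP */
} ToyStack;

/* An empty stack: ESP starts just past the highest address. */
void toy_init(ToyStack *s) {
    memset(s->mem, 0, sizeof s->mem);
    s->esp = TOY_STACK_BYTES;
}

/* Models pushl: decrement ESP by 4, then store 4 bytes at the new ESP. */
void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp >= 4);          /* refuse to run off the bottom of the array */
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, sizeof value);
}

/* Reads the 4 bytes at ESP + offset without changing ESP. */
uint32_t toy_peek(const ToyStack *s, uint32_t offset) {
    uint32_t value;
    assert(s->esp + offset + 4 <= TOY_STACK_BYTES);
    memcpy(&value, &s->mem[s->esp + offset], sizeof value);
    return value;
}

/* Models a one-argument call: the caller stores the argument, then the
 * call instruction implicitly pushes the return address. */
void toy_call(ToyStack *s, uint32_t argument, uint32_t return_address) {
    toy_push(s, argument);
    toy_push(s, return_address);
}
```

After toy_call, toy_peek(&s, 0) yields the return address and toy_peek(&s, 4) yields the argument, matching the diagram: the return address sits at ESP and parameter 1 at ESP + 4.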

In order to unwind the stack (as a debugger would want to do) the compiler builds an "EBP chain" - EBP points to a location on the stack where the caller's EBP is stored... that EBP in turn points to the saved EBP of the caller's caller, and so on up the stack. That is why at the beginning of a function you see:

pushl %ebp        # save caller's EBP on the stack
movl %esp, %ebp   # EBP now points to caller's EBP on the stack

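Some functions don't have any variables in automatic storage, and in that case the function prologue is done at this point (bar is an example). However, say you have 6 variables of 4 bytes each: then you need 6x4 = 24 bytes of automatic storage on the stack, which is obtained by subtracting 24 from ESP. You then have room for 6 local variables (local-var-0...local-var-5). The compiler is free to use these slots for other purposes too: foo uses two of them to save registers, and the lowest one, at (%esp), to hold the argument for the functions it calls.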

subl $24, %esp

The stack now looks like this:

|                |
+----------------+
|  parameter 1   |     8(ebp) 
+----------------+
| return address |     4(ebp)
+----------------+
| caller's EBP   |     0(ebp)
+----------------+
| local-var-0    |     -4(ebp)
+----------------+
| local-var-1    |     -8(ebp)
+----------------+
| local-var-2    |     -12(ebp)
+----------------+
| local-var-3    |     -16(ebp)
+----------------+
| local-var-4    |     -20(ebp)
+----------------+
| local-var-5    |     -24(ebp)
+----------------+

As you can see, -8(%ebp) is the address of local-var-1 and -4(%ebp) is the address of local-var-0, so this code saves the two registers on the stack:

movl %ebx, -8(%ebp)    # save %ebx in local-var-1
movl %esi, -4(%ebp)    # save %esi in local-var-0

and they are restored prior to returning:

.L5:
movl -8(%ebp), %ebx  # restore %ebx from local-var-1
movl -4(%ebp), %esi  # restore %esi from local-var-0

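The general-purpose registers %ebx and %esi (along with %edi and %ebp) are callee-saved under the IA32 calling conventions: a function that modifies them must restore their original values before returning. foo uses %ebx to hold its argument and %esi to hold the result of bar across the recursive call, so it has to save and restore both. See Table 4, Chapter 6 Register Usage of Agner Fog's Calling Conventions document.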
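Putting the pieces together, the assembly appears to correspond to C along the following lines. This is a hand translation read off the listing, not the original source, so treat it as a sketch: the parameter name x and the int types are assumptions. Note that the product is bar(x) * foo(x - 1), not bar(x) * bar(x - 1): the second call in the listing is call foo.

```c
/* bar: loads the argument from 8(%ebp), adds 10, returns it in %eax. */
int bar(int x) {
    return x + 10;
}

/* foo: hand translation of the listing (assumed types and names).
 *   movl $1, %eax ; cmpl $1, %ebx ; jle .L5   ->  if (x <= 1) return 1;
 *   call bar ; movl %eax, %esi                ->  %esi = bar(x)
 *   subl $1, %ebx ; call foo                  ->  %eax = foo(x - 1)
 *   imull %esi, %eax                          ->  %eax = bar(x) * foo(x - 1)
 * Caveat: imull wraps on overflow, whereas signed overflow in C is
 * undefined behaviour, so the two only agree while the product fits
 * in an int.
 */
int foo(int x) {
    if (x <= 1) {
        return 1;
    }
    return bar(x) * foo(x - 1);
}
```

For example, foo(3) evaluates as bar(3) * foo(2) = 13 * (bar(2) * foo(1)) = 13 * 12 * 1 = 156.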