For my homework I need to write a very small virtual 16-bit assembler interpreter in C#. It simulates the RAM with a byte array (64K) and the registers with variables (A, B, C, ...). Now I need a way to store local variables; Google says that they are allocated on the stack.
But what is unclear to me is this: when they are allocated on the stack (with push...), how does the interpreter access them when they are used later?
See the following 2 lines:
pi INT 3
mov A, pi
In the first line, pi is allocated on the stack; in the second line, pi is used. But how should the interpreter know where pi is on the stack to access its data? (My stack is a byte array too, with two helper functions (push, pop); there is also a pointer to the top of the stack.)
Typically, stack data is accessed relative to the stack pointer, which is a CPU register that points to the last element stored on the stack. You can think of it as an index into the memory of the emulated CPU. Every time you push something onto the stack, the stack pointer is decremented by the size of that something, and the value is stored in the emulated memory at the address after the decrement. Whenever you pop something off the stack, the value is read from the address held in the stack pointer, and then the stack pointer is incremented by the size of that value. That's how stacks work on many different CPUs.
If you're implementing a CPU emulator or a CPU instruction emulator/interpreter, you don't care much about variables. What you care about are the CPU instructions that manipulate CPU registers and memory, because your program is expressed in terms of CPU instructions. The instructions have to keep track of all local variables stored on the stack, that is, their location relative to the current value of the stack pointer.
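The push and pop behaviour described above can be sketched roughly as follows. This is only an illustration in Python (the same logic carries over to a C# byte array); the `Machine` class, its method names, the little-endian byte order and the initial stack pointer value of 0 are all assumptions made for the example.

```python
class Machine:
    """Hypothetical emulated CPU: 64K of RAM plus a 16-bit stack pointer."""

    def __init__(self):
        self.ram = bytearray(0x10000)
        self.sp = 0  # assumed start value; the first 16-bit push wraps it to 0xFFFE

    def read16(self, addr):
        # Little-endian 16-bit read (byte order is an assumption, as on x86).
        addr &= 0xFFFF
        return self.ram[addr] | (self.ram[(addr + 1) & 0xFFFF] << 8)

    def write16(self, addr, value):
        addr &= 0xFFFF
        self.ram[addr] = value & 0xFF
        self.ram[(addr + 1) & 0xFFFF] = (value >> 8) & 0xFF

    def push16(self, value):
        # Decrement SP by the size of the value, then store at the new [SP].
        self.sp = (self.sp - 2) & 0xFFFF
        self.write16(self.sp, value)

    def pop16(self):
        # Read the value at [SP], then increment SP by its size.
        value = self.read16(self.sp)
        self.sp = (self.sp + 2) & 0xFFFF
        return value
```

Anything still on the stack can be read without popping it by addressing it relative to the stack pointer, e.g. `m.read16(m.sp + 2)` for the item pushed just before the most recent one.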
For example, if you consider a simple subroutine that adds two 16-bit integer values passed to it on the stack, it could look something like this in e.g. 16-bit x86 assembly:
And the caller may look like this:
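As a rough sketch of how an emulator might step through such a subroutine and its caller, here is a Python version with the assumed 16-bit x86 instructions shown as comments. The calling convention (arguments pushed right to left, caller cleans up the stack), the register names `ax` and `bp`, and the return address `0x0100` are assumptions chosen for illustration, not a definitive listing.

```python
class Machine:
    """Hypothetical emulated CPU state."""

    def __init__(self):
        self.ram = bytearray(0x10000)
        self.sp = self.bp = self.ax = 0

    def read16(self, addr):
        addr &= 0xFFFF
        return self.ram[addr] | (self.ram[(addr + 1) & 0xFFFF] << 8)

    def push16(self, value):
        self.sp = (self.sp - 2) & 0xFFFF
        self.ram[self.sp] = value & 0xFF
        self.ram[(self.sp + 1) & 0xFFFF] = (value >> 8) & 0xFF

    def pop16(self):
        value = self.read16(self.sp)
        self.sp = (self.sp + 2) & 0xFFFF
        return value


def add_subroutine(m):
    m.push16(m.bp)                    # push bp
    m.bp = m.sp                       # mov  bp, sp
    # Stack now: [bp+0] old bp, [bp+2] return address, [bp+4] a, [bp+6] b
    m.ax = m.read16(m.bp + 4)         # mov  ax, [bp+4]
    m.ax = (m.ax + m.read16(m.bp + 6)) & 0xFFFF  # add  ax, [bp+6]
    m.bp = m.pop16()                  # pop  bp
    return m.pop16()                  # ret  (pops the return address)


def caller(m, a, b):
    m.push16(b)                       # push b
    m.push16(a)                       # push a
    m.push16(0x0100)                  # call add: pushes a (made-up) return address
    add_subroutine(m)
    m.sp = (m.sp + 4) & 0xFFFF        # add  sp, 4: discard the two arguments
    return m.ax                       # result is left in ax
```

Note that the subroutine never refers to its parameters by name; it only knows their offsets from `bp`, which is exactly the bookkeeping the instructions have to do.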
The answer is: it depends. You, as the language designer, should define the visibility rules (if a variable name is defined, in which part of the source code is the name available?) and the hiding rules (if another object with the same name is defined within the visibility area of an object, which name wins?) for variables. Different languages have different rules; just compare JavaScript and C++.
So, I would do it this way. (1) Introduce a notion of namespace: the list of names visible at a certain point of the source file. (Note that this is not the same as C++'s namespace notion.) The namespace should be able to resolve a name to some appropriate object. (2) Implement rules for changing namespaces when your interpreter moves from one procedure to another, from one file to another, from one block to another, sees a declaration or the end of a block, etc.
These steps are basically valid for most languages, not just assembler.
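A minimal sketch of such a namespace, assuming the simplest possible hiding rule (the innermost declaration wins) and a chain of parent scopes. The `Namespace` class and its method names are hypothetical; your own visibility rules may well differ.

```python
class Namespace:
    """Hypothetical scope: maps names to objects and falls back to a parent."""

    def __init__(self, parent=None):
        self.names = {}
        self.parent = parent

    def declare(self, name, obj):
        # A declaration in this scope hides any outer declaration of the same name.
        self.names[name] = obj

    def resolve(self, name):
        # Walk outwards from the innermost scope until the name is found.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise KeyError(f"undefined name: {name}")
```

Entering a procedure or block would then create `Namespace(parent=current)`, and leaving it would simply switch back to the parent, which discards the local names.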
(I think Google's reference to "allocation on stack" refers to the idea of processing each subroutine in a separate subroutine, and redefining a namespace there locally, therefore "on stack", so it will be automatically popped when the procedure finishes.)
'google says that they are allocated on the Stack'
This is how it is implemented in real computers, but that is not the whole story.
If you want to write a virtual interpreter, you need to use a data structure called a hash table.
Well this is a Homework question. So no direct answer :P But the following code will explain how to use the Hash Table. Store the variable names and values in Hash Tables.
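One way the hash-table idea might look, sketched in Python with a `dict` (in C# a `Dictionary<string, ushort>` would play the same role). The `run` function and the tiny two-instruction syntax it accepts are assumptions based on the two lines in the question, not a complete interpreter.

```python
def run(lines):
    """Interpret 'name INT value' declarations and 'mov dest, src' moves."""
    variables = {}                          # hash table: variable name -> value
    registers = {"A": 0, "B": 0, "C": 0}    # register set assumed from the question

    for line in lines:
        tokens = line.replace(",", " ").split()
        if not tokens:
            continue                        # skip blank lines
        if len(tokens) == 3 and tokens[1].upper() == "INT":
            # Declaration, e.g. "pi INT 3": store the value under its name.
            variables[tokens[0]] = int(tokens[2]) & 0xFFFF
        elif len(tokens) == 3 and tokens[0].lower() == "mov":
            dest, src = tokens[1], tokens[2]
            # Look the source up as a variable, then a register, then a literal.
            if src in variables:
                value = variables[src]
            elif src in registers:
                value = registers[src]
            else:
                value = int(src) & 0xFFFF
            if dest in registers:
                registers[dest] = value
            elif dest in variables:
                variables[dest] = value
            else:
                raise KeyError(f"unknown destination: {dest}")
        else:
            raise ValueError(f"cannot parse: {line!r}")
    return registers, variables
```

With this approach the interpreter never needs to know where `pi` lives in memory; the lookup by name replaces the address calculation.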
Typically there is no separate stack memory, instead the stack is in the regular RAM, so you only have the stack pointer that keeps track of it.
Also typically, local variables are allocated at the beginning of a subroutine by copying the stack pointer to another register, then moving the stack pointer to make room for the variables:
Accessing local variables is done using the copy of the stack pointer:
When you leave the subroutine, you restore the stack pointer to deallocate the local variables:
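The three steps (allocate, access, deallocate) might be emulated along these lines. This is a Python sketch with the assumed x86-style instructions in comments; using `bp` as the copy of the stack pointer and saving the old `bp` on entry are conventions assumed here, and `enter`, `local_addr` and `leave` are hypothetical helper names.

```python
class Machine:
    """Hypothetical emulated CPU state."""

    def __init__(self):
        self.ram = bytearray(0x10000)
        self.sp = self.bp = 0

    def read16(self, addr):
        addr &= 0xFFFF
        return self.ram[addr] | (self.ram[(addr + 1) & 0xFFFF] << 8)

    def write16(self, addr, value):
        addr &= 0xFFFF
        self.ram[addr] = value & 0xFF
        self.ram[(addr + 1) & 0xFFFF] = (value >> 8) & 0xFF

    def push16(self, value):
        self.sp = (self.sp - 2) & 0xFFFF
        self.write16(self.sp, value)

    def pop16(self):
        value = self.read16(self.sp)
        self.sp = (self.sp + 2) & 0xFFFF
        return value


def enter(m, local_bytes):
    m.push16(m.bp)                          # push bp  (save the caller's frame)
    m.bp = m.sp                             # mov  bp, sp
    m.sp = (m.sp - local_bytes) & 0xFFFF    # sub  sp, local_bytes


def local_addr(m, index):
    # First 16-bit local lives at [bp-2], the second at [bp-4], and so on.
    return (m.bp - 2 * (index + 1)) & 0xFFFF


def leave(m):
    m.sp = m.bp                             # mov  sp, bp  (frees the locals)
    m.bp = m.pop16()                        # pop  bp
```

A local such as `pi` would then be compiled down to a fixed index, e.g. `m.write16(local_addr(m, 0), 3)` to initialise it and `m.read16(local_addr(m, 0))` to load it into a register.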
The syntax that you use in the question is usually used to declare global variables, not local variables:
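For a global declared that way, an interpreter could reserve a fixed address in RAM when it sees the declaration and remember it in a symbol table. The sketch below is one possible way to do that in Python; the `Globals` class, the data area starting at `0x1000` and the 16-bit little-endian storage are assumptions for illustration only.

```python
DATA_BASE = 0x1000  # hypothetical start of the global data area


class Globals:
    """Hypothetical symbol table that places each global at a fixed RAM address."""

    def __init__(self, ram, base=DATA_BASE):
        self.ram = ram
        self.next_addr = base
        self.symbols = {}                   # name -> address in RAM

    def declare_int(self, name, value):
        # "pi INT 3": reserve two bytes, store the initial value, record the address.
        addr = self.next_addr
        self.symbols[name] = addr
        self.ram[addr] = value & 0xFF
        self.ram[addr + 1] = (value >> 8) & 0xFF
        self.next_addr += 2
        return addr

    def load(self, name):
        # "mov A, pi": look up the address, then read 16 bits from RAM.
        addr = self.symbols[name]
        return self.ram[addr] | (self.ram[addr + 1] << 8)
```

Because the address never changes for the lifetime of the program, no stack pointer arithmetic is needed for globals.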