We had a task in which we were given an assembler instruction for a two-address
machine:
mov 202, 100[r1+]
Write down a minimal assembler instruction sequence that replaces the
instruction above.
The addressing modes we are allowed to use are:
rx
- register direct addressing
[rx]
- register indirect addressing
#n
- immediate addressing
And we are only allowed to use add, sub, mov.
I especially don't understand what # stands for and why we need subtraction. Well, actually, I don't really understand any of it...
The task has already been solved, so I'm not looking for a solution. I need an explanation of how it's done; I tried to work it out myself but couldn't.
I hope someone can help me, that would be so nice of you!
Solution:
add #100, r1
mov #202, r2
mov [r2], [r1]
sub #99, r1
Understand that assembly language is not a universal, standardized thing. It is defined by the program that reads it, which is the assembler; whatever that program accepts is the language for that assembler. What matters to the processor is the machine code, so you can have as many different assemblers and assembly languages as you have users, so long as they all produce the right machine code. If you are expecting one universal set of rules that applies everywhere, that is not the case.
You have provided enough information, though, for experienced folks to see the habits of other assembly languages reflected here.
mov 202, 100[r1+]
So this appears to move what is at address 202 (a plain number with no #, so a memory address rather than a constant) to the location at address r1+100 (register indexed), and then post-increment r1.
The single line is already the simplest form as far as lines of code go (not necessarily as far as complexity or number of clocks), so replacing it takes more instructions:
- So you need to read the contents of address 202,
- you need to add 100 to r1 temporarily,
- you need to write the contents of address 202 to the location r1+100,
- and then you need to take r1 and increment it (not the r1+100 incremented but r1 without the index incremented).
The solution given does pretty much that:
add #100, r1
mov #202, r2
mov [r2], [r1]
sub #99, r1
It adds 100 to r1, which we only need temporarily (and will have to undo later, because r1 is now wrong in the long run).
Then, because the addressing modes are limited, you need a register to hold the address 202, so the value 202 is loaded into r2, just like the value 100 was added to r1. The #number means "just use this number".
Now you are allowed to use [rn], so the move reads what is at the address in r2 (address 202) and writes it to the address in r1 (the original r1 plus 100). Lastly, because we want r1 to end up being the original plus 1 but we made it the original plus 100, we need to subtract 99: r1+100-99 = r1+1.
In C, it would be something like this:
unsigned *r1;
unsigned *r2;
// r1 is set somewhere before this point to a value that is not defined in your question or problem.
r1 += 100; // add #100, r1 (C scales pointer arithmetic by sizeof(unsigned); the machine adds the raw number)
r2 = (unsigned *)202; // mov #202, r2
*r1 = *r2; // mov [r2], [r1]
r1 -= 99; // sub #99, r1
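The fragment above is only an outline. A self-contained sketch of the same four steps might look like the following. It assumes a toy machine with word-addressed memory and 16-bit registers; the names Machine, MEM_WORDS, run_sequence and sequence_demo are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 1024            /* hypothetical size of the toy memory */

typedef struct {
    uint16_t mem[MEM_WORDS];      /* word-addressed memory (assumption) */
    uint16_t r1, r2;              /* the two registers the sequence uses */
} Machine;

/* The four-instruction replacement sequence, one statement per instruction. */
void run_sequence(Machine *m) {
    m->r1 += 100;                   /* add #100, r1   : r1 = r1 + 100        */
    m->r2 = 202;                    /* mov #202, r2   : r2 holds address 202 */
    m->mem[m->r1] = m->mem[m->r2];  /* mov [r2], [r1] : memory-to-memory copy */
    m->r1 -= 99;                    /* sub #99, r1    : net effect is r1 + 1  */
}

/* Runs the sequence on sample values and checks the effects described above. */
int sequence_demo(void) {
    Machine m = {0};
    m.r1 = 300;                     /* arbitrary starting value for r1 */
    m.mem[202] = 0xBEEF;            /* arbitrary data stored at address 202 */
    run_sequence(&m);
    assert(m.mem[400] == 0xBEEF);   /* data landed at original r1 + 100 */
    assert(m.r1 == 301);            /* r1 ended up at original r1 + 1 */
    assert(m.r2 == 202);            /* r2 was overwritten along the way */
    return 1;
}
```

Calling sequence_demo() from a main function runs the checks; the starting value 300 and the data word 0xBEEF are arbitrary.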
Having the destination on the right and the source on the left is not intuitive, since we mostly write code with the result on the left and the operands on the right.
We don't write 202 = r2;
we instead write r2 = 202;
so mov r2, #202
would be more intuitive. But again, assembly language is defined by the folks who wrote the assembler, and some folks like it left to right and others right to left.
In mov 202, 100[r1+]:
- The instruction is a mov.
- The source operand is the memory at address 202. In some assembly languages, you could also write [202] for clarity. There's no # on the number, so it's an absolute addressing mode, not an immediate.
- The destination operand is the memory at address r1 + 100.
- The side-effect of the addressing mode is r1++.
Like many flavours of assembly language, this one apparently uses # prefixes to indicate immediate constants. This is like GNU/AT&T x86 syntax, where add $4, %eax adds 4, but add 4, %eax loads a value from the absolute address 4.
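To make the immediate-versus-absolute distinction concrete, here is a rough C sketch. It models memory as a plain array of 16-bit words, which is an assumption, and the helper names operand_immediate, operand_absolute and operand_demo are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* "#n": the operand is the constant n itself (immediate). */
uint16_t operand_immediate(uint16_t n) {
    return n;
}

/* "n" without "#": the operand is the memory word stored at address n (absolute). */
uint16_t operand_absolute(const uint16_t *mem, uint16_t n) {
    return mem[n];
}

/* Shows that the same number 202 yields different operand values. */
int operand_demo(void) {
    uint16_t mem[256] = {0};
    mem[202] = 7;                              /* arbitrary data at address 202 */
    assert(operand_immediate(202) == 202);     /* #202 is just the number 202 */
    assert(operand_absolute(mem, 202) == 7);   /* 202 is whatever is stored there */
    return 1;
}
```

So mov #202, r2 puts the number 202 into r2, while a bare 202 as a source would fetch the word stored at that address.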
Your professor's given equivalent sequence adds 100 to r1, then subtracts 99, since it wants to avoid displacements and post-increments in addressing modes. It also clobbers r2, which the single-instruction version doesn't, so it's not a drop-in replacement.
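The r2 difference can be checked with a small simulation. This is only a sketch under assumptions: memory is word-addressed, the post-increment adds 1 (consistent with the sub #99 in the given sequence), and the names Cpu, exec_single, exec_sequence and r2_is_only_difference are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t mem[1024];   /* toy word-addressed memory (assumption) */
    uint16_t r1, r2;
} Cpu;

/* mov 202, 100[r1+] as read in this answer. */
void exec_single(Cpu *c) {
    c->mem[c->r1 + 100] = c->mem[202];  /* absolute source, indexed destination */
    c->r1 += 1;                         /* post-increment by 1 (assumption) */
}

/* add #100, r1 ; mov #202, r2 ; mov [r2], [r1] ; sub #99, r1 */
void exec_sequence(Cpu *c) {
    c->r1 += 100;
    c->r2 = 202;
    c->mem[c->r1] = c->mem[c->r2];
    c->r1 -= 99;
}

/* Runs both versions from the same state and compares the results. */
int r2_is_only_difference(void) {
    Cpu a = {0};
    a.r1 = 300;                         /* arbitrary starting values */
    a.r2 = 7;
    a.mem[202] = 0x1234;
    Cpu b = a;                          /* identical starting state */
    exec_single(&a);
    exec_sequence(&b);
    assert(memcmp(a.mem, b.mem, sizeof a.mem) == 0);  /* same memory contents */
    assert(a.r1 == b.r1);                             /* same final r1 */
    assert(a.r2 == 7 && b.r2 == 202);                 /* only the sequence changed r2 */
    return 1;
}
```

If r2 held a live value before the sequence, it would have to be saved and restored, or a different scratch register chosen.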
Your solution does not map to the input at all. In your input, R1 is in an unknown state. In your "solution" R1 ends up in a known state (1).
In addition, you'd have to know what your native instruction format is.
mov 202, 100[r1+]
looks like it would
1) Add 100 to the contents of R1 to create an address.
2) Move 202 into that address.
3) Add 2 to R1 (assuming a 2-byte machine word).