Register addressing mode vs Direct addressing mode

2019-04-09 22:17发布

问题:

I encountered this question in a test paper. It stated, Which of the given addressing modes is faster? Why?

  1. Register addressing mode
  2. Direct addressing mode

Now according to me register addressing mode should be faster as register is the fastest memory location in the computer. Is this the correct answer to it?

Please help out. Thanks

回答1:

Register accesses are the fastest. However, memory accesses can be as fast if the memory data that you're accessing is already in the CPU's data cache.



回答2:

The difference between the two addressing modes is...the source of the address...For direct addressing mode the address of the item to be accessed is an immediate encoded in the instruction, so the instruction is larger, in some cases much larger so it requires more clock cycles to access, ideally it is in the cache as it is the bytes immediately following the opcode and the fetching of the opcode normally causes at least a cache line behind it to be fetched, with anything but the oldest x86 platforms I dont see how you would get to where you are executing the instruction without the rest of the instruction and the next few/many instructions already fetched and in the pipe. Even old x86 processors had a prefetch queue of some size.

Register addressing means the address for the item being accessed is in a register. Assuming the address was already there, then this is faster because you dont incur the larger instruction, extra cycles, more of the cache line burned for instruction. Where you have to be careful with this argument is say for example the instruction just before is loading the immediate address into the register.

mov ax,[1000h]


mov ax,[bx]

The second one is faster than the first (for things that can be compared at this level), because of the instruction size and additional cache burned and cycles take.

BUT

mov ax,[1000h]


mov bx,1000h    
mov ax,[bx]

the direct addressing is faster because overall it takes fewer cycles to fetch and execute (for things that can be compared).

What do I mean by for things that can be compared? The addressing mode has to do with where the address comes FROM. once you start to EXECUTE that instruction and perform a memory cycle then the two are equal, it is an address on a bus, to be comparing the two instructions the data size is the same. it may very well be the case that direct adddressing is faster for some test program simply because for that test program the data is always in the data cache, where for that test program the register addressing version is not or sometimes is not. So the things that can be compared between the two instructions are the size of the instruction, which leads to the cycles and cache line it burns. One cache line can hold many register based instructions but only a few direct/immediate based instructions, so by using direct/immediate you have an opportunity cost and incur more memory cycles overall when executing the program. YES, many of these cycles are in parallel on anything remotely modern.

So these types of questions have to do with whether or not you understand the instruction set, and depending on how much detail you return, whether or not you understand beyond that what the actual costs are. Likewise without experience, simply trying an experiment will likely fail or show no difference as you have to craft the experiment around the caches.

I highly recommend the book The Zen of Assembly Language by Michael Abrash

http://www.amazon.com/Zen-Assembly-Language-Knowledge-Programming/dp/0673386023/ref=sr_1_1?ie=UTF8&qid=1335971069&sr=8-1

NOT the free one that comes with the big black graphics programming book that one is incomplete. You can get a used copy in good shape (bought a second one and it was better than my original that I bought at the store and has lived on a bookshelf). The details about 8088 and 8086 were outdated when the book went to print, that is not the importance of the book, the importance is to understand how to attack the problem, how to think about the problem and get an elementary insight as to what is going on behind the scenes. It is significantly more complicated today, still understandable, but I recommend starting with a foundation like this before jumping into what you see today. Esp with x86 (I highly recommend learning something, anything, other than x86 first when you start looking at busses and caching, etc). http://github.com/dwelch67/amber_samples. I have cleaned up and made the amber processor (arm2 clone) available using open source tools so that you can see things running inside the processor. One version of the amber has a cache. Again a stepping stone, adding mmus and multi cores, etc just adds to the complexity.

Super short answer, the direct addressing encodes using a longer instruction, more cycles than register addressing when only the two instructions are compared to each other. Memory side effects, caching, etc can confuse or neutralize the differences.