First of all, sorry for my English.
I know architectures are very complex and there is a broad spectrum of situations, but a common generalization is that if a computer architecture has 32-bit words, then registers, memory accesses and buses all work with 32-bit words (though I think there are a lot of variants in current architectures).
OK, let's suppose this is the rule and our architecture is little-endian, like x86. In that case, if we want to read a short int (2 bytes long), the memory reads the 4-byte word that contains our short. Let's suppose the containing word W is 0xf1342ea0, in memory:
{a0, 2e, 34, f1} // a0 is the byte in the lowest address.
and our half-word H is in the upper half of W, so H is 0xf134. As I understand it, the processor receives from memory a word with the short shifted down:
{34, f1, 00, 00}
since 0x0000f134 equals 0xf134.
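To make the shifting concrete, here is a minimal sketch in C of what I imagine happens once the containing word is in a register. The helper name `extract_half` is hypothetical and only models the idea; it is not how any particular processor is documented to work.

```c
#include <stdint.h>

/* Hypothetical helper: given a 32-bit word already loaded from memory
   (little-endian), return the 16-bit value that starts at byte `offset`
   inside that word. Valid offsets are 0, 1 and 2. */
static uint16_t extract_half(uint32_t word, unsigned offset) {
    /* Shift the wanted bytes down to the bottom, then truncate to 16 bits. */
    return (uint16_t)(word >> (8 * offset));
}
```

With W = 0xf1342ea0, `extract_half(W, 2)` gives 0xf134. The same shift would also work for an odd offset such as 1, which is what motivates the question.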
With this picture in mind, since the processor works with 4-byte words and a shift is therefore necessary anyway, why must 2-byte data be aligned on 2-byte boundaries?
In other words:
Why is it strongly discouraged to read the short 0xf134 from the word:
{ff, 34, f1, 0a}
?
EDIT: Another way of expressing the same question: why is the definition of alignment
An object of size N at address d is aligned if d is divisible by N.
and not:
An object of size N at address d is aligned with respect to an architecture
of B bytes if d is divisible by B, or ⌊d/B⌋ == ⌊(d+N-1)/B⌋ if N < B.
?
NOTE: Since the object's last byte is at address d+N-1, the property ⌊d/B⌋ == ⌊(d+N-1)/B⌋ implies the object lies entirely within one aligned word.
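The two candidate definitions can be compared side by side with a small sketch in C. The function names are my own (hypothetical), and the second predicate takes the object's last byte to be at address d + N - 1, so that it tests whether the first and last bytes fall in the same B-byte word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Usual definition: an object of size n at address d is aligned
   if d is divisible by n. */
static bool naturally_aligned(uintptr_t d, uintptr_t n) {
    return d % n == 0;
}

/* Alternative definition from the question: the object is "aligned" if its
   first byte (d) and its last byte (d + n - 1) fall inside the same
   b-byte word. */
static bool within_one_word(uintptr_t d, uintptr_t n, uintptr_t b) {
    return d / b == (d + n - 1) / b;
}
```

For a 2-byte short on a 4-byte architecture, address 1 is the interesting case: `naturally_aligned(1, 2)` is false while `within_one_word(1, 2, 4)` is true. At address 3 both predicates are false.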
If the memory is { ff, 34, f1, 0a }, the short lies entirely within one 4-byte word, so it is not a problem for an x86 processor. However, if the memory is { ff, ff, ff, 34 } { f1, aa, aa, aa }, the short straddles two words and the processor must perform two bus cycles to retrieve its value. (Also note that some RISC-based processors do not support misaligned accesses at all.)
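The difference between the two cases can be modelled with a rough sketch in C. This is only a toy model of a 32-bit bus that fetches aligned little-endian words; `fetch_word` and `read_short` are hypothetical names, and real hardware is certainly more involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy bus model (assumption): one call fetches the aligned 4-byte
   little-endian word with the given word index. */
static uint32_t fetch_word(const uint8_t *mem, size_t index) {
    const uint8_t *p = mem + 4 * index;
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Read a 16-bit value at byte address `addr`, storing in *cycles the
   number of word fetches that were needed. */
static uint16_t read_short(const uint8_t *mem, size_t addr, int *cycles) {
    size_t first = addr / 4;       /* word holding the first byte */
    size_t last = (addr + 1) / 4;  /* word holding the second byte */
    uint32_t lo = fetch_word(mem, first);

    *cycles = 1;
    if (first == last) {
        /* Both bytes are in one word: a single fetch plus a shift. */
        return (uint16_t)(lo >> (8 * (addr % 4)));
    }

    /* The short straddles two words (addr % 4 == 3): fetch the second
       word as well and stitch the two bytes together. */
    uint32_t hi = fetch_word(mem, last);
    *cycles = 2;
    return (uint16_t)((lo >> 24) | ((hi & 0xffu) << 8));
}
```

Reading at address 1 in { ff, 34, f1, 0a } yields 0xf134 with one fetch, while reading at address 3 in { ff, ff, ff, 34 } { f1, aa, aa, aa } yields the same value but needs two fetches.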