I am playing around with memory addresses in C and wondering about unaligned memory access.
I am currently on an x86 Intel machine running Linux, but I ask in the spirit of architecture and OS agnosticism, though the following is rather Linux- and hardware-specific:
When I read/write a simple type from/to an unaligned address I get no fault. No messages in logs or anything. I have also tried:
perf top -e alignment-faults
# And with PID
perf top -p NNN -e alignment-faults
but no hits.
Turning on alignment checking by:
__asm__("pushf\norl $0x40000,(%esp)\npopf");
gives the "wanted" result:
Bus error (core dumped)
(but still no messages in perf.)
My question is how this is handled by the hardware + OS and what is optimal. My thoughts and questions are all over the place, but I'll try to phrase some concrete points:
- Does the CPU have alignment checking on by default, with the kernel detecting that it can be turned off and instructing the CPU not to check?
- The kernel can oops when a driver tries to access unaligned memory (at least I have seen this on other hardware). Does the kernel itself run in alignment-check mode, or do only certain parts of the code?
- Since unaligned memory access requires more resources, is it a good idea to enable alignment checking during a software test phase, for example with the assembly line above? Would this also make the code more portable?
I have a lot more questions about this, but I'll leave it at that for now.
I'll just try to give a partial answer.
It depends on the architecture, and even on the same architecture some instructions may handle unaligned memory in hardware while others cannot.
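Because hardware support varies like this, code that has to read from a possibly unaligned address usually avoids dereferencing a cast pointer. A minimal sketch, assuming C99 or later; the helper names `load_u32_unaligned` and `store_u32_unaligned` are made up for illustration and are not from any library:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address that may be
 * unaligned. memcpy has no alignment requirement on its arguments, so the
 * compiler may emit a single load where the hardware tolerates it and
 * smaller accesses where it does not. */
static inline uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: the matching store. */
static inline void store_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

For example, `store_u32_unaligned(buf + 1, x)` followed by `load_u32_unaligned(buf + 1)` gives back `x` for any byte buffer `buf` that is large enough, regardless of how `buf + 1` is aligned. The value is kept in native byte order, so this sketch does not deal with endianness.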
An unaligned memory access that is not supported by the hardware causes a trap, and the kernel has a handler for that trap/exception. I have worked on ppc, where the exception is handled based on the faulting instruction (fetched via the PC): for some instructions the kernel takes care of the access and the program resumes; for others the kernel cannot handle it and the program may be terminated. One such example is the stwcx instruction, which is used to implement compare-and-swap logic.
In practice, enabling alignment checking might not be a good idea if you have a lot of legacy code in a large project, but it should be good for new code.
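For new code, a lighter-weight alternative to flipping the CPU flag is to assert alignment at the points where pointers are cast. This is only a sketch, assuming C11 (`<stdalign.h>`); `is_aligned` and `ASSERT_ALIGNED` are hypothetical names, and the pointer-to-integer conversion is implementation-defined, although it behaves as expected on mainstream platforms:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p is a multiple of `align` bytes.
 * `align` is assumed to be a power of two. */
static inline int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical macro: aborts in test builds when p is not suitably
 * aligned for `type`; compiled out when NDEBUG is defined. */
#define ASSERT_ALIGNED(p, type) assert(is_aligned((p), alignof(type)))
```

A call such as `ASSERT_ALIGNED(ptr, uint32_t);` placed just before `ptr` is cast to `uint32_t *` catches a misaligned pointer during testing on any architecture, including those where the hardware would silently handle the access.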