Broadly speaking, I am wondering how the kernel (or the CPU) knows that a process has tried to access a memory location for which it lacks permission, and how the mysterious piece of hardware called the MMU helps in doing that.
In particular: it seems to me that the MMU is agnostic towards the memory management model of the OS kernel (paging, memory zones, process address spaces...). I would presume that Linux and Windows pages are not exactly the same, for example (correct me if I'm wrong). But then, how does my CPU find out whether the current code may access location x? And how does it signal this to the kernel?
This is probably too big a topic to completely answer satisfactorily here; you'll do better to search for some papers/articles/books that discuss the hardware behind virtual memory implementations (probably starting with a specific architecture, since there are significant differences between e.g. x86, x86_64, sparc, etc...).
The short answer, though, is that the hardware handles this through the page tables. Every memory access that the MMU is asked to handle is verified against the page table structures. If the page table entry describing the page that contains the requested address is not marked to allow the requested type of access (read/write/execute/...), the hardware generates a trap into the kernel (a page fault on x86). The OS kernel then has to figure out the reason for the fault and whether anything can be done about it. Many traps are handled by the kernel transparently, for example by swapping in a page from disk or mapping a new empty page. For others, such as a null-pointer dereference, the best the kernel can do is throw the fault back at the application to say "you did something bad"; on Linux this is delivered as the SIGSEGV signal and reported as a "segmentation fault". Other OSes name it differently (e.g. general protection fault, ...).
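To make the permission check concrete, here is a minimal sketch in Python of the lookup described above. It is a toy model, not how any real MMU is implemented: the flat dictionary page table, the `Perm` flags, the `PageFault` exception and the `check_access` function are all hypothetical names chosen for illustration, and the 4 KiB page size is an assumption.

```python
from enum import IntFlag

PAGE_SIZE = 4096  # assumed page size (4 KiB)


class Perm(IntFlag):
    READ = 1
    WRITE = 2
    EXEC = 4


class PageFault(Exception):
    """Stands in for the hardware trap into the kernel."""

    def __init__(self, address, access):
        super().__init__(f"page fault at {address:#x} ({access.name})")
        self.address = address
        self.access = access


# Toy page table: virtual page number -> access types the entry allows.
page_table = {
    0x10: Perm.READ | Perm.EXEC,   # a code page
    0x20: Perm.READ | Perm.WRITE,  # a data page
}


def check_access(address, access):
    """Return True if the access is allowed, otherwise raise PageFault."""
    vpn = address // PAGE_SIZE
    allowed = page_table.get(vpn, Perm(0))  # no entry means nothing is allowed
    if (allowed & access) != access:
        raise PageFault(address, access)
    return True
```

With this table, executing from page `0x10` or writing to page `0x20` succeeds, while writing to the code page or touching address `0` (no entry at all) raises `PageFault`, which plays the role of the trap that the kernel then has to interpret.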
The MMU is configured (by design of its logic and/or option bits set by the kernel) to be the hardware part of the implementation of the paging model.
The MMU must ordinarily translate logical addresses to the mapped physical addresses; when it cannot do so because there is no corresponding physical address for the requested logical address, it generates a fault (often as a type of interrupt) which runs handler code in the kernel.
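As a rough illustration of that translation step, the sketch below splits an address into a page number and an offset and looks the page number up in a mapping. The single-level `frames` dictionary, the 4 KiB page size, and the names `translate` and `TranslationFault` are assumptions made for the example; real hardware walks multi-level tables and caches results in a TLB.

```python
PAGE_SIZE = 4096  # assumed page size (4 KiB)


class TranslationFault(Exception):
    """Stands in for the fault raised when no mapping exists."""


# Hypothetical mapping: virtual page number -> physical frame number.
frames = {
    0x00400: 0x1A2B3,
    0x00401: 0x00042,
}


def translate(virtual_address):
    """Translate a virtual address to a physical one, or raise a fault."""
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    if vpn not in frames:
        raise TranslationFault(f"no mapping for {virtual_address:#x}")
    # The offset within the page is carried over unchanged.
    return frames[vpn] * PAGE_SIZE + offset


print(hex(translate(0x00400123)))  # prints 0x1a2b3123
```

Only the page number is translated; the low 12 bits pass through untouched, which is why mappings and permissions are managed at page granularity.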
If the fault was an attempt to request something that theoretically exists (say, part of a mapped file) but is not currently present in physical RAM, the operating system's virtual memory implementation can solve the problem by allocating some physical RAM and copying the appropriate disk blocks into it.
However, if it is a request for something that does not exist, it cannot be satisfied and will have to be handled as a program fault.
A request to write to something where writing is not allowed would be handled in a similar manner.
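The three cases above (mapped but not resident, not mapped at all, write where writing is not allowed) can be sketched as a toy fault handler. This is only a model of the decision: `Region`, `regions`, `resident` and `handle_fault` are hypothetical names, the addresses are made up, and a real kernel has further cases (copy-on-write, stack growth, and so on) that are left out here.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size (4 KiB)


@dataclass
class Region:
    start: int      # first address of the mapping
    end: int        # one past the last address
    writable: bool
    backing: str    # label for where the data comes from


# Hypothetical address space of one process.
regions = [
    Region(0x400000, 0x402000, writable=False, backing="program file"),
    Region(0x600000, 0x601000, writable=True, backing="zero-filled memory"),
]

resident = set()  # page numbers currently held in physical RAM


def handle_fault(address, is_write):
    """Decide what to do about a fault reported by the hardware."""
    region = next((r for r in regions if r.start <= address < r.end), None)
    if region is None:
        return "program fault: address is not mapped"
    if is_write and not region.writable:
        return "program fault: write to read-only mapping"
    # The address is legitimate; bring the page in and let the access retry.
    resident.add(address // PAGE_SIZE)
    return f"resolved: page loaded from {region.backing}"
```

For example, `handle_fault(0x400010, is_write=False)` resolves the fault by loading the page, whereas `handle_fault(0x0, is_write=False)` and `handle_fault(0x400010, is_write=True)` both end as program faults.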
Off the top of my head, I'm not sure whether attempts to execute non-executable memory are always detected in the MMU or, on some architectures, more in the CPU itself (on x86-64, at least, the no-execute bit is part of the page table entry); how an instruction cache, if present, fits into that could also complicate things. However, the end result would be similar: a fault reported to the kernel that an illegal execution attempt has occurred, which the kernel would typically treat as a program fault.
In summary, the model is that the simpler hardware layers tell the kernel that something unusual has happened which the hardware cannot deal with by itself using its current configuration. The operating system then decides whether what was attempted can and should occur; if so, it updates the hardware configuration to make this possible. If what was attempted should not be permitted, a program fault is declared. There are additional possibilities too: for example, a virtualization layer could decide to emulate the requested operation rather than literally performing it, preserving some isolation from the hardware.