I want to be sure that I understand pipeline barriers correctly. So barriers are able to synchronize two command buffers provided the source stage of the second barrier is later than the destination stage of the first barrier. Is this correct? Of course I will need to use semaphores if the command buffers execute during different iterations of the pipeline.
It seems to me that synchronisation is the hardest part to grasp in Vulkan. IMO the specification isn't clear enough about it.
The 1.0.35 Vulkan specification has improved wording that makes this clear:
Note that there is no requirement on the source or destination stage. You can synchronize with a source as fragment shader and destination as vertex shader just fine.
Preamble:
Most of what applies to Vulkan Pipeline Barriers applies to generic barriers and memory barriers, so you can start there to build your intuition.
I would note, though the specification is not a tutorial, it is reasonably clear and readable. Synchronization is perhaps the hardest part and the description in specification mirrors that. On top of that, especially memory barriers are novel to most (they are usualy shielded from such concept by higher language compiler).
Needed definitions:
Pipeline is abstract scheme of how a unit of work is processed. There are sort of four types (though Vulkan does not say vendors how to do things as long as they follow the rules):
There are special stages TOP (before anything is done), BOTTOM (after everything is finished), and ALL (which is the same as bitfield with all stages set).
(Action) command is a command that needs (one or more) passes through the pipeline. It must be recorded to command buffer (with the exception of the host reads and writes through
vkMapMemory()
).Command buffer is some sequence of commands (in recorded order!). And queue is too a sequence of recorded commands (concatenated from submited command buffers).
The queue has some leeway in which order it executes the commands (it may reorder commands as long as the user-set state is preserved) and also may overlap commands (e.g. execute VS of next command before finishing FS of previous command). User defined synchronization primitives set a boundaries to this leeway. (There are also some implicit guarantees -- but better to not rely on them and oversynchronize until confident)
My take on explaining Pipeline Barriers:
(Maybe unfortunately) the Pipeline Barriers amalgamates three separate aspects -- execution barrier, memory barrier and layout transition (if it's image).
The execution barrier part assures that all commands recorded before the Barrier reached in exececution at least the specified pipeline stage (or stages) in
srcStageMask
before any of the commands recorded after the Barrier starts executing their specified stage (or stages) indstStageMask
.It does handle only execution dependency and not memory! The memory barrier part assures that memory (caches) are properly flushed and invalidated somewhere in between that execution barrier dependency (i.e. after the depending and before the dependant commands and stages).
You provide what kind of memory dependency it is and between what kind of sources/consumers (so the driver can choose appropriate action without remembering the state itself). Typicaly write-read dependency (read-read and read-write do not need any memory synchronization and write-write does not usually make much sense -- why would you overwrite some data without reading them first).
Different data layout in memory may be advantegeous (or even necessery) on some HW. In the same time the memory dependency is handeled, the data is reordered to adhere to the new specified layout.