I'm trying to port my OpenGL 3D game engine to Vulkan. There are a large number of 3D objects in the game scene, and each has its own attributes (model matrix, lights, etc.). The objects are completely dynamic, which means some 3D objects may come in and others may be removed during gameplay. With OpenGL, I grouped each 3D object's attributes into a uniform buffer in the shader (code simplified):
layout(std140, set = 0, binding = 0) uniform object_attrib
{
    vec3 light_pos;
    vec3 light_color;
    mat4 model;
    mat4 view_projection;
    ...
} params;
What I'm trying to do now is use this single uniform buffer for every 3D object in the game scene when rendering them with Vulkan.
I'm using a single Vulkan render pass. Between the begin-render-pass and end-render-pass commands, I use a for-each loop to go over each 3D object and do the following to render it. See the pseudocode below.
vkBeginCommandBuffer(cmdBuffer, ...);
vkCmdBeginRenderPass(cmdBuffer, ...);
for (object3D obj : scene->objects)
{
    // Step 1 - update object's uniform data by memcpy()
    _updateUniformBuffer(obj);
    // Step 2 - build draw command for this object
    // bind vertex buffer, bind index buffer, bind pipeline, ..., draw
    _buildDrawCommands(obj);
}
vkCmdEndRenderPass(cmdBuffer);
vkEndCommandBuffer(cmdBuffer);
vkQueueSubmit(...); // Finally, submit the commands to the queue to render the scene
Obviously, my solution will not work, since all Vulkan commands in the buffer are executed on the GPU only after vkQueueSubmit() is called. But the call to _updateUniformBuffer(obj) (a memcpy(...)) is interleaved with command recording and is executed immediately, so the ordering is wrong and the objects do not get their own attributes.
So my question is: what is the proper way in Vulkan to update a uniform buffer repeatedly for each object inside a single render pass, while making sure each object gets its correct attribute data?
Before posting this question, I considered the following solutions, but none of them seems to be a good one:
- Using a render pass per object, with a fence to make sure one object is completely rendered before I start rendering the next one. If there are 1000 objects, there will be 1000 render passes per frame? This is impossible.
- Can I submit a command buffer repeatedly inside one render pass? I mean submitting the command buffer right after the draw command for one object is built, using a fence to make sure rendering is done, then going on to the next object. This would mean a single render pass and 1000 vkQueueSubmit() calls.
- Using a dynamic uniform buffer, i.e. creating a huge uniform buffer that contains the data for 1000 objects. It is difficult to implement since the number of objects is dynamic.
- Using push constants? This is also impossible since the max data size is only 128 bytes.
Because you're recording the draw commands, and their input data in the form of uniforms, for all objects in the scene before any of them execute and read their input data, there is no way around having storage for all versions of the uniform buffer allocated somewhere. OpenGL ES drivers do this for you: when you update uniforms, they're internally allocating new space, writing the new uniforms into that, and then updating an internal pointer so that the next call will use the new uniform data instead of the previous uniform data.
In Vulkan, you get to do that yourself, and your third idea is closest to the right way. There are a few variations, but one of the most straightforward is:
Create a large VkBuffer and bind it to memory. It should probably be large enough to handle all of the uniform data for a typical/average frame. Starting with an offset of zero, for each draw, write the new uniforms at the current offset, re-bind the descriptor set (its binding declared as VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER_DYNAMIC) with a dynamic offset pointing at the new uniform data, and then advance the offset so the next draw's uniforms will be placed after the ones you just used. Each dynamic offset must be a multiple of the device's minUniformBufferOffsetAlignment limit.
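As a rough illustration of the per-draw bookkeeping described above, here is a minimal C++ sketch of the offset logic only. It deliberately contains no Vulkan calls so it can be compiled on its own; `UniformArena`, `ObjectParams` and `alignUp` are hypothetical names, `ObjectParams` only mirrors the four members shown in the question's block, and the mapped pointer is assumed to come from a persistently mapped, host-visible buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical CPU-side mirror of the first four members of the question's
// std140 block. Under std140 a vec3 is aligned to 16 bytes, hence float[4].
struct ObjectParams {
    float light_pos[4];
    float light_color[4];
    float model[16];
    float view_projection[16];
};

// Round value up to the next multiple of alignment (must be a power of two).
inline uint32_t alignUp(uint32_t value, uint32_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Hands out aligned sub-ranges of one mapped uniform buffer, front to back.
class UniformArena {
public:
    static constexpr uint32_t kFull = UINT32_MAX;

    // minAlignment is assumed to be the device's
    // VkPhysicalDeviceLimits::minUniformBufferOffsetAlignment.
    UniformArena(uint8_t* mapped, uint32_t capacity, uint32_t minAlignment)
        : mapped_(mapped), capacity_(capacity), alignment_(minAlignment) {
        assert(minAlignment != 0 && (minAlignment & (minAlignment - 1)) == 0);
    }

    // Copies size bytes into the buffer and returns the offset to use as the
    // dynamic offset for the next draw, or kFull if there is no room left.
    uint32_t push(const void* data, uint32_t size) {
        uint32_t offset = alignUp(head_, alignment_);
        if (offset > capacity_ || size > capacity_ - offset) return kFull;
        std::memcpy(mapped_ + offset, data, size);
        head_ = offset + size;
        return offset;
    }

    // Start over at offset zero; only safe once the GPU no longer reads the data.
    void reset() { head_ = 0; }

private:
    uint8_t* mapped_;
    uint32_t capacity_;
    uint32_t alignment_;
    uint32_t head_ = 0;
};
```

Inside the draw loop, the returned offset would be passed as the single entry of `pDynamicOffsets` in `vkCmdBindDescriptorSets` (with `dynamicOffsetCount` set to 1) before recording the draw for that object.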
At the end of each frame (assuming one command buffer per frame), remember how far you got in the buffer, and associate that with whatever signals the completion of that command buffer (typically the VkFence passed to vkQueueSubmit). That signal will tell you when you can overwrite the region of the buffer used in that frame. If you end up needing more space for uniforms before enough space becomes available again, you can just create a new VkBuffer and start using that, eventually coming back to the original when its data is retired. In this way, you can end up with a dynamically sized ring buffer of uniform data made up of multiple VkBuffers.
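The retirement scheme can be sketched in the same way. The following C++ is one possible bookkeeping for a single buffer, not a complete solution: `UniformRing` and its methods are hypothetical names, it again avoids Vulkan calls so it compiles standalone, and it assumes frames complete in submission order and that the caller polls the per-frame fence before calling `retireThrough`.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Ring allocator over one uniform buffer. Space used by a frame is only
// reused after that frame has been reported as finished on the GPU.
class UniformRing {
public:
    static constexpr uint64_t kNoSpace = UINT64_MAX;

    UniformRing(uint64_t capacity, uint64_t alignment)
        : capacity_(capacity), alignment_(alignment) {
        assert(capacity > 0 && alignment > 0);
    }

    // Returns an aligned offset for size bytes, or kNoSpace if the data of
    // frames still in flight is in the way.
    uint64_t allocate(uint64_t size) {
        uint64_t head = written_ % capacity_;
        uint64_t offset = (head + alignment_ - 1) / alignment_ * alignment_;
        uint64_t padding = offset - head;
        if (offset + size > capacity_) {
            // Does not fit before the end: skip the tail and wrap to zero.
            padding = capacity_ - head;
            offset = 0;
        }
        if (written_ - retired_ + padding + size > capacity_) return kNoSpace;
        written_ += padding + size;
        return offset;
    }

    // Call after recording a frame: remember how far this frame got.
    void endFrame(uint64_t frameId) { frames_.push_back({frameId, written_}); }

    // Call once the completion signal for frameId has been observed.
    void retireThrough(uint64_t frameId) {
        while (!frames_.empty() && frames_.front().id <= frameId) {
            retired_ = frames_.front().end;
            frames_.pop_front();
        }
    }

private:
    struct FrameMark { uint64_t id; uint64_t end; };
    uint64_t capacity_;
    uint64_t alignment_;
    uint64_t written_ = 0;  // total bytes handed out, including padding
    uint64_t retired_ = 0;  // total bytes known to be free again
    std::deque<FrameMark> frames_;
};
```

When `allocate` returns `kNoSpace`, that is the point where the approach above would create an additional VkBuffer (with its own ring) rather than wait for the GPU.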