I am an amateur in OpenGL, and for this reason I am seeking to learn only modern OpenGL, the 4.x stuff. Once I had completed basic tutorials (rotating cubes, for example) I decided I would try to create a voxel-based program dealing solely with cubes. The goals of this program were to be fast, use limited CPU power and memory, and be dynamic, so that the map size can change and blocks are only drawn when the array says the corresponding cell is filled.
I have one VBO with the vertices and indices of a cube built out of triangles. At the beginning of the render function I tell OpenGL which shaders to use and then bind the VBO; once that is complete I execute this loop.
Draw Cube Loop:
// x_max, y_max and z_max are the dimensions of the matrix created to store the voxel status in
// The method I use for getting and setting entries in the map is very efficient, so I have not included it in this example
for(int z = -(z_max / 2); z < z_max - (z_max / 2); z++)
{
    for(int y = -(y_max / 2); y < y_max - (y_max / 2); y++)
    {
        for(int x = -(x_max / 2); x < x_max - (x_max / 2); x++)
        {
            DrawCube(x, y, z);
        }
    }
}
Cube.c
#include "include/Project.h"
void CreateCube()
{
    const Vertex VERTICES[8] =
    {
        { { -.5f, -.5f,  .5f, 1 }, { 0, 0, 1, 1 } },
        { { -.5f,  .5f,  .5f, 1 }, { 1, 0, 0, 1 } },
        { {  .5f,  .5f,  .5f, 1 }, { 0, 1, 0, 1 } },
        { {  .5f, -.5f,  .5f, 1 }, { 1, 1, 0, 1 } },
        { { -.5f, -.5f, -.5f, 1 }, { 1, 1, 1, 1 } },
        { { -.5f,  .5f, -.5f, 1 }, { 1, 0, 0, 1 } },
        { {  .5f,  .5f, -.5f, 1 }, { 1, 0, 1, 1 } },
        { {  .5f, -.5f, -.5f, 1 }, { 0, 0, 1, 1 } }
    };
    const GLuint INDICES[36] =
    {
        0,2,1, 0,3,2,
        4,3,0, 4,7,3,
        4,1,5, 4,0,1,
        3,6,2, 3,7,6,
        1,6,5, 1,2,6,
        7,5,6, 7,4,5
    };
    ShaderIds[0] = glCreateProgram();
    ExitOnGLError("ERROR: Could not create the shader program");
    {
        ShaderIds[1] = LoadShader("FragmentShader.glsl", GL_FRAGMENT_SHADER);
        ShaderIds[2] = LoadShader("VertexShader.glsl", GL_VERTEX_SHADER);
        glAttachShader(ShaderIds[0], ShaderIds[1]);
        glAttachShader(ShaderIds[0], ShaderIds[2]);
    }
    glLinkProgram(ShaderIds[0]);
    ExitOnGLError("ERROR: Could not link the shader program");
    ModelMatrixUniformLocation = glGetUniformLocation(ShaderIds[0], "ModelMatrix");
    ViewMatrixUniformLocation = glGetUniformLocation(ShaderIds[0], "ViewMatrix");
    ProjectionMatrixUniformLocation = glGetUniformLocation(ShaderIds[0], "ProjectionMatrix");
    ExitOnGLError("ERROR: Could not get shader uniform locations");
    glGenVertexArrays(1, &BufferIds[0]);
    ExitOnGLError("ERROR: Could not generate the VAO");
    glBindVertexArray(BufferIds[0]);
    ExitOnGLError("ERROR: Could not bind the VAO");
    glEnableVertexAttribArray(0);
    glEnableVertexAttribArray(1);
    ExitOnGLError("ERROR: Could not enable vertex attributes");
    glGenBuffers(2, &BufferIds[1]);
    ExitOnGLError("ERROR: Could not generate the buffer objects");
    glBindBuffer(GL_ARRAY_BUFFER, BufferIds[1]);
    glBufferData(GL_ARRAY_BUFFER, sizeof(VERTICES), VERTICES, GL_STATIC_DRAW);
    ExitOnGLError("ERROR: Could not bind the VBO to the VAO");
    glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, sizeof(VERTICES[0]), (GLvoid*)0);
    glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, sizeof(VERTICES[0]), (GLvoid*)sizeof(VERTICES[0].Position));
    ExitOnGLError("ERROR: Could not set VAO attributes");
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, BufferIds[2]);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(INDICES), INDICES, GL_STATIC_DRAW);
    ExitOnGLError("ERROR: Could not bind the IBO to the VAO");
    glBindVertexArray(0);
}
void DestroyCube()
{
    glDetachShader(ShaderIds[0], ShaderIds[1]);
    glDetachShader(ShaderIds[0], ShaderIds[2]);
    glDeleteShader(ShaderIds[1]);
    glDeleteShader(ShaderIds[2]);
    glDeleteProgram(ShaderIds[0]);
    ExitOnGLError("ERROR: Could not destroy the shaders");
    glDeleteBuffers(2, &BufferIds[1]);
    glDeleteVertexArrays(1, &BufferIds[0]);
    ExitOnGLError("ERROR: Could not destroy the buffer objects");
}
void DrawCube(float x, float y, float z)
{
    ModelMatrix = IDENTITY_MATRIX;
    TranslateMatrix(&ModelMatrix, x, y, z);
    TranslateMatrix(&ModelMatrix, MainCamera.x, MainCamera.y, MainCamera.z);
    glUniformMatrix4fv(ModelMatrixUniformLocation, 1, GL_FALSE, ModelMatrix.m);
    glUniformMatrix4fv(ViewMatrixUniformLocation, 1, GL_FALSE, ViewMatrix.m);
    ExitOnGLError("ERROR: Could not set the shader uniforms");
    glDrawElements(GL_TRIANGLES, 36, GL_UNSIGNED_INT, (GLvoid*)0);
    ExitOnGLError("ERROR: Could not draw the cube");
}
The vertex shader only handles rotation and translation of vertices, and the fragment shader only deals with colour; they are not expensive to run, so they are not the bottleneck.
How can this code be improved to render more efficiently and take full advantage of modern OpenGL features to decrease overhead?
P.S. I am not looking for a book, a tool, or an off-site resource as an answer. I have used backface culling and the OpenGL depth test to try to improve speed, however they haven't made a dramatic difference: it is still taking ~50 ms to render a frame, and that is too much for a voxel grid of 32*32*32.
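(For reference, the culling and depth-test state I set is nothing exotic, roughly the standard calls:)

glEnable(GL_CULL_FACE);      /* skip triangles that face away from the camera */
glCullFace(GL_BACK);
glFrontFace(GL_CCW);         /* GL's default winding                          */
glEnable(GL_DEPTH_TEST);     /* reject fragments behind already drawn voxels  */
glDepthFunc(GL_LESS);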
Here is a screenshot of what I am doing:
And here is a link to the full code:
That is because you are doing this the wrong way. You are calling some function DrawCube 32^3 times, which is too big an overhead (especially if it changes the matrices). That most likely takes much, much more time than the rendering itself. You should pass all the rendering data at once if possible, for example as a texture array or a VBO containing all the cubes, and you should do all the work inside shaders (even generating the cubes ...).
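For example, a minimal sketch of the "one VBO with all the cubes" idea. It assumes a GetVoxel(x,y,z) accessor like the asker describes, that the Vertex type has a float[4] Position as in the question's initializers, and that the cube corner/index tables from Cube.c are hoisted out of CreateCube; all those names are assumptions, not the asker's actual code:

#include <stdlib.h>
#include "include/Project.h"                        /* Vertex type and GL headers, as in the question */

extern int GetVoxel(int x, int y, int z);           /* assumed accessor: 1 = filled, 0 = empty        */
extern const Vertex CUBE_VERTICES[8];               /* the 8 corners from Cube.c, made global         */
extern const GLuint CUBE_INDICES[36];               /* the 36 indices from Cube.c, made global        */

void BuildVolumeMesh(int x_max, int y_max, int z_max,
                     Vertex **out_verts, GLsizei *out_vert_count,
                     GLuint **out_inds,  GLsizei *out_index_count)
{
    size_t max_cubes = (size_t)x_max * y_max * z_max;
    Vertex *verts = malloc(max_cubes * 8  * sizeof *verts);
    GLuint *inds  = malloc(max_cubes * 36 * sizeof *inds);
    GLsizei nv = 0, ni = 0;

    for (int z = 0; z < z_max; z++)
    for (int y = 0; y < y_max; y++)
    for (int x = 0; x < x_max; x++)
    {
        if (!GetVoxel(x, y, z))
            continue;                               /* only emit filled cells         */
        for (int i = 0; i < 8; i++)                 /* copy corners, offset by cell   */
        {
            verts[nv + i] = CUBE_VERTICES[i];
            verts[nv + i].Position[0] += (float)(x - x_max / 2);   /* Position assumed float[4] */
            verts[nv + i].Position[1] += (float)(y - y_max / 2);
            verts[nv + i].Position[2] += (float)(z - z_max / 2);
        }
        for (int i = 0; i < 36; i++)                /* rebase this cube's indices     */
            inds[ni + i] = CUBE_INDICES[i] + (GLuint)nv;
        nv += 8;
        ni += 36;
    }
    *out_verts = verts;  *out_vert_count  = nv;
    *out_inds  = inds;   *out_index_count = ni;
    /* Upload once with glBufferData and render the whole volume with a single
       glDrawElements(GL_TRIANGLES, *out_index_count, GL_UNSIGNED_INT, 0). */
}

The mesh only has to be rebuilt when a block changes, not every frame, so the per-frame cost collapses to a single draw call.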
You did not specify what technique you want to use for rendering your volume. There are many options; here are some that are usually used:
Are your cubes transparent or solid? If solid, why are you rendering 32^3 cubes instead of only the ~32^2 visible ones? There are ways to select only the visible cubes before rendering (a simple test is sketched below) ... My best bet would be to use ray tracing and do the rendering inside the fragment shader (no cube meshes, just an inside-cube test). But for starters, the easier thing to implement would be a VBO with all the cubes inside as a mesh. You can also have just points in the VBO and emit the cubes in the geometry shader later ...
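For the "only visible cubes" part, the cheapest test is to skip every solid cube that is completely surrounded by solid neighbours. A sketch, assuming the same GetVoxel(x,y,z) accessor as above and that it returns 0 for coordinates outside the grid:

int IsVoxelVisible(int x, int y, int z)
{
    if (!GetVoxel(x, y, z))
        return 0;                                   /* empty cells are never drawn */
    /* visible only if at least one of the six neighbours is empty */
    return !GetVoxel(x - 1, y, z) || !GetVoxel(x + 1, y, z)
        || !GetVoxel(x, y - 1, z) || !GetVoxel(x, y + 1, z)
        || !GetVoxel(x, y, z - 1) || !GetVoxel(x, y, z + 1);
}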
Here is a collection of related Q&As of mine that could help with each of the techniques ...
Ray tracing - the sphere() function; a volume ray tracer is a magnitude simpler than a mesh ray tracer.
Cross section - this is also a magnitude simpler for a volume and in 3D ...
If you need a starting point for GLSL, take a look at this:
[Edit1] GLSL example
Well, I managed to put together a very simplified example of GLSL volumetric ray tracing without refractions or reflections. The idea is to cast a ray for each camera pixel in the vertex shader and test, inside the fragment shader, which volume grid cell and which side of the voxel cube it hits. To pass the volume I used a GL_TEXTURE_3D without mipmaps and with GL_NEAREST for s,t,r.
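Roughly, such a volume texture can be created like this (a simplified sketch, not the exact code from my engine; the GL_R8 one-byte-per-voxel format and the function name are just for illustration):

GLuint UploadVolumeTexture(const unsigned char *voxels, int n)   /* n = 32 here, voxels holds n*n*n cells */
{
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_3D, tex);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);   /* no mipmaps, no filtering */
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_R, GL_CLAMP_TO_EDGE);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);                               /* rows are not 4-byte aligned */
    glTexImage3D(GL_TEXTURE_3D, 0, GL_R8, n, n, n, 0,
                 GL_RED, GL_UNSIGNED_BYTE, voxels);
    return tex;
}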
This is how it looks. I encapsulated the CPU-side code into this C++/VCL code:
The volume is initialized and used like this:
The vol.glsl_draw() call renders the stuff ... Do not forget to call gl_exit before shutting down the app.
Here is the vertex shader:
And the fragment shader:
As you can see, it is very similar to the mesh ray tracer I linked above (it was derived from it). The ray tracer is simply this Doom technique ported to 3D.
I used my own engine and VCL, so you need to port it to your environment (AnsiString strings, shader loading/compiling/linking, and list<>); for more info see the simple GL... link. Also, I mix old GL 1.0 and core GLSL stuff, which is not recommended (I wanted to keep it as simple as I could), so you should convert the single Quad to a VBO.
The glsl_draw() call requires that the shaders are already linked and bound, where ShaderProgram is the id of the shader program.
The volume is mapped from (0.0,0.0,0.0) to (1.0,1.0,1.0). The camera is in the form of a direct matrix tm_eye. The reper class is just my own 4x4 transform matrix holding both the direct (rep) and inverse (inv) matrix, something like GLM.
The volume resolution is set in gl_init(), hardcoded to 32x32x32, so just change the line i=32 to whatever you need.
The code is not optimized nor heavily tested, but it looks like it works. The timings in the screenshot do not tell much, as there is huge overhead during runtime because this is part of a larger app. Only the tim value is more or less reliable, but it does not change much with bigger resolutions (probably until some bottleneck is hit, like memory size or screen resolution vs. frame rate). Here is a screenshot of the whole app (so you have an idea what else is running):

If you are doing separate draw calls and invoking shader execution for each individual cube, that is going to be a massive performance loss. I would definitely recommend instancing - this way your code can issue a single draw call and all the cubes will be rendered.
Look up the documentation for glDrawElementsInstanced. However, this approach also means that you will have to keep a "buffer" of matrices, one for each voxel cube, and access the correct one in the shader by using gl_InstanceID as the index.
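A sketch of the instancing route. This variant streams one vec3 offset per cube through an instanced attribute instead of a full matrix per cube, which is enough for axis-aligned unit voxels; BufferIds is the cube VAO from the question, while offsets, nCubes, the function names, and attribute location 2 are assumed names, not the asker's code:

#include "include/Project.h"                          /* GL headers and BufferIds, as in the question */

static GLuint instanceVBO;

/* One-time setup: upload one xyz offset per filled voxel and attach it to the
 * cube VAO as an instanced attribute (location 2 assumed free). */
void SetupInstanceBuffer(const GLfloat *offsets, GLsizei nCubes)
{
    glGenBuffers(1, &instanceVBO);
    glBindBuffer(GL_ARRAY_BUFFER, instanceVBO);
    glBufferData(GL_ARRAY_BUFFER, nCubes * 3 * sizeof(GLfloat), offsets, GL_STATIC_DRAW);

    glBindVertexArray(BufferIds[0]);                  /* the cube VAO from Cube.c   */
    glEnableVertexAttribArray(2);
    glVertexAttribPointer(2, 3, GL_FLOAT, GL_FALSE, 3 * sizeof(GLfloat), (GLvoid*)0);
    glVertexAttribDivisor(2, 1);                      /* advance once per instance  */
    glBindVertexArray(0);
}

/* Per frame: every cube in a single draw call. */
void DrawAllCubes(GLsizei nCubes)
{
    glBindVertexArray(BufferIds[0]);
    glDrawElementsInstanced(GL_TRIANGLES, 36, GL_UNSIGNED_INT, (GLvoid*)0, nCubes);
}

/* Matching vertex-shader change (GLSL), assuming in_Position is the position input:
 *     layout(location = 2) in vec3 InstanceOffset;
 *     ...
 *     gl_Position = ProjectionMatrix * ViewMatrix * vec4(in_Position.xyz + InstanceOffset, 1.0);
 */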
Regarding the depth buffer, there will be savings in your rendering if the cube instances are somehow sorted front-to-back from the camera, so you get the performance benefit of an early-z depth test failure for any fragment that lies behind an already-rendered voxel cube.
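A rough sketch of such an ordering, reusing the per-instance offsets from the sketch above and an assumed CameraPos: sort by squared distance to the camera, then re-upload the instance buffer.

#include <stdlib.h>

static float CameraPos[3];                            /* assumed camera position    */

/* Compare two cube offsets (3 floats each) by squared distance to the camera. */
static int CompareByDistance(const void *a, const void *b)
{
    const float *pa = a, *pb = b;
    float da = 0.0f, db = 0.0f;
    for (int i = 0; i < 3; i++)
    {
        float ta = pa[i] - CameraPos[i];
        float tb = pb[i] - CameraPos[i];
        da += ta * ta;
        db += tb * tb;
    }
    return (da > db) - (da < db);                     /* nearest first              */
}

/* Sort the instance offsets front-to-back (offsets holds nCubes * 3 floats). */
void SortCubesFrontToBack(float *offsets, size_t nCubes)
{
    qsort(offsets, nCubes, 3 * sizeof(float), CompareByDistance);
}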