How to best write a voxel engine in C with perform

Posted 2019-01-03 03:31

I am an amateur in OpenGL, and for this reason I am seeking to learn only modern OpenGL (the 4.x stuff). Once I had completed the basic tutorials (rotating cubes, for example), I decided to try to create a voxel-based program dealing solely with cubes. The goals of this program were to be fast, to use limited CPU power and memory, and to be dynamic, so that the map size can change and a block is only drawn if the array says it is filled.

I have one VBO with the vertices and indices of a cube built out of triangles. At the beginning of the render function I tell OpenGL which shaders to use and then bind the VBO. Once that is complete, I execute this loop.

Draw Cube Loop:

//The letter_max are the dimensions of the matrix created to store the voxel status in
// The method I use for getting and setting entries in the map are very efficient so I have not included it in this example
for(int z = -(z_max / 2); z < z_max - (z_max / 2); z++)
{
    for(int y = -(y_max / 2); y < y_max - (y_max / 2); y++)
    {
        for(int x = -(x_max / 2); x < x_max - (x_max / 2); x++)
        {
            DrawCube(x, y, z);
        }
    }
} 

Cube.c

#include "include/Project.h"

void CreateCube()
{
    const Vertex VERTICES[8] =
    {
    { { -.5f, -.5f,  .5f, 1 }, { 0, 0, 1, 1 } },
    { { -.5f,  .5f,  .5f, 1 }, { 1, 0, 0, 1 } },
    { {  .5f,  .5f,  .5f, 1 }, { 0, 1, 0, 1 } },
    { {  .5f, -.5f,  .5f, 1 }, { 1, 1, 0, 1 } },
    { { -.5f, -.5f, -.5f, 1 }, { 1, 1, 1, 1 } },
    { { -.5f,  .5f, -.5f, 1 }, { 1, 0, 0, 1 } },
    { {  .5f,  .5f, -.5f, 1 }, { 1, 0, 1, 1 } },
    { {  .5f, -.5f, -.5f, 1 }, { 0, 0, 1, 1 } }
    };

    const GLuint INDICES[36] =
    {
    0,2,1,  0,3,2,
    4,3,0,  4,7,3,
    4,1,5,  4,0,1,
    3,6,2,  3,7,6,
    1,6,5,  1,2,6,
    7,5,6,  7,4,5
    };

    ShaderIds[0] = glCreateProgram();
    ExitOnGLError("ERROR: Could not create the shader program");
    {
    ShaderIds[1] = LoadShader("FragmentShader.glsl", GL_FRAGMENT_SHADER);
    ShaderIds[2] = LoadShader("VertexShader.glsl", GL_VERTEX_SHADER);
    glAttachShader(ShaderIds[0], ShaderIds[1]);
    glAttachShader(ShaderIds[0], ShaderIds[2]);
    }
    glLinkProgram(ShaderIds[0]);
    ExitOnGLError("ERROR: Could not link the shader program");

    ModelMatrixUniformLocation = glGetUniformLocation(ShaderIds[0], "ModelMatrix");
    ViewMatrixUniformLocation = glGetUniformLocation(ShaderIds[0], "ViewMatrix");
    ProjectionMatrixUniformLocation = glGetUniformLocation(ShaderIds[0], "ProjectionMatrix");
    ExitOnGLError("ERROR: Could not get shader uniform locations");

    glGenVertexArrays(1, &BufferIds[0]);
    ExitOnGLError("ERROR: Could not generate the VAO");
    glBindVertexArray(BufferIds[0]);
    ExitOnGLError("ERROR: Could not bind the VAO");

    glEnableVertexAttribArray(0);
    glEnableVertexAttribArray(1);
    ExitOnGLError("ERROR: Could not enable vertex attributes");

    glGenBuffers(2, &BufferIds[1]);
    ExitOnGLError("ERROR: Could not generate the buffer objects");

    glBindBuffer(GL_ARRAY_BUFFER, BufferIds[1]);
    glBufferData(GL_ARRAY_BUFFER, sizeof(VERTICES), VERTICES, GL_STATIC_DRAW);
    ExitOnGLError("ERROR: Could not bind the VBO to the VAO");

    glVertexAttribPointer(0, 4, GL_FLOAT, GL_FALSE, sizeof(VERTICES[0]), (GLvoid*)0);
    glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, sizeof(VERTICES[0]), (GLvoid*)sizeof(VERTICES[0].Position));
    ExitOnGLError("ERROR: Could not set VAO attributes");

    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, BufferIds[2]);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(INDICES), INDICES, GL_STATIC_DRAW);
    ExitOnGLError("ERROR: Could not bind the IBO to the VAO");

    glBindVertexArray(0);
}

void DestroyCube()
{
    glDetachShader(ShaderIds[0], ShaderIds[1]);
    glDetachShader(ShaderIds[0], ShaderIds[2]);
    glDeleteShader(ShaderIds[1]);
    glDeleteShader(ShaderIds[2]);
    glDeleteProgram(ShaderIds[0]);
    ExitOnGLError("ERROR: Could not destroy the shaders");

    glDeleteBuffers(2, &BufferIds[1]);
    glDeleteVertexArrays(1, &BufferIds[0]);
    ExitOnGLError("ERROR: Could not destroy the buffer objects");
}

void DrawCube(float x, float y, float z)
{
    ModelMatrix = IDENTITY_MATRIX;

    TranslateMatrix(&ModelMatrix, x, y, z);
    TranslateMatrix(&ModelMatrix, MainCamera.x, MainCamera.y, MainCamera.z);

    glUniformMatrix4fv(ModelMatrixUniformLocation, 1, GL_FALSE, ModelMatrix.m);
    glUniformMatrix4fv(ViewMatrixUniformLocation, 1, GL_FALSE, ViewMatrix.m);
    ExitOnGLError("ERROR: Could not set the shader uniforms");


    glDrawElements(GL_TRIANGLES, 36, GL_UNSIGNED_INT, (GLvoid*)0);
    ExitOnGLError("ERROR: Could not draw the cube");
}

The vertex shader only handles rotation and transformation of vertices, and the fragment shader only deals with colour. Neither is expensive to run, so they are not the bottleneck.

How can this code be improved to render more efficiently and take full advantage of modern OpenGL features to decrease overhead?

P.S. I am not looking for a book, a tool or an off-site resource as an answer. I have used backface culling and the OpenGL depth test to try to improve speed, but they haven't made a dramatic difference: it still takes ~50 ms to render a frame, and that is too much for a voxel grid of 32*32*32.

Here is a screenshot of what I am doing:

img

And here is a link to the full code:

2 answers
老娘就宠你
Reply #2 · 2019-01-03 03:57

That is because you are doing this the wrong way. You are calling some function DrawCube 32^3 times, which is too much overhead (especially if it changes the matrices). That most likely takes much more time than the rendering itself. You should pass all the rendering data at once if possible, for example as a texture array or as a VBO with all the cubes.
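As a rough illustration of the "VBO with all the cubes" idea, here is a minimal CPU-side sketch in C. It assumes a hypothetical occupancy array `filled` (one byte per voxel) and a made-up `BatchMesh` struct, neither of which is in the question's code; the cube corners and indices are copied from CreateCube(), colours are left out for brevity, and the grid is centred on the origin like the question's draw loop.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical CPU-side batch: positions only, 3 floats per vertex. */
typedef struct {
    float    *vertices;
    uint32_t *indices;
    size_t    vertex_count;
    size_t    index_count;
} BatchMesh;

/* Corner positions and triangle indices copied from CreateCube(). */
static const float CUBE_POS[8][3] = {
    { -.5f, -.5f,  .5f }, { -.5f,  .5f,  .5f }, {  .5f,  .5f,  .5f }, {  .5f, -.5f,  .5f },
    { -.5f, -.5f, -.5f }, { -.5f,  .5f, -.5f }, {  .5f,  .5f, -.5f }, {  .5f, -.5f, -.5f }
};
static const uint32_t CUBE_IDX[36] = {
    0,2,1, 0,3,2,  4,3,0, 4,7,3,  4,1,5, 4,0,1,
    3,6,2, 3,7,6,  1,6,5, 1,2,6,  7,5,6, 7,4,5
};

/* filled holds n*n*n bytes, non-zero = solid, indexed as [(z*n + y)*n + x].
   Returns 0 on success, -1 if memory could not be allocated. */
static int build_batch(const uint8_t *filled, int n, BatchMesh *out)
{
    assert(filled != NULL && out != NULL && n > 0);
    size_t total = (size_t)n * n * n, cubes = 0;
    for (size_t i = 0; i < total; i++)
        if (filled[i]) cubes++;

    out->vertex_count = cubes * 8;
    out->index_count  = cubes * 36;
    /* allocate at least one cube so an empty grid does not call malloc(0) */
    out->vertices = malloc((cubes ? cubes : 1) * 8 * 3 * sizeof(float));
    out->indices  = malloc((cubes ? cubes : 1) * 36 * sizeof(uint32_t));
    if (!out->vertices || !out->indices) {
        free(out->vertices);
        free(out->indices);
        return -1;
    }

    size_t c = 0;                       /* index of the cube being written */
    for (int z = 0; z < n; z++)
        for (int y = 0; y < n; y++)
            for (int x = 0; x < n; x++) {
                if (!filled[((size_t)z * n + y) * n + x]) continue;
                const float off[3] = { (float)(x - n / 2), (float)(y - n / 2), (float)(z - n / 2) };
                for (int v = 0; v < 8; v++)          /* translated copy of the 8 corners */
                    for (int k = 0; k < 3; k++)
                        out->vertices[(c * 8 + v) * 3 + k] = CUBE_POS[v][k] + off[k];
                for (int i = 0; i < 36; i++)         /* indices shifted to this cube's corners */
                    out->indices[c * 36 + i] = (uint32_t)(c * 8) + CUBE_IDX[i];
                c++;
            }
    return 0;
}
```

The two arrays would then be uploaded once with glBufferData and drawn with a single glDrawElements(GL_TRIANGLES, index_count, GL_UNSIGNED_INT, 0) per frame, with the camera handled by the view matrix uniform instead of a per-cube model matrix.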

You should do all the stuff inside shaders (even the cubes ...).

You did not specify what technique you want to use for rendering your volume. There are many options; some that are commonly used:

  • Ray tracing
  • Cross section
  • Sub surface scattering

Are your cubes transparent or solid? If solid, why are you rendering 32^3 cubes instead of only the visible ~32^2? There are ways to select only the visible cubes before rendering ...
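One simple, camera-independent way to discard hidden cubes can be sketched as follows. It is weaker than true visibility from the camera: it only drops voxels whose six face neighbours are all solid. The `filled` array layout and the function names are assumptions made for this sketch. For a completely solid 32^3 grid it leaves 32^3 - 30^3 = 5768 cubes to draw.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* g holds n*n*n bytes, non-zero = solid, indexed as [(z*n + y)*n + x].
   Anything outside the grid counts as empty. */
static int is_solid(const uint8_t *g, int n, int x, int y, int z)
{
    if (x < 0 || y < 0 || z < 0 || x >= n || y >= n || z >= n) return 0;
    return g[((size_t)z * n + y) * n + x] != 0;
}

/* A solid voxel can only be seen if at least one of its 6 face neighbours is empty. */
static int is_exposed(const uint8_t *g, int n, int x, int y, int z)
{
    if (!is_solid(g, n, x, y, z)) return 0;
    return !is_solid(g, n, x - 1, y, z) || !is_solid(g, n, x + 1, y, z)
        || !is_solid(g, n, x, y - 1, z) || !is_solid(g, n, x, y + 1, z)
        || !is_solid(g, n, x, y, z - 1) || !is_solid(g, n, x, y, z + 1);
}

/* Number of voxels that would still need to be drawn. */
static size_t count_exposed(const uint8_t *g, int n)
{
    assert(g != NULL && n > 0);
    size_t count = 0;
    for (int z = 0; z < n; z++)
        for (int y = 0; y < n; y++)
            for (int x = 0; x < n; x++)
                count += (size_t)is_exposed(g, n, x, y, z);
    return count;
}
```

The same is_exposed() test can be used as the filter when filling a batched VBO or an instance buffer, and it only needs to be re-run when the map changes.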

My best bet would be to use ray tracing and render inside the fragment shader (no cube meshes, just an inside-cube test). But for starters, the easier thing to implement would be a VBO with all the cubes inside as a mesh. You can also have just points in the VBO and emit the cubes in the geometry shader later ...

Here is a collection of related QAs of mine that could help with each of the techniques ...

Ray tracing

A volume ray tracer is an order of magnitude simpler than a mesh ray tracer.

Cross section

This is also an order of magnitude simpler for a volume, and in 3D ...

If you need a starting point for GLSL, take a look at this:

[Edit1] GLSL example

Well, I managed to put together a very simplified example of GLSL volumetric ray tracing without refractions or reflections. The idea is to cast a ray for each pixel of the camera in the vertex shader and to test which volume grid cell and which side of the voxel cube it hit inside the fragment shader. To pass the volume I used GL_TEXTURE_3D without mipmaps, with GL_NEAREST filtering and GL_CLAMP_TO_EDGE wrapping for s,t,r. This is how it looks:

screenshot

I encapsulated the CPU-side code in this C++/VCL code:

//---------------------------------------------------------------------------
//--- GLSL Raytrace system ver: 1.000 ---------------------------------------
//---------------------------------------------------------------------------
#ifndef _raytrace_volume_h
#define _raytrace_volume_h
//---------------------------------------------------------------------------
const GLuint _empty_voxel=0x00000000;
class volume
    {
public:
    bool _init;             // has been initiated ?
    GLuint txrvol;          // volume texture at GPU side
    GLuint size,size2,size3;// volume size [voxel] and its powers
    GLuint ***data,*pdata;  // volume 3D texture at CPU side
    reper eye;
    float aspect,focal_length;

    volume()    { _init=false; txrvol=-1; size=0; data=NULL; aspect=1.0; focal_length=1.0; }
    volume(volume& a)   { *this=a; }
    ~volume()   { gl_exit(); }
    volume* operator = (const volume *a) { *this=*a; return this; }
    //volume* operator = (const volume &a) { ...copy... return this; }

    // init/exit
    void gl_init();
    void gl_exit();

    // render
    void gl_draw(); // for debug
    void glsl_draw(GLint ShaderProgram,List<AnsiString> &log);

    // geometry
    void beg();
    void end();
    void add_box(int x,int y,int z,int rx,int ry,int rz,GLuint col);
    void add_sphere(int x,int y,int z,int r,GLuint col);
    };
//---------------------------------------------------------------------------
void volume::gl_init()
    {
    if (_init) return; _init=true;
    int x,y,z; GLint i;
    glGetIntegerv(GL_MAX_3D_TEXTURE_SIZE,&i); size=i; // 3D textures have their own size limit
    i=32;                      if (size>i) size=i; // force 32x32x32 resolution
    size2=size*size;
    size3=size*size2;     pdata     =new GLuint  [size3];
                          data      =new GLuint**[size];
    for (z=0;z<size;z++){ data[z]   =new GLuint* [size];
    for (y=0;y<size;y++){ data[z][y]=pdata+(z*size2)+(y*size); }}
    glGenTextures(1,&txrvol);
    }
//---------------------------------------------------------------------------
void volume::gl_exit()
    {
    if (!_init) return; _init=false;
    int z;
    glDeleteTextures(1,&txrvol);
    // free the row tables while size still holds the real resolution
    for (z=0;z<size;z++){ if (data[z]) delete[] data[z]; }
                          if (data   ) delete[] data;  data =NULL;
                          if (pdata  ) delete[] pdata; pdata=NULL;
    size=0; size2=0; size3=0;
    }
//---------------------------------------------------------------------------
void volume::gl_draw()
    {
    int x,y,z;
    float xx,yy,zz,voxel_size=1.0/float(size);
    reper rep;
    double v0[3],v1[3],v2[3],p[3],n[3],q[3],r,sz=0.5;
    glMatrixMode(GL_PROJECTION);
    glPushMatrix();
    glLoadIdentity();
    glPerspective(2.0*atanxy(focal_length,1.0)*rad,1.0,0.1,100.0);
    glScalef(aspect,1.0,1.0);
//  glGetDoublev(GL_PROJECTION_MATRIX,per);
    glScalef(1.0,1.0,-1.0);
    glMatrixMode(GL_MODELVIEW);
    glPushMatrix(); rep=eye;
    rep.lpos_set(vector_ld(0.0,0.0,-focal_length));
    rep.use_inv(); glLoadMatrixd(rep.inv);

    glBegin(GL_POINTS);
    for (zz=-0.0,z=0;z<size;z++,zz+=voxel_size)
     for (yy=-0.0,y=0;y<size;y++,yy+=voxel_size)
      for (xx=-0.0,x=0;x<size;x++,xx+=voxel_size)
       if (data[z][y][x]!=_empty_voxel)
        {
        glColor4ubv((BYTE*)(&data[z][y][x]));
        glVertex3f(xx,yy,zz);
        }
    glEnd();

    glMatrixMode(GL_MODELVIEW);
    glPopMatrix();
    glMatrixMode(GL_PROJECTION);
    glPopMatrix();
    }
//---------------------------------------------------------------------------
void volume::glsl_draw(GLint ShaderProgram,List<AnsiString> &log)
    {
    GLint ix,i;
    GLfloat n[16];
    AnsiString nam;
    const int txru_vol=0;

    // uniforms
    nam="aspect";       ix=glGetUniformLocation(ShaderProgram,nam.c_str()); if (ix<0) log.add(nam); else glUniform1f(ix,aspect);
    nam="focal_length"; ix=glGetUniformLocation(ShaderProgram,nam.c_str()); if (ix<0) log.add(nam); else glUniform1f(ix,focal_length);
    nam="vol_siz";      ix=glGetUniformLocation(ShaderProgram,nam.c_str()); if (ix<0) log.add(nam); else glUniform1i(ix,size);
    nam="vol_txr";      ix=glGetUniformLocation(ShaderProgram,nam.c_str()); if (ix<0) log.add(nam); else glUniform1i(ix,txru_vol);
    nam="tm_eye";       ix=glGetUniformLocation(ShaderProgram,nam.c_str()); if (ix<0) log.add(nam);
    else{ eye.use_rep(); for (int i=0;i<16;i++) n[i]=eye.rep[i]; glUniformMatrix4fv(ix,1,false,n); }

    glActiveTexture(GL_TEXTURE0+txru_vol);
    glEnable(GL_TEXTURE_3D);
    glBindTexture(GL_TEXTURE_3D,txrvol);

    // this should be a VBO
    glColor4f(1.0,1.0,1.0,1.0);
    glBegin(GL_QUADS);
    glVertex2f(-1.0,-1.0);
    glVertex2f(-1.0,+1.0);
    glVertex2f(+1.0,+1.0);
    glVertex2f(+1.0,-1.0);
    glEnd();

    glActiveTexture(GL_TEXTURE0+txru_vol);
    glBindTexture(GL_TEXTURE_3D,0);
    glDisable(GL_TEXTURE_3D);
    }
//---------------------------------------------------------------------------
void volume::beg()
    {
    if (!_init) return;
    for (int i=0;i<size3;i++) pdata[i]=_empty_voxel;
    }
//---------------------------------------------------------------------------
void volume::end()
    {
    if (!_init) return;
    int z;
    // volume texture init
    glEnable(GL_TEXTURE_3D);
    glBindTexture(GL_TEXTURE_3D,txrvol);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 4);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_S,GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_T,GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_R,GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MAG_FILTER,GL_NEAREST);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MIN_FILTER,GL_NEAREST);
    glTexEnvf(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE,GL_MODULATE);
    glTexImage3D(GL_TEXTURE_3D, 0, GL_RGBA8, size, size, size, 0, GL_RGBA, GL_UNSIGNED_BYTE, pdata);
    glDisable(GL_TEXTURE_3D);
    }
//---------------------------------------------------------------------------
void volume::add_box(int x0,int y0,int z0,int rx,int ry,int rz,GLuint col)
    {
    if (!_init) return;
    int x1,y1,z1,x,y,z;
    // clamp to the last valid index (size-1); the loops below are inclusive
    x1=x0+rx; x0-=rx; if (x0<0) x0=0; if (x1>=size) x1=size-1;
    y1=y0+ry; y0-=ry; if (y0<0) y0=0; if (y1>=size) y1=size-1;
    z1=z0+rz; z0-=rz; if (z0<0) z0=0; if (z1>=size) z1=size-1;
    for (z=z0;z<=z1;z++)
     for (y=y0;y<=y1;y++)
      for (x=x0;x<=x1;x++)
       data[z][y][x]=col;
    }
//---------------------------------------------------------------------------
void volume::add_sphere(int cx,int cy,int cz,int r,GLuint col)
    {
    if (!_init) return;
    int x0,y0,z0,x1,y1,z1,x,y,z,xx,yy,zz,rr=r*r;
    // clamp to the last valid index (size-1); the loops below are inclusive
    x0=cx-r; x1=cx+r; if (x0<0) x0=0; if (x1>=size) x1=size-1;
    y0=cy-r; y1=cy+r; if (y0<0) y0=0; if (y1>=size) y1=size-1;
    z0=cz-r; z1=cz+r; if (z0<0) z0=0; if (z1>=size) z1=size-1;
    for (z=z0;z<=z1;z++)
     for (zz=z-cz,zz*=zz,y=y0;y<=y1;y++)
      for (yy=y-cy,yy*=yy,x=x0;x<=x1;x++)
        {   xx=x-cx;xx*=xx;
        if (xx+yy+zz<=rr)
         data[z][y][x]=col;
        }
    }
//---------------------------------------------------------------------------
#endif
//---------------------------------------------------------------------------

Volume is initiated and used like this:

// [globals]
volume vol;    

// [On init]
// here init OpenGL and extensions (GLEW)
// load/compile/link shaders

// init of volume data
vol.gl_init(); 
vol.beg();
vol.add_sphere(16,16,16,10,0x00FF8040);
vol.add_sphere(23,16,16,8,0x004080FF);
vol.add_box(16,24,16,2,6,2,0x0060FF60);
vol.add_box(10,10,20,3,3,3,0x00FF2020);
vol.add_box(20,10,10,3,3,3,0x002020FF);
vol.end(); // this copies the CPU side volume array to 3D texture

// [on render]
// clear screen what ever
// bind shader
vol.glsl_draw(shader,log); // log is list of strings I use for errors you can ignore/remove it from code
// unbind shader
// add HUD or what ever
// refresh buffers

// [on exit]
vol.gl_exit();
// free what ever you need to like GL,...

vol.glsl_draw() renders the stuff... Do not forget to call gl_exit before shutting down the app.

Here is the vertex shader:

//------------------------------------------------------------------
#version 420 core
//------------------------------------------------------------------
uniform float aspect;
uniform float focal_length;
uniform mat4x4 tm_eye;
layout(location=0) in vec2 pos;

out smooth vec3 ray_pos;    // ray start position
out smooth vec3 ray_dir;    // ray start direction
//------------------------------------------------------------------
void main(void)
    {
    vec4 p;
    // perspective projection
    p=tm_eye*vec4(pos.x/aspect,pos.y,0.0,1.0);
    ray_pos=p.xyz;
    p-=tm_eye*vec4(0.0,0.0,-focal_length,1.0);
    ray_dir=normalize(p.xyz);
    gl_Position=vec4(pos,0.0,1.0);
    }
//------------------------------------------------------------------

And the fragment shader:

//------------------------------------------------------------------
#version 420 core
//------------------------------------------------------------------
// Ray tracer ver: 1.000
//------------------------------------------------------------------
in smooth vec3      ray_pos;    // ray start position
in smooth vec3      ray_dir;    // ray start direction
uniform int         vol_siz;    // square texture x,y resolution size
uniform sampler3D   vol_txr;    // scene mesh data texture
out layout(location=0) vec4 frag_col;
//---------------------------------------------------------------------------
void main(void)
    {
    const vec3 light_dir=normalize(vec3(0.1,0.1,-1.0));
    const float light_amb=0.1;
    const float light_dif=0.5;
    const vec4 back_col=vec4(0.1,0.1,0.1,1.0);  // background color
    const float _zero=1e-6;
    const vec4 _empty_voxel=vec4(0.0,0.0,0.0,0.0);
    vec4 col=back_col,c;
    const float n=vol_siz;
    const float _n=1.0/n;

    vec3  p,dp,dq,dir=normalize(ray_dir),nor=vec3(0.0,0.0,0.0),nnor=nor;
    float l=1e20,ll,dl;

    // Ray trace
    #define castray\
    for (ll=length(p-ray_pos),dl=length(dp),p-=0.0*dp;;)\
        {\
        if (ll>l) break;\
        if ((dp.x<-_zero)&&(p.x<0.0)) break;\
        if ((dp.x>+_zero)&&(p.x>1.0)) break;\
        if ((dp.y<-_zero)&&(p.y<0.0)) break;\
        if ((dp.y>+_zero)&&(p.y>1.0)) break;\
        if ((dp.z<-_zero)&&(p.z<0.0)) break;\
        if ((dp.z>+_zero)&&(p.z>1.0)) break;\
        if ((p.x>=0.0)&&(p.x<=1.0)\
          &&(p.y>=0.0)&&(p.y<=1.0)\
          &&(p.z>=0.0)&&(p.z<=1.0))\
            {\
            c=texture(vol_txr,p);\
            if (c!=_empty_voxel){ col=c; l=ll; nor=nnor; break; }\
            }\
        p+=dp; ll+=dl;\
        }

    // YZ plane voxels hits
    if (abs(dir.x)>_zero)
        {
        // compute start position aligned grid
        p=ray_pos;
        if (dir.x<0.0) { p+=dir*(((floor(p.x*n)-_zero)*_n)-ray_pos.x)/dir.x; nnor=vec3(+1.0,0.0,0.0); }
        if (dir.x>0.0) { p+=dir*((( ceil(p.x*n)+_zero)*_n)-ray_pos.x)/dir.x; nnor=vec3(-1.0,0.0,0.0); }
        // single voxel step
        dp=dir/abs(dir.x*n);
        // Ray trace
        castray;
        }
    // ZX plane voxels hits
    if (abs(dir.y)>_zero)
        {
        // compute start position aligned grid
        p=ray_pos;
        if (dir.y<0.0) { p+=dir*(((floor(p.y*n)-_zero)*_n)-ray_pos.y)/dir.y; nnor=vec3(0.0,+1.0,0.0); }
        if (dir.y>0.0) { p+=dir*((( ceil(p.y*n)+_zero)*_n)-ray_pos.y)/dir.y; nnor=vec3(0.0,-1.0,0.0); }
        // single voxel step
        dp=dir/abs(dir.y*n);
        // Ray trace
        castray;
        }
    // XY plane voxels hits
    if (abs(dir.z)>_zero)
        {
        // compute start position aligned grid
        p=ray_pos;
        if (dir.z<0.0) { p+=dir*(((floor(p.z*n)-_zero)*_n)-ray_pos.z)/dir.z; nnor=vec3(0.0,0.0,+1.0); }
        if (dir.z>0.0) { p+=dir*((( ceil(p.z*n)+_zero)*_n)-ray_pos.z)/dir.z; nnor=vec3(0.0,0.0,-1.0); }
        // single voxel step
        dp=dir/abs(dir.z*n);
        // Ray trace
        castray;
        }

    // final color and lighting output
    if (col!=back_col) col.rgb*=light_amb+light_dif*max(0.0,dot(light_dir,nor));
    frag_col=col;
    }
//---------------------------------------------------------------------------

As you can see, it is very similar to the mesh ray tracer I linked above (it was derived from it). The ray tracer is simply this Doom technique ported to 3D.

I used my own engine and VCL, so you need to port it to your environment (AnsiString strings, shader loading/compiling/linking and list<>); for more info see the simple GL... link. Also, I mix old GL 1.0 and core GLSL stuff, which is not recommended (I wanted to keep it as simple as I could), so you should convert the single quad to a VBO.

glsl_draw() requires that the shaders are already linked and bound; ShaderProgram is the id of the shader program.

The volume is mapped from (0.0,0.0,0.0) to (1.0,1.0,1.0). The camera is in the form of a direct matrix tm_eye. The reper class is just my own 4x4 transform matrix class holding both the direct (rep) and inverse (inv) matrices, something like GLM.
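To make the (0,0,0) to (1,1,1) mapping concrete, here is a small sketch in C that turns a point in the unit cube into a flat index laid out like pdata in the class above (z-major, then y, then x). The function name and the choice of returning -1 for points outside the volume are assumptions of this sketch.

```c
#include <assert.h>

/* Returns the flat index (z*size*size + y*size + x) of the voxel containing
   the point, or -1 if the point lies outside the unit cube. */
static long voxel_index(int size, double px, double py, double pz)
{
    assert(size > 0);
    const double p[3] = { px, py, pz };
    int i[3];
    for (int k = 0; k < 3; k++) {
        if (p[k] < 0.0 || p[k] > 1.0) return -1;
        i[k] = (int)(p[k] * size);              /* truncation equals floor since p[k] >= 0 */
        if (i[k] > size - 1) i[k] = size - 1;   /* p[k] == 1.0 belongs to the last voxel */
    }
    return ((long)i[2] * size + i[1]) * size + i[0];
}
```

For size = 32, the centre of the volume (0.5,0.5,0.5) lands in voxel (16,16,16), which is flat index 16912.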

Volume resolution is set in gl_init() hardcoded to 32x32x32 so just change the line i=32 to what you need.

The code is neither optimized nor heavily tested, but it looks like it works. The timings in the screenshot do not tell much, as there is a huge overhead at runtime because I have this as part of a larger app. Only the tim value is more or less reliable, but it does not change much with bigger resolutions (probably until some bottleneck is hit, like memory size or screen resolution vs. frame rate). Here is a screenshot of the whole app (so you have an idea of what else is running):

IDE

爷的心禁止访问
Reply #3 · 2019-01-03 03:57

If you are doing separate draw calls and invoking shader execution for each specific cube that is going to be a massive perf loss. I would definitely recommend instancing - this way your code can have a single draw call and all cubes will be rendered.

Look up the documentation for glDrawElementsInstanced. However, this approach also means that you will need a "buffer" of matrices, one for each voxel cube, and will have to access each one in the shader, using gl_InstanceID to index into the correct matrix.
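A possible CPU-side half of this is sketched below in C. Note that it swaps the per-cube matrix for a per-instance vec3 offset, since axis-aligned voxel cubes differ only by translation; that is a simplification chosen for the sketch, not what the answer prescribes. The `filled` array layout and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* filled holds n*n*n bytes, non-zero = solid, indexed as [(z*n + y)*n + x].
   out must have room for n*n*n*3 floats. Writes one (x,y,z) offset per solid
   voxel, centred on the origin like the question's draw loop, and returns the
   number of instances written. */
static size_t build_instance_offsets(const uint8_t *filled, int n, float *out)
{
    assert(filled != NULL && out != NULL && n > 0);
    size_t count = 0;
    for (int z = 0; z < n; z++)
        for (int y = 0; y < n; y++)
            for (int x = 0; x < n; x++) {
                if (!filled[((size_t)z * n + y) * n + x]) continue;
                out[count * 3 + 0] = (float)(x - n / 2);
                out[count * 3 + 1] = (float)(y - n / 2);
                out[count * 3 + 2] = (float)(z - n / 2);
                count++;
            }
    return count;
}
```

The offsets would go into their own VBO, be exposed as a vertex attribute (say, location 2, an arbitrary choice) with glVertexAttribDivisor(2, 1) so it advances once per instance, and the whole map would then be drawn with one glDrawElementsInstanced(GL_TRIANGLES, 36, GL_UNSIGNED_INT, 0, count) call. The vertex shader adds the offset to the cube vertex position.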

Regarding the depth buffer, there will be rendering savings if the cube matrices are sorted front-to-back from the camera, since any fragment that lies behind an already-rendered voxel cube can then fail the early-z depth test.
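The front-to-back ordering could be done on the CPU before uploading the per-cube data, roughly like this. The `Instance` struct is a made-up stand-in for whatever per-cube record is used; the squared distance is stored in the record so that the standard qsort comparator needs no extra context.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-cube record: position plus a cached sort key. */
typedef struct {
    float x, y, z;
    float dist2;   /* squared distance to the camera, filled in before sorting */
} Instance;

/* qsort comparator: nearest instance first. */
static int cmp_near_first(const void *a, const void *b)
{
    const Instance *ia = a, *ib = b;
    return (ia->dist2 > ib->dist2) - (ia->dist2 < ib->dist2);
}

/* Sort instances front-to-back as seen from the camera at (cx,cy,cz). */
static void sort_front_to_back(Instance *inst, size_t count, float cx, float cy, float cz)
{
    assert(inst != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        float dx = inst[i].x - cx, dy = inst[i].y - cy, dz = inst[i].z - cz;
        inst[i].dist2 = dx * dx + dy * dy + dz * dz;   /* no sqrt needed for ordering */
    }
    if (count > 1)
        qsort(inst, count, sizeof inst[0], cmp_near_first);
}
```

Re-sorting is only needed when the camera has moved enough to change the order, so it does not have to run every frame.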
