Dynamic tile display optimalization in OpenGL

2019-07-26 13:53发布

问题:

I am working on a tile based, top-down 2D game with dinamically generated terrain, and started (re)writing the graphics engine in OpenGL. The game is written in Java using LWJGL, and I'd prefer it to stay relatively platform-independent, and playable on older computers too.

Currently I'm using immediate mode for drawing, but obviously this is too slow for anything but the simplest scenes.

There are two basic types of objects that are drawn: Tiles, which is the world, and Sprites, which is pretty much everything else (Entities, items, effects, ect).

The tiles are 20*20 px, and are stored in chunks (40*40 tiles). Terrain generation is done in full chunks, like in Minecraft.
The method I use now is iterating over the 9 chunks near the player, and then iterating over each tile inside, drawing one quad for the tile texture, and optional extra quads for features depending on the material. This ends up quite slow, but a simple out-of-view check gives a 5-10x FPS boost.

For optimizing this, I looked into using VBOs and quad strips, but I have a problem when terrain changes. This doesn't happen every frame, but not a very rare event either. A simple method would be dropping and rebuilding a chunk's VBO every time it changes. This doesn't seem the best way though. I read that VBOs can be "dynamic" allowing their content to be changed. How can this be done, and what data can be changed inside them efficiently? Are there any other ways for efficiently drawing the world?

The other type, sprites, are currently drawn with a quad with a texture mapped from a sprite sheet. So by changing texture coordinates, I can even animate them later. Is this the correct way to do the aniamtion though?
Currently even a very high number of sprites won't slow the game down much, and by understanding VBOs, I'll be able to speed them up even more, but I haven't seen any solid and reliable tutorials for an efficient way of doing this. Does anyone know one perhaps?

Thanks for the help!

回答1:

Currently I'm using immediate mode for drawing, but obviously this is too slow for anything but the simplest scenes.

I disagree. Unless you are drawing a lot of tiles (tens of thousands per frame), immediate mode should be just fine for you.

The key is something you will have to be doing to get good performance anyway: texture atlases. All of your tiles should be stored in a single texture. You use texture coordinate to pull different tiles out of that texture when rendering. So if this is what your render loop looks like now:

for(tile in tileList) //Pseudocode. Not actual C++
{
    glBindTexture(GL_TEXTURE_2D, tile.texture);
    glBegin(GL_QUADS);
        glTexCoord2f(0.0f, 0.0f);
        glVertex2fv(tile.lowerLeft);
        glTexCoord2f(0.0f, 1.0f);
        glVertex2fv(tile.upperLeft);
        glTexCoord2f(1.0f, 1.0f);
        glVertex2fv(tile.upperRight);
        glTexCoord2f(1.0f, 0.0f);
        glVertex2fv(tile.lowerRight);
    glEnd();
}

You can convert it into this:

glBindTexture(GL_TEXTURE_2D, allTilesTexture);
glBegin(GL_QUADS);
for(tile in tileList) //Still pseudocode.
{
    glTexCoord2f(tile.texCoord.lowerLeft);
    glVertex2fv(tile.lowerLeft);
    glTexCoord2f(tile.texCoord.upperLeft);
    glVertex2fv(tile.upperLeft);
    glTexCoord2f(tile.texCoord.upperRight);
    glVertex2fv(tile.upperRight);
    glTexCoord2f(tile.texCoord.lowerRight);
    glVertex2fv(tile.lowerRight);
}
glEnd();

If you are already using a texture atlas and still aren't getting acceptable performance, then you can move on to buffer objects and the like. But you won't get any better performance from buffer objects if you don't do this first.

If all of your tiles cannot fit into a single texture, then you will need to do one of two things: use multiple textures (rendering as many tiles with each texture in one glBegin/glEnd pair as possible), or use a texture array. Texture arrays are available in OpenGL 3.0-level hardware only. That means any Radeon HDxxxx or GeForce 8xxxx or better.

You mentioned that you sometimes render "features" on top of tiles. These features likely use blending and different glTexEnv modes from regular tiles. In this case, you need to find ways to group similar features into a single glBegin/glEnd pair.

As you may be gathering from this, the key to performance is minimizing the number of times you call glBindTexture and glBegin/glEnd. Do as much work as possible in each glBegin/glEnd.

If you wish to proceed with a buffer-based approach (and you should only bother if the texture atlas approach didn't get your performance up to par), it's fairly simple. Put all of your tile "chunks" into a single buffer object. Don't make a buffer for each one; there's no real reason to do so, and 40x40 tiles worth of vertex data is only 12,800 bytes. You can put 81 such chunks in a single 1MB buffer. This way, you only have to call glBindBuffer for your terrain. Which again, saves you performance.

I would need to know more about these "features" you sometimes use to suggest a way to optimize them. But as for dynamic buffers, I wouldn't worry. Just use glBufferSubData to update the part of the buffer in question. If this turns out to be slow, there are several options for making it faster that you can employ. But you shouldn't bother unless you know that it is necessary, since they're complex.

Sprites are probably something that benefits the absolute least from a buffer object approach. There's really nothing to be gained by it over immediate mode. Even if you're rendering hundreds of them, each one will have its own transformation matrix. Which means that each one will have to be a separate draw call. So it may as well be glBegin/glEnd.