Full screen background texture with OpenGL perform

Posted 2019-03-29 11:21

Question:

I am quite puzzled by the poor performance I'm seeing when drawing a full screen background using a textured triangle mesh in OpenGL: drawing just the background and nothing else maxes out at 40 fps using the most basic shader, and 50 fps using the default pipeline.

While 40 fps doesn't seem too bad, adding anything else on top of that makes the fps drop, and considering I need to draw 100-200 other meshes on top of that, I end up with a paltry 15 fps that is simply not usable.

I have isolated the relevant code into an Xcode project available here, but the essence of it is the canonical texture map example:

static const GLfloat squareVertices[] = {
    -1.0f, -1.0f,
    1.0f, -1.0f,
    -1.0f,  1.0f,
    1.0f,  1.0f,
};
static const GLfloat texCoords[] = {
    0.125, 1.0,
    0.875, 1.0,
    0.125, 0.0,
    0.875, 0.0
};


glClearColor(0.5f, 0.5f, 0.5f, 1.0f);
glClear(GL_COLOR_BUFFER_BIT);

if ([context API] == kEAGLRenderingAPIOpenGLES2) {
    // Use shader program.
    glUseProgram(program);

    glActiveTexture(GL_TEXTURE0);
    glUniform1i(uniforms[UNIFORM_TEXTURE], 0);
    glBindTexture(GL_TEXTURE_2D, texture);

    // Update attribute values.
    glVertexAttribPointer(ATTRIB_VERTEX, 2, GL_FLOAT, GL_FALSE, 0, squareVertices);
    glEnableVertexAttribArray(ATTRIB_VERTEX);
    glVertexAttribPointer(ATTRIB_TEXCOORD, 2, GL_FLOAT, GL_FALSE, 0, texCoords);
    glEnableVertexAttribArray(ATTRIB_TEXCOORD);
} else {
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    glEnable( GL_TEXTURE_2D );
    glBindTexture(GL_TEXTURE_2D, texture);
    glTexEnvf( GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE );
    glTexParameterf( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR );
    glTexParameterf( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR );
    glTexParameterf( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT );
    glTexParameterf( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT ); 

    glVertexPointer(2, GL_FLOAT, 0, squareVertices);
    glEnableClientState(GL_VERTEX_ARRAY);
    glTexCoordPointer(2, GL_FLOAT, 0, texCoords);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);
}

glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

The vertex shader:

attribute lowp vec4 position;
attribute lowp vec2 tex;

varying lowp vec2 texCoord;

uniform float translate;

void main()
{
    gl_Position = position;
    texCoord = tex;
}

The fragment shader:

varying lowp vec2   texCoord;
uniform sampler2D   texture;

void main()
{
    gl_FragColor = texture2D(texture, texCoord);
}

Dividing the rectangle size by two doubles the frame rate, so the rendering time is clearly dependent on the real estate the drawing takes on the screen. This totally makes sense, but what does not make sense to me is that it doesn't seem possible to cover the whole screen with OpenGL texture-mapped meshes at more than 15 fps.
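The relationship between covered area and frame rate described above can be put into numbers. The following is only a rough sketch: it assumes a 1024×768 screen (the resolution of the iPads mentioned in the answers, not stated in the question) and a purely fill-bound workload. The function names are made up for illustration.

```c
/* Millions of pixels written per second when one quad of the given
   size is drawn once per frame. */
double fill_rate_mpix_per_s(int width, int height, double fps)
{
    return (double)width * (double)height * fps / 1e6;
}

/* Frame rate expected when only a fraction `coverage` (0..1] of the
   screen is drawn, ASSUMING cost is proportional to pixels written. */
double scaled_fps(double full_screen_fps, double coverage)
{
    return full_screen_fps / coverage;
}
```

With these assumptions, `fill_rate_mpix_per_s(1024, 768, 40.0)` is about 31.5 Mpixels/s, and `scaled_fps(40.0, 0.5)` gives 80, which matches the observed doubling when the rectangle is halved.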

Yet there are hundreds of games out there that do it, so it is possible and I must be doing something wrong, but what is it?

Answer 1:

Unfortunately, all I have is my iPad 2 to test this right now (my iPad 1 test unit is sitting at home), and it has ridiculously fast fragment processing. It's being pegged at 60 FPS, with 1400 theoretical FPS in your logging.

However, I ran it through Instruments using the OpenGL ES Driver and Time Profiler instruments, along with the cool new OpenGL ES Analyzer (which comes with Xcode 4). This is what the results from the OpenGL ES Analyzer look like:

Looking at the Tiler Utilization statistic in the OpenGL ES Driver instrument shows the tiler barely being used at all, but the renderer having some use (again, only 5% on my iPad 2). This indicates that the suggestions to use VBOs and indices for your geometry probably won't do much for you.

The one that sticks out is the warning about redundant calls:

You keep binding the framebuffer and setting up the viewport every frame, which according to Time Profiler accounts for 10% of the workload in your application. Commenting out the line

[(EAGLView *)self.view setFramebuffer];

near the beginning of your frame drawing caused the theoretical framerate to jump from 1400 FPS to 27000 FPS on my iPad 2 (as an aside, you should probably measure your rendering as frame time in milliseconds rather than in FPS).
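To illustrate why milliseconds are the more useful unit, here is a minimal sketch that converts the two frame rates quoted above into per-frame times. The function names are invented for this example.

```c
/* Time spent on one frame, in milliseconds, for a given frame rate. */
double frame_time_ms(double fps)
{
    return 1000.0 / fps;
}

/* Milliseconds saved per frame when the rate rises from
   fps_before to fps_after. */
double ms_saved(double fps_before, double fps_after)
{
    return frame_time_ms(fps_before) - frame_time_ms(fps_after);
}
```

Here `frame_time_ms(1400.0)` is roughly 0.71 ms and `frame_time_ms(27000.0)` roughly 0.04 ms, so the jump that looks like a 19x speedup in FPS saves about 0.68 ms per frame. For comparison, a 60 FPS target leaves a budget of about 16.7 ms per frame.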

Again, this is me running tests on the really powerful GPU in the iPad 2, but you should be able to repeat similar steps on the original iPad or any other device to verify this performance bottleneck and potentially highlight others. I've found the new OpenGL ES Analyzer to be really handy in picking up shader-related performance issues.
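One way to avoid the redundant framebuffer and viewport calls without removing them entirely is to remember the last values sent and skip repeats. The sketch below is an assumption about how that could look, not the code from the project in question. So that it compiles without an OpenGL ES context, `stub_bind_framebuffer` and `stub_viewport` are hypothetical stand-ins for `glBindFramebuffer` and `glViewport` that only count calls.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for glBindFramebuffer / glViewport. */
static int bind_calls = 0;
static int viewport_calls = 0;
static void stub_bind_framebuffer(unsigned fbo) { (void)fbo; ++bind_calls; }
static void stub_viewport(int w, int h) { (void)w; (void)h; ++viewport_calls; }

typedef struct {
    bool     valid;          /* false until the first call goes through */
    unsigned fbo;            /* last framebuffer bound */
    int      width, height;  /* last viewport size set */
} RenderTargetCache;

/* Bind the framebuffer and set the viewport only when something changed.
   Returns true if state was actually sent to the driver. */
bool set_render_target(RenderTargetCache *cache,
                       unsigned fbo, int width, int height)
{
    if (cache->valid && cache->fbo == fbo &&
        cache->width == width && cache->height == height) {
        return false;  /* same target as last frame: skip redundant calls */
    }
    stub_bind_framebuffer(fbo);
    stub_viewport(width, height);
    cache->valid  = true;
    cache->fbo    = fbo;
    cache->width  = width;
    cache->height = height;
    return true;
}
```

Calling `set_render_target` once per frame with an unchanged target reaches the driver only on the first frame. The cache has to be reset whenever the framebuffer is recreated, for example after a layout change.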



Answer 2:

A wild guess follows, as I don't have any iPad experience:

According to this benchmark, you can expect about 214 fps in pure fill rate at background resolution.

Did you try disabling texturing, to check whether you're limited by your texture?

Is your texture a non-power-of-two (NPOT) texture? If so, did you try replacing GL_REPEAT in GL_TEXTURE_WRAP_* with GL_CLAMP_TO_EDGE (OpenGL ES has no GL_CLAMP)? Repeating an NPOT texture can cost some performance on some hardware.
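Whether a texture is NPOT can be checked from its dimensions before choosing a wrap mode. This is a small sketch with invented helper names. The actual wrap mode would then be set with `glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE)` and the same for `GL_TEXTURE_WRAP_T`.

```c
#include <stdbool.h>

/* True when n is a power of two (1, 2, 4, 8, ...). Zero is not. */
bool is_power_of_two(unsigned n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* True when either dimension is not a power of two, in which case
   GL_CLAMP_TO_EDGE is the safer wrap mode to try. */
bool should_clamp_to_edge(unsigned width, unsigned height)
{
    return !(is_power_of_two(width) && is_power_of_two(height));
}
```

For example, a 1024×1024 texture passes the power-of-two check, while a 1024×768 one does not.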

Finally, you could try setting the min/mag filters to GL_NEAREST too.