I am trying to implement Blur effect in my game on mobile devices using GLSL shader. I don't have any former experience with writing shaders. And I don't understand if my shader is enough good. Actually I have copyied the GLSL code from a tutorial and I don't know it this tutorial is for vivid demo or also can be used in practice. Here is the code of two pass blur shader that uses Gaussian weights (http://www.cocos2d-x.org/wiki/User_Tutorial-RenderTexture_Plus_Blur):
#ifdef GL_ES
precision mediump float;
#endif
varying vec4 v_fragmentColor;
varying vec2 v_texCoord;
uniform vec2 pixelSize;
uniform vec2 direction;
uniform int radius;
uniform float weights[64];
void main()
{
gl_FragColor = texture2D(CC_Texture0, v_texCoord)*weights[0];
for (int i = 1; i < radius; i++) {
vec2 offset = vec2(float(i)*pixelSize.x*direction.x, float(i)*pixelSize.y*direction.y);
gl_FragColor += texture2D(CC_Texture0, v_texCoord + offset)*weights[i];
gl_FragColor += texture2D(CC_Texture0, v_texCoord - offset)*weights[i];
}
}
I run this shader on each frame update (60 times in a sec) and my game framerate for only one pass drops down to 22 FPS on iPhone 5S (not a bad device). I think this is very-very strange. it seems it has not to much instruction. Why this is so heavy?
P.S. Blur radius is 50, step is 1.
Main reasons why your shader is heavy:
1: This two calculations:
v_texCoord + offset
andv_texCoord - offset
. because the uv coordonates are computed in the fragment shader the texture data has to be loaded from memory on the spot causing cache miss.2:
radius
is way to large.How to make it faster/better:
1: Calculate as much as possible in the vertex shader. Ideally if you calculate all the UV's in the vertex shader the GPU can move the texture memory in cache before calling fragment shaders, drastically improving performance.
2: reduce
Radius
to accommodate let's say 8-16texture2D
calls. This will probably not give you the result you are expecting, and to solve this you can have 2 textures, blurring texture A into B , then blur again B into texture A and so on, as mush as you need. This will give very good results, i remember crisys 1 used it for motion blur , but i can't find the paper.3: eliminate those 64 uniforms, have all the data hardcoded in the shader. I know that this is not that nice but you will gain some extra performance.
4: If you carefully calculate the UV coordinates you can take great advantage of texture interpolation. Basically never sample a pixel on it's center, always sample in between pixels and the hardware will do and avrage of the near 4 pixels:
5: This line:
precision mediump float;
does everything have to bemediump
? I would suggest to remove it and do some testing withlowp
on as much as you can.Edit: For you shader, here is a simplified version of what you need to do:
Vertex shader:
The last operation
pixel_size * 0.5
is required to take maximum advantage of linear interpolation. In this example the position you pick for sampling are trivial but there is an entire discussion on how you should pick your sampling positions that is way out of the scope of this question.Fragment shader:
For this to look good you need to blur the texture multiple times, even if you blur the texture 2 times you should see a notable difference.
To speed up you can: