I want to use two PBOs to read pixels back alternately. I expected the PBO path to be much faster, because glReadPixels should return immediately when a PBO is bound, so much of the transfer time can be overlapped with other work.
Strangely, there seems to be little benefit. Consider this code:
// No PBO bound: glReadPixels writes directly into client memory (buf).
glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, 0);
Timer t; t.start();
glReadPixels(0,0,1024,1024,GL_RGBA, GL_UNSIGNED_BYTE, buf);
t.stop(); std::cout << t.getElapsedTimeInMilliSec() << " ";
// PBO bound: the last argument is a byte offset into the PBO, not a pointer.
glBindBufferARB(GL_PIXEL_PACK_BUFFER_ARB, pbo);
t.start();
glReadPixels(0,0,1024,1024,GL_RGBA, GL_UNSIGNED_BYTE, 0);
t.stop(); std::cout << t.getElapsedTimeInMilliSec() << std::endl;
The results (in milliseconds; left column without PBO, right column with PBO) are:
1.301 1.185
1.294 1.19
1.28 1.191
1.341 1.254
1.327 1.201
1.304 1.19
1.352 1.235
The PBO path is slightly faster, but it is nowhere near an immediate return.
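For context, the two-PBO alternation I am aiming for looks roughly like the sketch below. It only shows the index bookkeeping, with the GL calls indicated in comments; the name `PboPingPong` and its members are hypothetical, not part of any library.

```cpp
#include <cstddef>

// Hypothetical helper: tracks which of two PBOs to read into and which to map.
struct PboPingPong {
    std::size_t frame = 0;

    // PBO to bind as GL_PIXEL_PACK_BUFFER before calling glReadPixels
    // in the current frame (the read is meant to be asynchronous).
    std::size_t packIndex() const { return frame % 2; }

    // PBO that was filled during the previous frame; this is the one to map
    // and copy out of, so the CPU never waits on the read just issued.
    std::size_t mapIndex() const { return (frame + 1) % 2; }

    // Call once at the end of every frame to swap the two roles.
    void advance() { ++frame; }
};
```

Each frame would bind the PBO at `packIndex()` for glReadPixels, map the PBO at `mapIndex()` to fetch the previous frame's pixels, and then call `advance()`.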
My questions are:
- What factors affect glReadPixels performance? Sometimes it costs as much as 10 ms, but here it is 1.3 ms.
- Why does the supposedly immediate return cost as much as 1.2 ms? Is that too long, or is it normal?
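As a rough sanity check on the 1.2 ms figure, the arithmetic below converts the timings into an implied transfer rate. It assumes 4 bytes per pixel (GL_RGBA with GL_UNSIGNED_BYTE); the function names are hypothetical.

```cpp
#include <cstdint>

// Size in bytes of one readback: width * height * bytes per pixel.
constexpr std::uint64_t readbackBytes(std::uint64_t width,
                                      std::uint64_t height,
                                      std::uint64_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

// Transfer rate in GB/s (10^9 bytes per second) implied by moving
// `bytes` in `milliseconds`.
constexpr double impliedGBps(std::uint64_t bytes, double milliseconds) {
    return (static_cast<double>(bytes) / 1e9) / (milliseconds / 1e3);
}

// A 1024x1024 RGBA8 read is exactly 4 MiB.
static_assert(readbackBytes(1024, 1024, 4) == 4194304, "4 MiB per read");
```

With the timings above, `impliedGBps(4194304, 1.2)` comes out at roughly 3.5 GB/s. That looks more like the rate of an actual copy than the cost of merely queuing a command, which is why I suspect the PBO read is not really asynchronous here.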
===========================================================================
By comparing with a demo, I found two factors:
- GL_BGRA is faster than GL_RGBA: 1.3 ms => 1.0 ms (no PBO), 1.2 ms => 0.9 ms (with PBO).
- Using glutInitDisplayMode(GLUT_RGB|GLUT_ALPHA) rather than GLUT_RGBA: 0.9 ms => 0.01 ms. That is the performance I want. On my system, GLUT_RGBA = GLUT_RGB = 0 and GLUT_ALPHA = 8.
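To make the flag values above concrete: because GLUT_RGBA is 0 on my system, passing it alone sets no bits, so it cannot by itself ask for alpha bitplanes; only the GLUT_ALPHA bit does. The sketch below uses local copies of the constants so it compiles without GLUT. The `kGlut*` names and `requestsAlpha` are hypothetical, and the values are the ones reported on my system.

```cpp
// Local stand-ins for the GLUT display-mode flags (values from my system).
constexpr unsigned kGlutRgb   = 0;  // GLUT_RGB
constexpr unsigned kGlutRgba  = 0;  // GLUT_RGBA, same value as GLUT_RGB
constexpr unsigned kGlutAlpha = 8;  // GLUT_ALPHA

// True when the display mode mask has the alpha bit set.
constexpr bool requestsAlpha(unsigned displayMode) {
    return (displayMode & kGlutAlpha) != 0;
}

// GLUT_RGBA on its own does not set the alpha bit...
static_assert(!requestsAlpha(kGlutRgba), "GLUT_RGBA alone sets no alpha bit");
// ...whereas GLUT_RGB | GLUT_ALPHA does.
static_assert(requestsAlpha(kGlutRgb | kGlutAlpha), "GLUT_ALPHA sets it");
```

So the two glutInitDisplayMode calls differ in whether alpha bits are requested for the framebuffer. My guess, which I have not verified, is that without them a GL_RGBA or GL_BGRA read no longer matches the framebuffer layout and needs a conversion.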
That raises two more questions:
- Why is GL_BGRA faster than GL_RGBA? Is this specific to certain platforms, or does it hold on all platforms?
- Why does GLUT_ALPHA matter so much that it changes PBO performance this drastically?