Efficient Bicubic filtering code in GLSL?

I'm wondering if anyone has complete, working, and efficient code to do bicubic texture filtering in glsl. There is this:

http://www.codeproject.com/Articles/236394/Bi-Cubic-and-Bi-Linear-Interpolation-with-GLSL or https://github.com/visionworkbench/visionworkbench/blob/master/src/vw/GPU/Shaders/Interp/interpolation-bicubic.glsl

but both do 16 texture reads where only 4 are necessary:

https://groups.google.com/forum/#!topic/comp.graphics.api.opengl/kqrujgJfTxo

However the method above uses a missing "cubic()" function that I don't know what it is supposed to do, and also takes an unexplained "texscale" parameter.

There is also the NVidia version:

http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter20.html

but I believe this uses CUDA, which is specific to NVidia's cards. I need glsl.

I could probably port the nvidia version to glsl, but thought I'd ask first to see if anyone already has a complete, working glsl bicubic shader.

标签： glsl fragment-shader bicubic

8条回答

女痞

2楼-- · 2019-03-08 17:14

Wow. I recognize the code above (I can not comment w/ reputation < 50) as I came up with it in early 2011. The problem I was trying to solve was related to old IBM T42 (sorry the exact model number escapes me) laptop and it's ATI graphics stack. I developed the code on NV card and originally I used 16 texture fetches. That was kinda of slow but fast enough for my purposes. When someone reported it did not work on his laptop it became apparent that they did not support enough texture fetches per fragment. I had to engineer a work-around and the best I could come up with was to do it with number of texture fetches that would work.

I thought about it like this: okay, so if I handle each quad (2x2) with linear filter the remaining problem is can the rows and columns share the weights? That was the only problem on my mind when I set out to craft the code. Of course they could be shared; the weights are same for each column and row; perfect!

Now I had four samples. The remaining problem was how to correctly combine the samples. That was the biggest obstacle to overcome. It took about 10 minutes with pencil and paper. With trembling hands I typed the code in and it worked, nice. Then I uploaded the binaries to the guy who promised to check it out on his T42 (?) and he reported it worked. The end. :)

I can assure that the equations check out and give mathematically identical results to computing the samples individually. FYI: with CPU it's faster to do horizontal and vertical scan separately. With GPU multiple passes is not that great idea, especially when it's probably not feasible anyway in typical use case.

Food for thought: it is possible to use a texture lookup for the cubic() function. Which is faster depends on the GPU but generally speaking, the sampler is light on the ALU side just doing the arithmetic would balance things out. YMMV.

0人赞添加讨论(0) 举报

做个烂人

3楼-- · 2019-03-08 17:21

I decided to take a minute to dig my old Perforce activities and found the missing cubic() function; enjoy! :)

vec4 cubic(float v)
{
    vec4 n = vec4(1.0, 2.0, 3.0, 4.0) - v;
    vec4 s = n * n * n;
    float x = s.x;
    float y = s.y - 4.0 * s.x;
    float z = s.z - 4.0 * s.y + 6.0 * s.x;
    float w = 6.0 - x - y - z;
    return vec4(x, y, z, w);
}

0人赞添加讨论(0) 举报

Fickle 薄情

4楼-- · 2019-03-08 17:25

The missing function cubic() in JAre's answer could look like this:

vec4 cubic(float x)
{
    float x2 = x * x;
    float x3 = x2 * x;
    vec4 w;
    w.x =   -x3 + 3*x2 - 3*x + 1;
    w.y =  3*x3 - 6*x2       + 4;
    w.z = -3*x3 + 3*x2 + 3*x + 1;
    w.w =  x3;
    return w / 6.f;
}

It returns the four weights for cubic B-Spline.

It is all explained in NVidia Gems.

0人赞添加讨论(0) 举报

放我归山

5楼-- · 2019-03-08 17:29

For anybody interested in GLSL code to do tri-cubic interpolation, ray-casting code using cubic interpolation can be found in the examples/glCubicRayCast folder in: http://www.dannyruijters.nl/cubicinterpolation/CI.zip

edit: The cubic interpolation code is now available on github: CUDA version and WebGL version, and GLSL sample.

0人赞添加讨论(0) 举报

可以哭但决不认输i

6楼-- · 2019-03-08 17:29

I've been using @Maf 's cubic spline recipe for over a year, and I recommend it, if a cubic B-spline meets your needs.

But I recently realized that, for my particular application, it is important for the intensities to match exactly at the sample points. So I switched to using a Catmull-Rom spline, which uses a slightly different recipe like so:

// Catmull-Rom spline actually passes through control points
vec4 cubic(float x) // cubic_catmullrom(float x)
{
    const float s = 0.5; // potentially adjustable parameter
    float x2 = x * x;
    float x3 = x2 * x;
    vec4 w;
    w.x =    -s*x3 +     2*s*x2 - s*x + 0;
    w.y = (2-s)*x3 +   (s-3)*x2       + 1;
    w.z = (s-2)*x3 + (3-2*s)*x2 + s*x + 0;
    w.w =     s*x3 -       s*x2       + 0;
    return w;
}

I found these coefficients, plus those for a number of other flavors of cubic splines, in the lecture notes at: http://www.cs.cmu.edu/afs/cs/academic/class/15462-s10/www/lec-slides/lec06.pdf

0人赞添加讨论(0) 举报

一夜七次

7楼-- · 2019-03-08 17:31

I think it is possible that the Catmull version could be done with 4 texture lookups by (a) arranging the input texture like a chessboard with alternate slots saved as positives and as negatives, and (b) an associated modification of textureBicubic. That would rely on the contributions/weights w.x/w.w always being negative, and the contributions w.y/w.z always being positive. I haven't double-checked if this is true, or exactly how the modified textureBicubic would look.

... I have verified that w contributions do satisfy the +ve -ve rules.

0人赞添加讨论(0) 举报

1 2 下一页

Efficient Bicubic filtering code in GLSL?

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间