A beginner's attempt at image filtering

Published 2020-07-27 02:50

Question:

My question might sound stupid, but please understand that I don't study at a university/college where a professor could give me decent help. I am self-taught and it has hardly been 3-4 months of coding for me, hence I request patience from SO users.

I am trying to code a simple filter function where I define a kernel, a 3x3 2D array. I create another matrix, load, where I would like to store the pixel values of my image array. I face 2 major problems. About the 1st problem I am completely confused and bewildered. I read this brilliant answer to How do I gaussian blur an image without using any in-built gaussian functions?, which mentions:

For pixel 11 you would need to load pixels 0, 1, 2, 10, 11, 12, 20, 21, 22.

You would then multiply pixel 0 by the upper left portion of the 3x3 blur filter, pixel 1 by the top middle, pixel 2 by the top right, pixel 10 by the middle left, and so on.
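
In case it helps to see the indexing, here is a minimal sketch of the neighbour arithmetic the quoted answer describes, assuming the 10-pixel-wide single-channel image of its example:

// Indices of the 3x3 neighbourhood of pixel i in a row-major,
// single-channel image of width W (here W = 10, i = 11).
const int W = 10, i = 11;
int neighbours[9] = { i-W-1, i-W, i-W+1,    //  0,  1,  2
                      i-1,   i,   i+1,      // 10, 11, 12
                      i+W-1, i+W, i+W+1 };  // 20, 21, 22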

I would like to know, for an IplImage with 3 channels, how I store the corresponding pixels [as mentioned in the link above] in my load matrix, because what I am confused about is that there are 3 values [R G B] per pixel, so what am I supposed to multiply with what?

Also, how do I ensure that the pixel accesses do not go out of bounds? Once we are near the edges of the image, the kernel indices can fall outside the image.

#include <opencv2/core/core_c.h>  // legacy OpenCV C API: IplImage, uchar, CvPoint

void filter(const IplImage *img)
{
    unsigned char kernel[3][3] = { { 1, 2, 1 },
                                   { 2, 1, 2 },
                                   { 1, 2, 1 } };
    unsigned char load[3][3] = { { 0 } };  // was load[][3] = {0}, which declares only one row
    int rows = img->height, cols = img->width, row, col;
    uchar *temp_ptr = 0;

    for (row = 0; row < rows; ++row)
    {
        for (col = 0; col < cols; ++col)
        {
            CvPoint pt = cvPoint(col, row);  // CvPoint is (x, y), so col comes first

            // temp_ptr points at the B, G, R bytes of pixel (row, col);
            // widthStep is the number of bytes per image row.
            temp_ptr = &((uchar *)(img->imageData + img->widthStep * row))[col * 3];

            // TODO: fill load with the 3x3 neighbourhood of this pixel
            // and do the multiply-accumulate, which is what I am asking about.
        }
    }
}

Answer 1:

To your RGB question: you apply the kernel separately to each channel. That is, you basically treat the three channels as three independent monochrome images and blur each one.
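
For example, here is a minimal per-channel sketch for the interleaved BGR layout of a 3-channel IplImage; img is the source, dst is an assumed destination image of the same size, and (row, col) is an interior pixel:

// Apply a 3x3 kernel at interior pixel (row, col), once per channel.
// This kernel sums to 13, so the result is divided by 13 to keep the
// intensity range (see the normalisation remark in Answer 2).
int k[3][3] = { {1, 2, 1}, {2, 1, 2}, {1, 2, 1} };
for (int c = 0; c < 3; ++c)                  // c = 0: B, 1: G, 2: R
{
    int sum = 0;
    for (int dr = -1; dr <= 1; ++dr)
        for (int dc = -1; dc <= 1; ++dc)
        {
            const uchar *p = (const uchar *)(img->imageData + (row + dr) * img->widthStep);
            sum += k[dr + 1][dc + 1] * p[(col + dc) * 3 + c];
        }
    uchar *out = (uchar *)(dst->imageData + row * dst->widthStep);
    out[col * 3 + c] = (uchar)(sum / 13);
}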

Ensuring that your pixels don't go out of bounds is in principle simple: before you access a pixel, you test whether it would be out of bounds, and if so, you don't access it. The interesting question is what to do instead. Some options:

- Just don't apply the filter to the border points where the kernel would exceed the image. That means leaving an unblurred border (unless you can afford to lose a one-pixel border, in which case this is probably the best solution).
- Assume a certain colour for outside pixels (e.g. white).
- Don't use those points at all, but work with a reduced kernel containing only the in-bounds entries of the mask. I think this is the best option here. Note that you need to adapt the normalisation factor as well in this case (see the sketch below).
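
A sketch of that third option, with the same IplImage layout as above; c is the channel index, and only the in-bounds kernel taps contribute, so the sum of the weights actually used replaces the fixed normalisation factor:

// 3x3 filter at (row, col) for channel c, skipping out-of-bounds taps
// and renormalising by the kernel weights that were actually applied.
int k[3][3] = { {1, 2, 1}, {2, 1, 2}, {1, 2, 1} };
int sum = 0, weight = 0;
for (int dr = -1; dr <= 1; ++dr)
    for (int dc = -1; dc <= 1; ++dc)
    {
        int r = row + dr, cc = col + dc;
        if (r < 0 || r >= img->height || cc < 0 || cc >= img->width)
            continue;                       // out of bounds: drop this tap
        const uchar *p = (const uchar *)(img->imageData + r * img->widthStep);
        sum    += k[dr + 1][dc + 1] * p[cc * 3 + c];
        weight += k[dr + 1][dc + 1];        // track the normalisation actually used
    }
uchar result = (uchar)(sum / weight);       // weight is 13 for interior pixels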

And BTW, are you sure your kernel describes a blur? I would have expected the middle value to be the largest for a blur.
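
For comparison, the usual 3x3 binomial approximation of a Gaussian blur puts its largest weight in the centre and is normalised by 16:

// Common 3x3 Gaussian-like blur kernel; the weights sum to 16.
int gauss[3][3] = { { 1, 2, 1 },
                    { 2, 4, 2 },
                    { 1, 2, 1 } };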



Answer 2:

1) Filtering an image with a linear system is equivalent to what we call a convolution. It is really easy to implement: it is a term-by-term multiplication of your image with the kernel flipped about its centre (central symmetry). If your kernel is symmetric, as is the case for a Gaussian kernel, the flip changes nothing and it is a simple term-by-term multiplication.

You move your kernel over the image, and as you say there can be out-of-bounds accesses. Two options: either you don't filter the borders (which is what is mostly done), or you interpolate the missing values (that may be too difficult if you are just starting in image processing, so let's do something simple).

The algorithm will look like the code below. Notice that col and row start at 1 and end at width-1 and height-1 to avoid out-of-bounds accesses. For a larger kernel you would do the same with a margin of kernelSize/2, where kernelSize is an odd number (3x3, ..., 7x7, etc.).

double kernel[9] = {1.0/13, 2.0/13, 1.0/13, 2.0/13, 1.0/13, 2.0/13, 1.0/13, 2.0/13, 1.0/13}; // 1.0/13, not 1/13, which would be integer division and yield 0

for(row=1;row<img->height-1;row++)
{
     for(col=1;col<img->width-1;col++)  // was "row < img->width", a typo
     {
           // "out" is assumed to be a destination image of the same size as img;
           // writing the result back into img would let already-filtered pixels
           // leak into later neighbourhood reads. Also note that the row stride
           // of a packed 3-channel image is 3*width bytes, not width.
           out->data[3*(row*img->width + col) + 0] = ... ; //B Channel
           out->data[3*(row*img->width + col) + 1] = ... ; //G Channel
           out->data[3*(row*img->width + col) + 2] = ... ; //R Channel
      }
}

Instead of the ..., you either use a loop to compute the convolution and then replace the ... with the result of the loop (the most flexible solution if you sometimes want a larger kernel, e.g. 9x9), or you write it out by hand:

double convB = kernel[0]*img->data[3*((row-1)*img->width + (col-1)) + 0]
             + kernel[1]*img->data[3*((row-1)*img->width + (col  )) + 0]
             + kernel[2]*img->data[3*((row-1)*img->width + (col+1)) + 0]
             + kernel[3]*img->data[3*((row  )*img->width + (col-1)) + 0]
             + kernel[4]*img->data[3*((row  )*img->width + (col  )) + 0]
             + kernel[5]*img->data[3*((row  )*img->width + (col+1)) + 0]
             + kernel[6]*img->data[3*((row+1)*img->width + (col-1)) + 0]
             + kernel[7]*img->data[3*((row+1)*img->width + (col  )) + 0]
             + kernel[8]*img->data[3*((row+1)*img->width + (col+1)) + 0];

2) For the G and R channels you do the same with +1 and +2 instead of +0 inside the brackets [] to address the correct memory location.
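
If you prefer the loop form mentioned above (so the same code also works for larger kernels), a sketch could look like this; img, out, kernel, row and col are as in the snippets above, and img->data is assumed to be unsigned char*:

// Convolution at (row, col) for all three channels with the flat
// 9-element kernel. For a larger odd kernel, widen the loop bounds
// to kernelSize/2 and use the matching flat indexing.
for (int c = 0; c < 3; c++)               // c = 0: B, 1: G, 2: R
{
    double conv = 0.0;
    for (int dr = -1; dr <= 1; dr++)
        for (int dc = -1; dc <= 1; dc++)
            conv += kernel[(dr + 1) * 3 + (dc + 1)]
                  * img->data[3 * ((row + dr) * img->width + (col + dc)) + c];
    out->data[3 * (row * img->width + col) + c] = (unsigned char)conv;
}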

3) You should use a normalized kernel; that's why I used the values multiplied by 1/13 (13 being the sum of all the kernel values), if you want a normalized result on your image (with normalized intensities).



Tags: c++ opencv