I have been trying to write a program that applies a mean filter to images. I think I am close to getting it right, but there are still small flaws in the output. For instance:
Original racing: http://s72.photobucket.com/user/john_smith140/media/gp4_zpstafhejk5.jpg.html?filters[user]=139318132&filters[recent]=1&sort=1&o=2
Original Triangles: http://s72.photobucket.com/user/john_smith140/media/input_zpsz2cfhrc7.jpeg.html?filters[user]=139318132&filters[recent]=1&sort=1&o=3
Modified racing: http://s72.photobucket.com/user/john_smith140/media/racing_zpsmzmawjml.jpeg.html?filters[user]=139318132&filters[recent]=1&sort=1&o=0
Modified triangles: http://s72.photobucket.com/user/john_smith140/media/triangles_zpsaretjfid.jpeg.html?filters[user]=139318132&filters[recent]=1&sort=1&o=1
black background white dots, original: http://s72.photobucket.com/user/john_smith140/media/black%20background%20white%20dots_zpsuofaagnl.jpg.html?sort=3&o=2
black background white dots, same array: http://s72.photobucket.com/user/john_smith140/media/one%20array_zpswteno2eb.jpg.html?sort=3&o=1
black background white dots, different arrays: http://s72.photobucket.com/user/john_smith140/media/two%20array_zpskbyjg97o.jpg.html?sort=3&o=0
I can think of two possible causes for the flaws: the algorithm itself, or the conversion from char to float and then from float back to char.
The char-to-float conversion is necessary because the read function of ifstream reads char, and I need to multiply every value by 1/9, so it has to be floating point. I then convert back to char so the write function can write it out.
Some explanations about the algorithm. I start calculating the color value at the second pixel of the second row and proceed until the second-to-last pixel of the second-to-last row. That's because I am using a 3x3 kernel, so this way I don't go beyond the limits of the image (and hence of the char array in which I stored it). An image of 1024x768 has a size of 1024*768*3 bytes (3 color components). So the loop starts at position bytesPerPixel * image_width + bytesPerPixel, or 3*1024 + 3 = 3075, the 2nd pixel of the 2nd row. It then calculates the mean up to the 2nd-to-last pixel of the 2nd-to-last row, which should be:
imageSize - row_size - bytesPerPixel, or 1024*768*3 - 1024*3 - 3.
Within that interval it calculates the value at every position of the char array, which means each color channel of a pixel is computed from the same color channel of the surrounding pixels.
Here is the code:
int size2 = bpp*width;
float a1_9 = (1.0f/9.0f);
float tmp;
for (int i=size2+bpp; i<size-size2-bpp; i++) {
tmp = a1_9 * ((float) image [i-size2-bpp] + (float) image [i-size2] + (float) image [i-size2+bpp] + (float) image [i-bpp] + (float) image [i] + (float) image [i+bpp] + (float)image [i+size2-bpp] + (float) image [i+size2] + (float) image [i+size2+bpp]);
image [i] = char (tmp);
float temp = (float) image [i];
}
I printed the values for one iteration on the racing car screenshot, at position one million, and got this:
Image values are: -56 -57 -57 9 -43 -41 108 108 109
tmp it is: 8.88889
temp it is: 8
The values seem about right at first glance (doing the average by hand), so I don't have much of an idea of what's going wrong. Any help will be appreciated.
Two thoughts about your algorithm:
1.) You are using the RGB color space? Then why are your image values negative? Channel values should lie in 0..255, so treat the bytes as unsigned char rather than (signed) char. Also, you do not have to convert to float. Just add up the integer values and divide by 9. This is much faster, and since you cast back to char in the end anyway, the result should be the same.
2.) You overwrite your image in every iteration step. This means that in the filter pattern:
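To make point 1 concrete, here is a minimal sketch using the nine values printed in the question. It assumes plain char is signed on the asker's platform (which the negative values suggest) and that each byte is one 8-bit channel value. The function names are made up for illustration.

```cpp
#include <array>

// Average nine raw bytes while treating them as signed values (-128..127),
// which is what the code in the question effectively does.
int mean_of_bytes_signed(const std::array<signed char, 9>& bytes) {
    int sum = 0;
    for (signed char b : bytes) {
        sum += b;
    }
    return sum / 9;  // integer division truncates, like the float-to-char cast
}

// Average the same bytes after reinterpreting each one as unsigned (0..255).
int mean_of_bytes_unsigned(const std::array<signed char, 9>& bytes) {
    int sum = 0;
    for (signed char b : bytes) {
        sum += static_cast<unsigned char>(b);  // e.g. -56 becomes 200
    }
    return sum / 9;
}
```

With the values -56 -57 -57 9 -43 -41 108 108 109, the signed version gives 8 (the value the question reports), while the unsigned version gives 151, so a neighbourhood of mostly bright pixels ends up nearly black.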
-------
|1|2|3|
-------
|4|5|6|
-------
|7|8|9|
-------
the values 1, 2, 3 and 4 are already smoothed, so 5 is calculated from 5 unsmoothed and 4 smoothed pixels.
I'd suggest creating a new blank image and storing the result (temp) in the new image while reading from the original one - i.e. do not try to process "in-place" with the output image the same as the input image.
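A rough sketch of that suggestion, assuming interleaved channels with bpp bytes per pixel and unsigned 8-bit values. The function name and signature are hypothetical, and the one-pixel border is simply copied through unfiltered, as in the question.

```cpp
#include <vector>

// 3x3 mean filter that reads only from src and writes only to a separate buffer.
std::vector<unsigned char> mean_filter_copy(const std::vector<unsigned char>& src,
                                            int width, int height, int bpp) {
    std::vector<unsigned char> dst(src);  // start from a copy so the border keeps its values
    const int row = width * bpp;          // bytes per image row
    for (int y = 1; y < height - 1; ++y) {
        for (int x = bpp; x < row - bpp; ++x) {  // x is a byte offset inside the row
            const int i = y * row + x;
            int sum = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    // Stepping by bpp stays on the same color channel.
                    sum += src[i + dy * row + dx * bpp];
                }
            }
            dst[i] = static_cast<unsigned char>(sum / 9);
        }
    }
    return dst;
}
```

Because every sum is taken from src, a pixel that was smoothed earlier never feeds into a later one.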
I have implemented a 3x3 blur filter for the BGR image format.
Important notes:
- When filtering an image in BGR (or RGB) format, you must filter each color channel separately.
- When applying a filter (using convolution), it is important not to compute "in place": source elements must not be overwritten by destination elements before they have been read.
- Multiplying by (1/9) is much more efficient than dividing by 9.
- Adding 0.5 before the integer cast rounds a positive value to the nearest integer.
- The (1/9) scaling can be implemented in fixed point by expanding, scaling and shifting. Example: avg = (sum*scale + rounding) >> 15; [where scale = (1/9)*2^15 and rounding = 2^14].
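The fixed point note can be tried in isolation. The helper below is a small sketch (the name mean9_fixed is made up) that is only meant for the sums that occur here, 0 to 9*255.

```cpp
// Rounded division by 9 using a multiply, an add and a shift.
// Intended range: 0 <= sum <= 9 * 255.
int mean9_fixed(int sum) {
    const int scale = static_cast<int>((1.0 / 9.0) * (1 << 15) + 0.5);  // 1/9 scaled by 2^15, i.e. 3641
    const int half = 1 << 14;                                            // 0.5 scaled by 2^15
    return (sum * scale + half) >> 15;
}
```

Over that range the result agrees with ordinary rounded division, which can be written in integers as (sum + 4) / 9.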
I assumed pixel order in memory is b,g,r,b,g,r... (b is in byte 0).
When filtering an image in BGR (or RGB) format, you must filter each color channel separately!
Illustration of color separation:
bgrbgrbgr
bgrbgrbgr
bgrbgrbgr
Separate to:
bbb ggg rrr
bbb ggg rrr
bbb ggg rrr
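As a sketch of that separation (not part of the implementation in this answer, which works directly on the interleaved data), a single channel can be pulled out of a b,g,r,b,g,r... buffer by stepping three bytes at a time. The function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Copy channel c (0 = blue, 1 = green, 2 = red) out of an interleaved BGR buffer.
std::vector<unsigned char> extract_channel(const std::vector<unsigned char>& bgr, int c) {
    std::vector<unsigned char> plane;
    plane.reserve(bgr.size() / 3);
    for (std::size_t i = static_cast<std::size_t>(c); i < bgr.size(); i += 3) {
        plane.push_back(bgr[i]);
    }
    return plane;
}
```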
The blur filter kernel is:
1/9, 1/9, 1/9
1/9, 1/9, 1/9
1/9, 1/9, 1/9
As mentioned, the filter should be applied to blue, green and red separately.
You can sum each 3x3 block of pixels and divide the result by 9 (or better, multiply by 1/9 like you did).
I used a fixed point implementation of the multiplication by 1/9, instead of converting to float.
The fixed point implementation is used to improve performance (not so important here, because the code is not optimized).
I wanted to demonstrate the technique...
I used some code duplication in order to emphasize the RGB color separation.
I also handled the image boundaries (using symmetric mirroring, which for a 3x3 kernel amounts to repeating the edge pixels).
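The boundary handling boils down to clamping each neighbour coordinate into the valid range, which is what the x0/x2 and I0/I2 special cases in the code do. A tiny sketch with a made-up helper name:

```cpp
#include <algorithm>

// Map a neighbour coordinate onto a valid one by repeating the edge pixel:
// -1 becomes 0, and size becomes size - 1.
int clamp_coord(int v, int size) {
    return std::max(0, std::min(v, size - 1));
}
```

For example, with image_width = 5 the left neighbour of column 0 is column 0 itself, and the right neighbour of column 4 is column 4.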
Here is my code:
//Filter image I with filter kernel:
// 1/9 1/9 1/9
// 1/9 1/9 1/9
// 1/9 1/9 1/9
//I - Input image in pixel ordered BGR format
//image_width - Number of columns of I
//image_height - Number of rows of I
//J - Destination "smoothed" image in BGR format.
//I and J are in pixel ordered BGR color format (size in bytes is image_width*image_height*3):
//BGRBGRBGRBGRBGR
//BGRBGRBGRBGRBGR
//BGRBGRBGRBGRBGR
//BGRBGRBGRBGRBGR
//
//Limitations:
//I and J must be two separate arrays (in place computation is not supported).
void BgrSmoothing(const unsigned char I[],
int image_width,
int image_height,
unsigned char J[])
{
const int scale = (int)((1.0/9.0)*(1 << 15) + 0.5); //1/9 expanded by 2^15 (add 0.5 for rounding).
const int rounding_ofs = (1 << 14); //0.5 expanded by 2^15 (equals 2^14)
int x, y;
const unsigned char *I0; //Points to the beginning of row y-1 (in source image I).
const unsigned char *I1; //Points to the beginning of row y (in source image I).
const unsigned char *I2; //Points to the beginning of row y+1 (in source image I).
int x0, x1, x2; //x0 - pixel to the left of x1, x1 - center, x2 - pixel to the right of x1
unsigned char *J1; //Points to the beginning of row y (in destination image J).
//3x3 source blue pixels, 3x3 source green pixels, 3x3 source red pixels.
unsigned char b00, b01, b02, g00, g01, g02, r00, r01, r02;
unsigned char b10, b11, b12, g10, g11, g12, r10, r11, r12;
unsigned char b20, b21, b22, g20, g21, g22, r20, r21, r22;
unsigned char b, g, r; //Destination blue, green and red pixels.
for (y = 0; y < image_height; y++)
{
if (y == 0)
I0 = I; //Handle first row: use row 0 instead of row -1 (row -1 exceeds image bounds).
else
I0 = &I[(y-1)*image_width*3]; //Pointer to beginning of source row above row y in image I.
I1 = &I[y*image_width*3]; //Pointer to beginning of source row y in image I.
if (y == image_height-1)
I2 = &I[y*image_width*3]; //Handle last row: use row image_height-1 instead of row image_height (row image_height exceeds image bounds).
else
I2 = &I[(y+1)*image_width*3]; //Pointer to beginning of source row below row y in image I.
J1 = &J[y*image_width*3]; //Pointer to beginning of destination row in image J.
//Process every pixel in row y:
for (x = 0; x < image_width; x++)
{
//Multiply x by 3, to convert pixel index to byte index (in BGR format each pixel is 3 bytes).
x0 = (x == 0) ? (0) : (x-1)*3; //Handle x0 coordinate of first pixel in the row.
x1 = x*3;
x2 = (x == image_width-1) ? (image_width-1)*3 : (x+1)*3; //Handle x2 coordinate of last pixel in the row.
//Load 3x3 blue pixels:
b00 = I0[x0]; b01 = I0[x1]; b02 = I0[x2];
b10 = I1[x0]; b11 = I1[x1]; b12 = I1[x2];
b20 = I2[x0]; b21 = I2[x1]; b22 = I2[x2];
//Load 3x3 green pixels:
g00 = I0[x0+1]; g01 = I0[x1+1]; g02 = I0[x2+1];
g10 = I1[x0+1]; g11 = I1[x1+1]; g12 = I1[x2+1];
g20 = I2[x0+1]; g21 = I2[x1+1]; g22 = I2[x2+1];
//Load 3x3 red pixels:
r00 = I0[x0+2]; r01 = I0[x1+2]; r02 = I0[x2+2];
r10 = I1[x0+2]; r11 = I1[x1+2]; r12 = I1[x2+2];
r20 = I2[x0+2]; r21 = I2[x1+2]; r22 = I2[x2+2];
//Sum all 9 blue elements, all 9 green elements and all 9 red elements (convert to int to avoid overflow).
int sum_b = (int)b00+(int)b01+(int)b02+(int)b10+(int)b11+(int)b12+(int)b20+(int)b21+(int)b22;
int sum_g = (int)g00+(int)g01+(int)g02+(int)g10+(int)g11+(int)g12+(int)g20+(int)g21+(int)g22;
int sum_r = (int)r00+(int)r01+(int)r02+(int)r10+(int)r11+(int)r12+(int)r20+(int)r21+(int)r22;
//b = round(sum_b*(1/9)).
//Because b is positive, round(b) = floor(b+0.5).
//Use following computation instead: b = floor((sum_b*(1.0/9.0)*2^15 + 2^14) / 2^15)
b = (unsigned char)((sum_b*scale + rounding_ofs) >> 15); //Destination blue pixel.
g = (unsigned char)((sum_g*scale + rounding_ofs) >> 15); //Destination green pixel.
r = (unsigned char)((sum_r*scale + rounding_ofs) >> 15); //Destination red pixel.
//Store b,g,r elements to destination row J1.
J1[x1] = b;
J1[x1+1] = g;
J1[x1+2] = r;
}
}
}
Input image:
Output image: