I'm a bit stuck on designing a color detection system - I can't quite figure out a way to do it easily.
-
Basically, I have a library of images that I want to sort by color. So if the user specifies 'sort by blue', then the most blue images will appear at the top of the results, with the least blue appearing at the bottom.
The problem is that the images aren't all one color, so the system has to do two things at the same time:
1 - finding the bluest part of the image
2 - ranking this blue color (based on color hue and amount of this color).
I've tried about 3 or 4 different approaches, with varying results - none work well though, and 2 of these were quite mathematical algorithms (which all work much better on paper than in practice haha).
-
What different ways could I go about the whole process? I'm probably missing some really obvious ways it could work - any help or ideas would be much appreciated :)
-
EDIT: Thanks for all the responses - here's what I've tried so far:
1 - Getting the average RGB value for the whole image and comparing it to blue. The comparison was done by treating the normalised RGB values as vectors in 3-space and finding the distance between them. This works the least well: an image with no blue could easily appear above an image that is partly very strong blue.
2 - Finding the dominant color and comparing it to blue (again using 3-space vector distances). This didn't work because there might be a large blue section of the image that isn't the most dominant color section, or even among the top few.
3 - Finding the pixels that are close to blue, averaging all of these and comparing the result to actual blue.
4 - Finding all the pixels that are close to blue, incrementing a count for each, and computing a percentage as count / total pixels.
Do you really need to find the bluest part of the image? Why not just rank the "blueness" of an image as the average blue-component value for all pixels?
Another possibility would be to find the density of pixels that pass a threshold, or minimum blue value necessary to qualify as a blue pixel.
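Here's a rough sketch of both ideas (average blue component, and density of pixels passing a threshold). It assumes the image has already been loaded as a list of (r, g, b) tuples in the 0-255 range. The function names, the min_blue threshold and the extra margin check (blue must beat red and green by some amount) are my own assumptions, not something from the suggestion itself.

```python
def mean_blue_channel(pixels):
    """Average of the raw blue component (0-255) over all pixels."""
    return sum(b for _, _, b in pixels) / len(pixels)


def blue_pixel_density(pixels, min_blue=128, margin=40):
    """Fraction of pixels that qualify as 'blue'.

    A pixel qualifies when its blue value is at least min_blue AND blue
    exceeds both red and green by at least margin. The margin test is an
    added assumption so that white and light gray do not count as blue.
    """
    hits = sum(
        1 for r, g, b in pixels
        if b >= min_blue and b - max(r, g) >= margin
    )
    return hits / len(pixels)


pixels = [(0, 0, 255), (255, 255, 255), (200, 30, 30), (10, 20, 200)]
print(mean_blue_channel(pixels))   # 185.0
print(blue_pixel_density(pixels))  # 0.5
```

Notice that the white pixel pulls the plain average up just as much as the pure blue one does, which is the weakness of looking at the blue component alone.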
Two thoughts come to mind:
Cheap version: convert images to HSV color space, and for each pixel compute cos(H - target_hue) or a reasonable approximation (for blue, target_hue would be 240 degrees), multiply by saturation, and average that quantity over all of the pixels in the image. High values are best. Note that colors that are closer to yellow than to blue have "negative blueness", and that black, white, and pure gray have equally "zero blueness". Note that you really want HSV, not HSL, in this situation, because the "S" in HSL doesn't map well to perceptual saturation. For example, the color #f8f8ff (RGB 248, 248, 255) has a saturation of 100% in HSL (i.e. a pure blue), but it looks nearly white. The same color in HSV has an "S" coordinate of only 3%, which is reasonable.
Less cheap version: convert images to CIELAB color space, discard L, and compute the distance in a*b* space between each pixel and the target color, then average or RMS over each pixel. Low values are best.
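A minimal sketch of both versions, assuming pixels are (r, g, b) tuples of 8-bit sRGB values. The HSV part uses the standard library's colorsys. The CIELAB part is a hand-written conversion using the usual sRGB/D65 constants rounded to four decimals, so treat it as an approximation rather than a reference implementation. All function names are made up for illustration.

```python
import colorsys
import math


def hsv_blueness(pixels, target_hue_deg=240.0):
    """Mean of cos(H - target_hue) * S over all pixels. Higher is better."""
    target = math.radians(target_hue_deg)
    total = 0.0
    for r, g, b in pixels:
        # colorsys returns hue as a fraction of a full turn in [0, 1).
        h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total += math.cos(2 * math.pi * h - target) * s
    return total / len(pixels)


def _srgb_to_linear(c):
    """Undo the sRGB transfer curve for one 8-bit channel."""
    c = c / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _lab_f(t):
    """Piecewise cube-root function from the CIELAB definition."""
    d = 6 / 29
    return t ** (1 / 3) if t > d ** 3 else t / (3 * d * d) + 4 / 29


def rgb_to_ab(r, g, b):
    """Return only the (a*, b*) coordinates; L* is discarded."""
    rl, gl, bl = (_srgb_to_linear(c) for c in (r, g, b))
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl
    # Normalise by the D65 reference white.
    fx, fy, fz = _lab_f(x / 0.95047), _lab_f(y), _lab_f(z / 1.08883)
    return 500 * (fx - fy), 200 * (fy - fz)


def mean_ab_distance(pixels, target_rgb=(0, 0, 255)):
    """Mean a*b* distance from each pixel to the target. Lower is better."""
    ta, tb = rgb_to_ab(*target_rgb)
    total = 0.0
    for p in pixels:
        a, b = rgb_to_ab(*p)
        total += math.hypot(a - ta, b - tb)
    return total / len(pixels)


blue, yellow, white = [(0, 0, 255)], [(255, 255, 0)], [(255, 255, 255)]
print(hsv_blueness(blue), hsv_blueness(white), hsv_blueness(yellow))
print(mean_ab_distance(blue), mean_ab_distance(white), mean_ab_distance(yellow))
```

To sort a library, compute one of these scores per image and sort descending on hsv_blueness or ascending on mean_ab_distance.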
I think to measure "blueness" you'll need to take all three components into account, not just the blue. Just for example, [255,255,255] is pure white, not blue -- but [0, 0, 30] is pure blue, even though its blue component is much lower in value.
Alternatively, you could convert to something like HSL or HSV, in which case the "blueness" should be a bit simpler to measure (hue and saturation only).
I'd google for an algorithm for creating 256-colour palettes from 24-bit images (see http://en.wikipedia.org/wiki/Color_quantization for more info), then see which colours in this palette dominate when the image is mapped to it, i.e. keep a running tally, for each of the 256 palette entries, of how many pixels get mapped to it.
Note that you of course don't need the whole 256; I only say 256 to help explain my thinking. Also, studying the palette-generation algorithm itself might directly give you an answer.
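To illustrate the tallying step, here is a sketch that swaps the adaptive palette for a much simpler fixed one: each channel is cut into a few equal bands, so every pixel maps to one of levels^3 palette entries. That is plain uniform quantization, not one of the real palette-generation algorithms from the linked article. The names and the rule for which entries count as "blue" are assumptions for illustration.

```python
from collections import Counter


def palette_tally(pixels, levels=4):
    """Count how many pixels fall into each entry of a uniform palette.

    Each channel is split into `levels` equal bands, so a palette entry
    is a tuple of band indices (r_band, g_band, b_band).
    """
    step = 256 // levels

    def band(c):
        return min(c // step, levels - 1)

    return Counter((band(r), band(g), band(b)) for r, g, b in pixels)


def blue_share(pixels, levels=4):
    """Fraction of pixels in entries with top-band blue and low red/green."""
    tally = palette_tally(pixels, levels)
    top = levels - 1
    hits = sum(
        n for (r, g, b), n in tally.items()
        if b == top and r < levels / 2 and g < levels / 2
    )
    return hits / len(pixels)


pixels = [(0, 0, 255)] * 3 + [(255, 255, 255), (10, 20, 200)]
tally = palette_tally(pixels)
print(tally.most_common(1))  # [((0, 0, 3), 4)]
print(blue_share(pixels))    # 0.8
```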
I would say to take the average of the RGB value itself over the whole picture. I would say that the pseudocode below should give you the "average blue" of the picture.
If this doesn't work out, then I would think that you would need to rank a "blue" pixel as being higher/lower weighted based on the G/B values. Then add up your weighted value(s) and compare those.
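A minimal sketch of how that could look, assuming pixels are (r, g, b) tuples in the 0-255 range. The weighting is not specified above, so the rule used here (blue only counts by how far it exceeds the mean of red and green) is a made-up example of one possible weighting, and the function names are hypothetical.

```python
def mean_rgb(pixels):
    """Average (r, g, b) over the whole picture."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def weighted_blue_score(pixels):
    """Average 'excess blue' per pixel.

    Hypothetical weighting: a pixel contributes b - (r + g) / 2, clamped
    at zero, so white and gray pixels add nothing to the score.
    """
    total = sum(max(0.0, b - (r + g) / 2) for r, g, b in pixels)
    return total / len(pixels)


pixels = [(0, 0, 255), (255, 255, 255)]
print(mean_rgb(pixels))             # (127.5, 127.5, 255.0)
print(weighted_blue_score(pixels))  # 127.5
```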
If you have one pixel, I'd say its blueness in terms of RGB is the value of B / (R + G + B), so 1 is totally blue, 0 is not blue at all, and white is 1/3 blue. (Watch out for black, which is a special case because the denominator is zero.) The blueness of an image is then the average blueness of its pixels. And if that's too costly, just take the average over a fixed number of randomly chosen pixels.
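Something like this, as a rough sketch. It assumes pixels are (r, g, b) tuples in a list, and it treats black as 0 blueness, which is just one way to handle the special case. The function names and the sample_size and seed parameters are made up for the example.

```python
import random


def pixel_blueness(r, g, b):
    """B / (R + G + B): 1 is totally blue, 0 is not blue at all."""
    total = r + g + b
    if total == 0:
        return 0.0  # black: avoid dividing by zero (assumed to be "not blue")
    return b / total


def image_blueness(pixels, sample_size=None, seed=None):
    """Average pixel blueness, optionally over a random sample of pixels."""
    if sample_size is not None and sample_size < len(pixels):
        pixels = random.Random(seed).sample(pixels, sample_size)
    return sum(pixel_blueness(*p) for p in pixels) / len(pixels)


print(pixel_blueness(0, 0, 255))                    # 1.0
print(pixel_blueness(0, 0, 0))                      # 0.0
print(image_blueness([(0, 0, 255), (0, 0, 0)]))     # 0.5
```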