I am somewhat at the limits of my knowledge here, but I read a paper and have worked out a way to calculate image entropy with ImageMagick - some clever person might like to check it!
#!/bin/bash
image=$1
# Get number of pixels in image
px=$(convert -format "%w*%h\n" "$image" info: | bc)
# Calculate entropy from the 8-bit greyscale histogram
# See this paper: www1.idc.ac.il/toky/imageProc-10/Lectures/04_histogram_10.ppt
# Note: awk's log() is the natural logarithm, so the result is in nats;
# divide by log(2) at the end if you prefer entropy in bits.
convert "$image" -colorspace gray -depth 8 -format "%c" histogram:info:- | \
awk -F: -v px=$px '{p=$1/px; e+=-p*log(p)} END {print e}'
So, you would save the script above as entropy, then do the following once to make it executable:
chmod +x entropy
Then you can use it like this:
entropy image.jpg
It does seem to produce bigger numbers for true photos and lower numbers for computer graphics.
Another idea would be to look at the inter-channel correlation. Normally, on digital photos, the different wavelengths of light are quite strongly correlated with each other, so if the red component increases the green and the blue components tend to also increase, but if the red component decreases, both the green and the blue tend to also decrease. If you compare that to computer graphics, people tend to do their graphics with big bold primary colours, so a big red bar-graph or pie-chart graphic will not tend to be at all correlated between the channels. I took a digital photo of a landscape and resized it to be 1 pixel wide and 64 pixels high, and I am showing it using ImageMagick below - you will see that where red goes down so do green and blue...
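If you want to try that experiment yourself, something along these lines should work (just a sketch - photo.jpg stands in for whichever photo you use, and the ! forces the exact 1x64 geometry regardless of aspect ratio):
convert photo.jpg -resize 1x64! -depth 8 txt:
Each line of output gives one pixel's (R,G,B) values, so you can watch the three channels rising and falling together down the column.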
Statistically, this is the covariance. I would tend to want to use red and green channels of a photo to evaluate this - because in a Bayer grid there are two green sites for each single red and blue site, so the green channel is averaged across the two and therefore least susceptible to noise. The blue is most susceptible to noise. So the code for measuring the covariance can be written like this:
#!/bin/bash
# Calculate Red-Green covariance of the image supplied as a parameter
image=$1
convert "$image" -depth 8 txt: | awk '
  # Skip the "# ImageMagick pixel enumeration..." header line
  /^#/ { next }
  # Pixel lines look like: 0,0: (63,114,160)  #3F72A0  srgb(63,114,160)
  {
    split($2, a, ",")
    sub(/\(/, "", a[1]); R[++n] = a[1]
    G[n] = a[2]
    # sub(/\)/, "", a[3]); B[n] = a[3]
  }
  END {
    # Calculate mean of R and G (and optionally B)
    for (i = 1; i <= n; i++) {
      Rmean += R[i]
      Gmean += G[i]
      # Bmean += B[i]
    }
    Rmean /= n
    Gmean /= n
    # Bmean /= n
    # Calculate Green-Red (and optionally Green-Blue) covariance
    for (i = 1; i <= n; i++) {
      GRcov += (G[i] - Gmean) * (R[i] - Rmean)
      # GBcov += (G[i] - Gmean) * (B[i] - Bmean)
    }
    GRcov /= n
    # GBcov /= n
    print "Green-Red covariance: ", GRcov
    # print "Green-Blue covariance: ", GBcov
  }'
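You would use it in the same way as the entropy script, for example saving it as covariance (any name will do), making it executable once, and then running it on an image:
chmod +x covariance
covariance image.jpg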
I did some testing and that also works quite well. However, graphics with big white or black backgrounds appear to be well correlated too, because red=green=blue on white, black and all grey-toned areas, so you would need to be careful of them. That leads to another thought, though: photos almost never contain pure white or pure black (unless really badly exposed), whereas graphics very often have white backgrounds, so another test you could use would be to count the solid black and white pixels.
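One way of doing that is to reuse the greyscale histogram from the entropy script and pull out the pure black and pure white entries. This is just a sketch rather than a definitive command - it matches on the 8-bit hex codes because the colour names printed in the histogram (black/white versus gray(0)/gray(255)) vary between ImageMagick versions:
#!/bin/bash
# Count pure black and pure white pixels in the image supplied as a parameter
image=$1
convert "$image" -colorspace gray -depth 8 -format "%c" histogram:info:- | grep -E "#000000|#FFFFFF"
The count in the first column of each matching line is the number of black or white pixels; dividing by the total pixel count (as in the entropy script) gives a proportion you can threshold.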
This one has 2 black and 537 pure white pixels.
I should imagine you probably have enough for a decent heuristic now!
Following on from my comment, you can use these ImageMagick commands:
Other parameters may be suggested by other responders, and you can find most of that using:
Compute the entropy of the image. Artificial images usually have much lower entropy than photographs.
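If your ImageMagick build supports the %[entropy] image statistic (recent versions should, although I have not checked exactly when it was added), you can also get an entropy figure directly without a script - note that it is normalised differently, so the absolute values will not match the script above:
identify -format "%[entropy]\n" image.jpg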