Here's what I would like to do:
I'm taking pictures with a webcam at regular intervals. Sort of like a time lapse thing. However, if nothing has really changed, that is, the picture pretty much looks the same, I don't want to store the latest snapshot.
I imagine there's some way of quantifying the difference, and I would have to empirically determine a threshold.
I'm looking for simplicity rather than perfection. I'm using Python.
I am addressing specifically the question of how to compute if they are "different enough". I assume you can figure out how to subtract the pixels one by one.
First, I would take a bunch of images with nothing changing, and find out the maximum amount that any pixel changes just because of variations in the capture, noise in the imaging system, JPEG compression artifacts, and moment-to-moment changes in lighting. Perhaps you'll find that 1 or 2 bit differences are to be expected even when nothing moves.
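One way to measure that baseline is sketched below. It assumes the frames are already loaded as equally sized 8-bit NumPy arrays and compares each frame with the one captured just before it; the function name `noise_floor` is made up for illustration.

```python
import numpy as np

def noise_floor(frames):
    """Return the largest per-pixel change between consecutive frames.

    `frames` is a sequence of equally sized uint8 arrays captured while
    nothing in the scene is moving.
    """
    worst = 0
    for prev, cur in zip(frames, frames[1:]):
        # Widen to a signed type so the subtraction cannot wrap around.
        delta = np.abs(cur.astype(np.int16) - prev.astype(np.int16))
        worst = max(worst, int(delta.max()))
    return worst
```

Whatever value this reports for a static scene is a reasonable lower bound for the per-pixel tolerance used in the real test.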
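Then for the "real" test, you want a criterion like this: the images count as the same if no pixel differs by more than E (expressed as a fraction of the full value range) and no more than P pixels differ at all.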
So, perhaps, if E = 0.02, P = 1000, that would mean (approximately) that it would be "different" if any single pixel changes by more than ~5 units (assuming 8-bit images), or if more than 1000 pixels had any errors at all.
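A rough sketch of that test with NumPy follows, using the reading given in the paragraph above: E is a fraction of the full 8-bit range and P is a pixel count. The function name, the `full_scale` parameter, and the choice to take the worst channel per pixel for color images are assumptions made for illustration, not part of idiff.

```python
import numpy as np

def is_different(img_a, img_b, E=0.02, P=1000, full_scale=255):
    """Return True if the two 8-bit images differ beyond the thresholds."""
    # Widen to a signed type so the subtraction cannot wrap around.
    delta = np.abs(img_a.astype(np.int16) - img_b.astype(np.int16))
    if delta.ndim == 3:
        # Color image: judge each pixel by its worst channel (an assumption).
        delta = delta.max(axis=2)
    too_large = delta.max() > E * full_scale   # some pixel changed too much
    too_many = np.count_nonzero(delta) > P     # too many pixels changed at all
    return bool(too_large or too_many)
```

In practice you might count only pixels that change by more than the noise level you measured, rather than any nonzero change.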
This is intended mainly as a good "triage" technique to quickly identify images that are close enough to not need further examination. The images that "fail" may then move on to a more elaborate/expensive technique that wouldn't have false positives if the camera shook a bit, for example, or that was more robust to lighting changes.
I run an open source project, OpenImageIO, that contains a utility called "idiff" that compares differences with thresholds like this (even more elaborate, actually). Even if you don't want to use this software, you may want to look at the source to see how we did it. It's used commercially quite a bit. This thresholding technique was developed so that we could have a test suite for rendering and image processing software, with "reference images" that might have small differences from platform to platform or as we made minor tweaks to the algorithms, so we wanted a "match within tolerance" operation.
You can compute the histograms of both images and then calculate the Bhattacharyya coefficient. This is a very fast algorithm, and I have used it to detect shot changes in a cricket video (in C using OpenCV).
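For reference, the coefficient is the sum over all bins of sqrt(p_i * q_i), where p and q are the two histograms normalized to sum to 1. Here is a minimal pure-Python sketch; the function name is made up, and the histograms are assumed to be plain sequences of counts with the same number of bins.

```python
import math

def bhattacharyya_coefficient(hist_a, hist_b):
    """Return a similarity in [0, 1]; 1.0 means identical distributions."""
    total_a = sum(hist_a)
    total_b = sum(hist_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("histograms must not be empty")
    # Normalize each bin to a probability, then sum sqrt(p_i * q_i).
    return sum(math.sqrt((a / total_a) * (b / total_b))
               for a, b in zip(hist_a, hist_b))
```

With Pillow, `Image.histogram()` returns such a list of counts, so two snapshots could be compared by passing their histograms to this function and keeping the new frame when the result drops below a threshold you pick empirically.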
A simple solution:
Encode the image as a JPEG and look for a substantial change in file size.
I've implemented something similar with video thumbnails, and had a lot of success and scalability.
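A sketch of the idea using Pillow, encoding in memory so nothing has to be written to disk. The function names, the quality setting, and the 5% relative threshold are placeholders to tune, not values from my implementation.

```python
import io
from PIL import Image

def jpeg_size(img, quality=75):
    """Return the size in bytes of `img` encoded as a JPEG in memory."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    return len(buf.getvalue())

def size_changed(prev, cur, rel_threshold=0.05):
    """Return True if the encoded sizes differ by more than the threshold."""
    a, b = jpeg_size(prev), jpeg_size(cur)
    return abs(a - b) / max(a, b) > rel_threshold
```

Keep in mind that two quite different scenes can happen to compress to similar sizes, so this is a cheap first filter rather than a guarantee.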
You can compare two images using functions from PIL.
The diff object is an image in which every pixel is the result of the subtraction of the color values of that pixel in the second image from the first image. Using the diff image you can do several things. The simplest one is the diff.getbbox() function. It will tell you the minimal rectangle that contains all the changes between your two images.

You can probably implement approximations of the other stuff mentioned here using functions from PIL as well.
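A minimal sketch of this approach with Pillow follows. It assumes the diff image comes from `ImageChops.difference`, which stores the absolute per-channel difference; the wrapper name `changed_region` is made up for illustration.

```python
from PIL import Image, ImageChops

def changed_region(im1, im2):
    """Return the bounding box (left, upper, right, lower) of all changed
    pixels, or None if the two images are identical."""
    # Convert both to RGB so the images share a mode and every channel counts.
    diff = ImageChops.difference(im1.convert("RGB"), im2.convert("RGB"))
    return diff.getbbox()
```

For the webcam case, a result of `None` means nothing changed at all. With real camera noise you will rarely get that, so you may prefer to threshold the diff image first or compare the size of the box against a minimum area.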
I had a similar problem at work: I was rewriting our image transform endpoint and wanted to check that the new version was producing the same or nearly the same output as the old version. So I wrote this:
https://github.com/nicolashahn/diffimg
It operates on images of the same size and, at a per-pixel level, measures the difference in values at each channel (R, G, B, and optionally A), takes the average difference of those channels, then averages the difference over all pixels and returns a ratio.
For example, with a 10x10 image of black pixels, and the same image but with one pixel changed to red, the difference at that pixel is 1/3 or 0.33... (RGB 0,0,0 vs 255,0,0) and at all other pixels is 0. With 100 pixels total, 0.33.../100 = a ~0.33% difference in image.
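The arithmetic in that example (RGB 0,0,0 vs 255,0,0 at a single pixel out of 100) can be reproduced with a few lines of NumPy. This is an illustrative sketch of the same averaging, not diffimg's actual code or API; the function name `diff_ratio` is made up, and the inputs are assumed to be 8-bit arrays of identical shape.

```python
import numpy as np

def diff_ratio(img_a, img_b):
    """Return the mean absolute channel difference as a ratio in [0, 1]."""
    if img_a.shape != img_b.shape:
        raise ValueError("images must have the same shape")
    # Widen to a signed type so the subtraction cannot wrap around.
    delta = np.abs(img_a.astype(np.int16) - img_b.astype(np.int16))
    # Averaging over every channel of every pixel is the same as averaging
    # the per-pixel channel means, since each pixel has the same channels.
    return float(delta.mean() / 255.0)

black = np.zeros((10, 10, 3), dtype=np.uint8)
one_red = black.copy()
one_red[0, 0] = (255, 0, 0)
ratio = diff_ratio(black, one_red)  # 1/300, i.e. about 0.33%
```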
I believe this would work perfectly for OP's project (I realize this is a very old post now, but posting for future StackOverflowers who also want to compare images in python).
What about calculating the Manhattan distance of the two images? That gives you n*n values. Then you could do something like a row average to reduce to n values and a function over that to get one single value.
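One possible reading of this, sketched with NumPy for two n x n grayscale 8-bit images: take the absolute per-pixel differences, average each row, then reduce the row averages to one number. The final reduction is not specified above, so using the maximum here is purely an assumption, as is the function name.

```python
import numpy as np

def manhattan_score(img_a, img_b):
    """Reduce per-pixel absolute differences to a single score."""
    # n x n absolute differences (widened so the subtraction cannot wrap).
    delta = np.abs(img_a.astype(np.int16) - img_b.astype(np.int16))
    row_means = delta.mean(axis=1)   # n values, one per row
    return float(row_means.max())    # single value: the most-changed row
```

Using the maximum makes the score sensitive to a change confined to a few rows; the mean of the row averages would instead give the overall average difference.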