How do you detect the location of an image within a larger image? I have an unmodified copy of the image. This image is then changed to an arbitrary resolution and placed randomly within a much larger image which is of an arbitrary size. No other transformations are conducted on the resulting image. Python code would be ideal, and it would probably require libgd. If you know of a good approach to this problem you'll get a +1.
相关问题
- Views base64 encoded blob in HTML with PHP
- how to define constructor for Python's new Nam
- streaming md5sum of contents of a large remote tar
- How to get the background from multiple images by
- Evil ctypes hack in python
Take a look at Scale-Invariant Feature Transforms; there are many different flavors that may be more or less tailored to the type of images you happen to be working with.
There is a quick and dirty solution, and that's simply sliding a window over the target image and computing some measure of similarity at each location, then picking the location with the highest similarity. Then you compare the similarity to a threshold, if the score is above the threshold, you conclude the image is there and that's the location; if the score is below the threshold, then the image isn't there.
As a similarity measure, you can use normalized correlation or sum of squared differences (aka L2 norm). As people mentioned, this will not deal with scale changes. So you also rescale your original image multiple times and repeat the process above with each scaled version. Depending on the size of your input image and the range of possible scales, this may be good enough, and it's easy to implement.
A proper solution is to use affine invariants. Try looking up "wide-baseline stereo matching", people looked at that problem in that context. The methods that are used are generally something like this:
Preprocessing of the original image
At the end of this stage, you will have a set of descriptors.
Testing (with the new test image).
You probably want cross-correlation. (Autocorrelation is correlating a signal with itself; cross correlating is correlating two different signals.)
What correlation does for you, over simply checking for exact matches, is that it will tell you where the best matches are, and how good they are. Flip side is that, for a 2-D picture, it's something like O(N^3), and it's not that simple an algorithm. But it's magic once you get it to work.
EDIT: Aargh, you specified an arbitrary resize. That's going to break any correlation-based algorithm. Sorry, you're outside my experience now and SO won't let me delete this answer.
http://en.wikipedia.org/wiki/Autocorrelation is my first instinct.