I am trying to make an automation tool and am experimenting with a type of recording that takes screenshots and records user inputs. The idea is for the user to take a snapshot and highlight a rectangle on it around, say, the "submit" button. During playback, the program would take a screenshot of the open window and find the coordinates of the button by searching for the snapshot. So I need an algorithm to search an image for an exact (or very close) copy of the button image. The algorithms I've found so far compare the likeness of two whole images but cannot locate one image as a subimage of another, and algorithms for object recognition seem a bit over the top considering the "object" I'm trying to find will be a near-perfect match. Any ideas?
What you need is an efficient feature extraction method. This will depend on what you're looking for, but let's assume you're looking for the Send button in this image:
One of the characteristic features of this button is that it includes a pair of parallel line segments at the top and bottom. The same applies to the two text input fields, but for the button, the offset between the two lines is exactly 17 pixels.
This is what you get if you take the per-pixel maximum of the source image and a copy of itself shifted vertically by 17 pixels:
The Send button now appears as a solid horizontal line. You can detect this quite easily by thresholding the image and looking for an unbroken sequence of black pixels. Just for reference, here's what I obtained after applying a 10px horizontal motion blur and thresholding at a grey level of 128:
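As a rough sketch of the shift-and-maximum step, assuming the screenshot has already been converted to a grayscale grid (a list of rows of 0-255 values) and that the button borders are dark on a light background. The names `shift_max` and `dark_runs` and the synthetic test image are made up for illustration, and the motion blur is left out:

```python
def shift_max(gray, offset):
    """Per-pixel maximum of the image and a copy shifted up by `offset` rows."""
    return [
        [max(a, b) for a, b in zip(gray[y], gray[y + offset])]
        for y in range(len(gray) - offset)
    ]

def dark_runs(gray, threshold=128, min_length=20):
    """Return (row, start_col, length) for each horizontal run of dark pixels."""
    runs = []
    for y, row in enumerate(gray):
        start = None
        # The appended sentinel is not below the threshold, so it closes a trailing run.
        for x, value in enumerate(row + [threshold]):
            if value < threshold:
                if start is None:
                    start = x
            elif start is not None:
                if x - start >= min_length:
                    runs.append((y, start, x - start))
                start = None
    return runs

# Synthetic 30x40 white image (assumption, not the screenshot from the answer).
gray = [[255] * 40 for _ in range(30)]
for x in range(10, 30):          # "button": dark lines 17 rows apart
    gray[5][x] = 0
    gray[22][x] = 0
for x in range(5, 35):           # "text field": dark lines 10 rows apart
    gray[2][x] = 0
    gray[12][x] = 0

candidates = dark_runs(shift_max(gray, 17), threshold=128, min_length=15)
print(candidates)
```

Only line pairs that are exactly 17 rows apart stay dark after the maximum, so the text field drops out and the remaining run marks the top edge of the button.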
This process will identify candidate positions quite quickly. You can then subject these locations to stronger techniques like 2D convolution and OCR without too much loss of performance.
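One way to verify the candidates is to score each one against the recorded button image. The sketch below uses zero-mean normalized cross-correlation as a stand-in for the "2D convolution" step; that choice, the helper names, and the 0.9 cut-off are my assumptions, and images are again plain grayscale grids:

```python
import math

def crop(gray, top, left, height, width):
    """Copy a height x width patch out of a grayscale grid."""
    return [row[left:left + width] for row in gray[top:top + height]]

def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two same-size grids, in [-1, 1]."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0   # flat patches carry no information

def best_candidate(screen, template, candidates, min_score=0.9):
    """Score each (top, left) candidate; return (score, top, left) or None."""
    th, tw = len(template), len(template[0])
    best = None
    for top, left in candidates:
        if top < 0 or left < 0 or top + th > len(screen) or left + tw > len(screen[0]):
            continue                   # patch would fall outside the screenshot
        score = ncc(crop(screen, top, left, th, tw), template)
        if score >= min_score and (best is None or score > best[0]):
            best = (score, top, left)
    return best

# Synthetic data: a 4x4 pattern pasted into a flat 10x10 background.
template = [[0, 50, 100, 150], [150, 100, 50, 0], [0, 255, 0, 255], [30, 60, 90, 120]]
screen = [[200] * 10 for _ in range(10)]
for y, row in enumerate(template):
    screen[3 + y][5:9] = row

best = best_candidate(screen, template, [(0, 0), (2, 4), (3, 5), (6, 1)])
print(best)
```

Because only a handful of candidate positions are scored, the cost of the stronger comparison stays small compared with sliding the template over the whole screenshot.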
The following tools can help you with that:
1. Find a distinct feature in the button image.
   For example, you can use the edge color neighboring the button face color, or a derivative, the shape, or the average color of a square sub-image (8x8 pixels ...).
2. Search the snapshot for this feature.
   I would use average color to start, so divide the image into N x N pixel areas and compute their average color. If you find a square with an average color similar to your button's average color, then you have a probable location.
3. After this, you can brute-force the nearby area to check whether it contains your button.
   At this stage, do not compare colors directly (they can be distorted by anti-aliasing and filters ...). A better way is to compare derivatives +/- some accuracy. You can compute a coefficient of probable button presence, and if it is close enough to 1.0 then you have found your button.

PS. In stage 3 you can use grayscale images to simplify things.
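Here is a minimal sketch of the three stages on grayscale grids (lists of rows of 0-255 values). The presence coefficient here is my own assumption: the fraction of horizontal derivative samples that agree within +/- `accuracy`. All function names, the default thresholds, and the synthetic images are hypothetical as well:

```python
def region_mean(gray, top, left, height, width):
    """Average grey level of a rectangular region."""
    total = sum(sum(row[left:left + width]) for row in gray[top:top + height])
    return total / (height * width)

def x_derivative(gray):
    """Horizontal differences between neighbouring pixels, row by row."""
    return [[row[x + 1] - row[x] for x in range(len(row) - 1)] for row in gray]

def presence(screen, button_dx, top, left, accuracy):
    """Fraction of derivative samples at (top, left) within +/- accuracy."""
    matches = total = 0
    for y, dx_row in enumerate(button_dx):
        row = screen[top + y]
        for x, expected in enumerate(dx_row):
            actual = row[left + x + 1] - row[left + x]
            matches += abs(actual - expected) <= accuracy
            total += 1
    return matches / total

def find_button(screen, button, size=8, tolerance=16, accuracy=8, min_presence=0.95):
    """Return (presence, top, left) of the best match, or None."""
    bh, bw = len(button), len(button[0])
    sh, sw = len(screen), len(screen[0])
    target = region_mean(button, 0, 0, bh, bw)     # stage 1: the feature
    button_dx = x_derivative(button)
    best, seen = None, set()
    for block_top in range(0, sh - size + 1, size):  # stage 2: coarse grid search
        for block_left in range(0, sw - size + 1, size):
            mean = region_mean(screen, block_top, block_left, size, size)
            if abs(mean - target) > tolerance:
                continue
            # Stage 3: brute-force every placement that would cover this block.
            for top in range(max(0, block_top + size - bh), min(sh - bh, block_top) + 1):
                for left in range(max(0, block_left + size - bw), min(sw - bw, block_left) + 1):
                    if (top, left) in seen:
                        continue
                    seen.add((top, left))
                    score = presence(screen, button_dx, top, left, accuracy)
                    if score >= min_presence and (best is None or score > best[0]):
                        best = (score, top, left)
    return best

# Synthetic 20x32 button: grey face, darker border, a few dark "text" pixels.
button = [[120] * 32 for _ in range(20)]
for y in range(20):
    for x in range(32):
        if y in (0, 19) or x in (0, 31):
            button[y][x] = 60
for x in range(8, 24, 2):
    button[10][x] = 20

# Paste it into a flat 48x80 background at row 13, column 21.
screen = [[200] * 80 for _ in range(48)]
for y, row in enumerate(button):
    screen[13 + y][21:21 + 32] = row

result = find_button(screen, button)
print(result)
```

The coarse stage only works if at least one grid block falls entirely inside the button, so `size` should be at most about half the button's shorter side. Flat regions agree trivially on a zero derivative, which is why `min_presence` needs to be close to 1.0.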