I have seen apps that do this, and I wonder how I can do it programmatically: take a picture of a document, then work out how the image needs to be transformed so that it looks parallel to the camera instead of perspective-skewed.
Then combine multiple photos into a PDF file. For example, this app does it: https://play.google.com/store/apps/details?id=com.appxy.tinyscan&hl=en
I do not use books for such trivial things, so sorry, I cannot recommend any (especially in English). What you need to do is this:
- input image
- find main contours
  Ideally the whole grid, but even the outer contour will suffice (in case no grid is present). You need to divide the contour into horizontal (Red) and vertical (Green) curves (or sets of points).
- sample contour curves by 4 "equidistant" points
  As the image is distorted (not just rotated), we need to use at least bi-cubic interpolation. For that we need 16 points (Aqua) per patch.
- add mirror points to cover whole grid
  The image shows mirrored (Yellow) points only for the horizontal contours. You should do this for the vertical contours as well (they did not fit in the image and I did not want to enlarge the resolution just for that), and also for the corner points, so you get 6x6 control points. The mirroring can be done linearly (like I did).
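The sampling and mirroring steps above can be sketched roughly as follows. This is only an illustration under assumptions: each contour is taken to be a polyline of (x, y) points, "equidistant" is interpreted as equal arc length, and the helper names `sample_equidistant`, `mirror_row` and `mirror_grid` are made up for this example.

```python
import math

def sample_equidistant(polyline, n=4):
    """Return n points spaced evenly by arc length along a polyline [(x, y), ...]."""
    # cumulative arc length at each vertex
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out = []
    seg = 0
    for k in range(n):
        target = total * k / (n - 1)
        # advance to the polyline segment that contains the target length
        while seg < len(polyline) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        f = 0.0 if span == 0.0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = polyline[seg], polyline[seg + 1]
        out.append((x0 + f * (x1 - x0), y0 + f * (y1 - y0)))
    return out

def mirror_row(points):
    """Extend a row of points by one linearly mirrored point at each end."""
    (ax, ay), (bx, by) = points[0], points[1]
    (cx, cy), (dx, dy) = points[-2], points[-1]
    first = (2 * ax - bx, 2 * ay - by)   # reflect 2nd point about the 1st
    last = (2 * dx - cx, 2 * dy - cy)    # reflect 2nd-to-last about the last
    return [first] + list(points) + [last]

def mirror_grid(grid):
    """Turn a 4x4 grid of (x, y) control points into 6x6 by linear mirroring."""
    rows = [mirror_row(r) for r in grid]               # 4 rows x 6 columns
    cols = [mirror_row(list(c)) for c in zip(*rows)]   # 6 columns x 6 rows
    return [list(r) for r in zip(*cols)]               # back to row-major 6x6
```

Mirroring rows first and columns second also produces the four corner points, because the column pass runs over the already extended rows.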
Now the transformation is done like this:
- Process all pixels dst(x,y) of the target image
  Handle x,y as parameters for cubic interpolation. If xs,ys is the target image resolution then:
  u=(3.0*x)/xs
  v=(3.0*y)/ys
  Now cubic interpolation is usually done on parameter t=<0.0,1.0), so:
  - if u=<0.0,1.0) use t=u and control points 0,1,2,3
  - if u=<1.0,2.0) use t=u-1.0 and control points 1,2,3,4
  - if u=<2.0,3.0> use t=u-2.0 and control points 2,3,4,5
  The same goes for vertical contours and v. Compute xi,yi as the bi-cubic interpolation of (u,v) and copy the pixel:
  dst(x,y)=src(xi,yi);
This is just nearest neighbor sampling, but you can also use bilinear filtering for this ... As the cubic curve I would use this polynomial.
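The parameter mapping and the choice of control points can be written down as a small sketch. The function names `pixel_to_params` and `segment` are hypothetical, and clamping out-of-range values into the first or last segment is an assumption of this example, not something stated above.

```python
def pixel_to_params(x, y, xs, ys):
    """Map a target pixel (x, y) to curve parameters (u, v), each in <0, 3)."""
    u = (3.0 * x) / xs
    v = (3.0 * y) / ys
    return u, v

def segment(p):
    """Split p in <0, 3> into (index of first control point, local t).

    <0,1) -> points 0,1,2,3   <1,2) -> points 1,2,3,4   <2,3> -> points 2,3,4,5
    """
    i = max(0, min(int(p), 2))   # clamp so that p == 3.0 stays in the last segment
    return i, p - i
```

For example, `segment(1.25)` selects control points 1,2,3,4 with t=0.25.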
The idea behind bi-cubic interpolation is easy: compute the point corresponding to parameter u on each of the 4 horizontal contours. That will give you 4 control points for the final cubic interpolation in the vertical direction, with v as the parameter. The resulting coordinate is your source pixel position.
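A minimal sketch of the whole remapping, under a few assumptions: the cubic used here is a Catmull-Rom curve (the polynomial the answer links to may differ), `ctrl[row][col]` is the 6x6 grid of (x, y) control points in source image coordinates, images are plain lists of rows, and all function names are invented for the example.

```python
def cubic(p0, p1, p2, p3, t):
    """Catmull-Rom cubic (assumed): passes through p1 at t=0 and p2 at t=1."""
    return 0.5 * (2.0 * p1 + (p2 - p0) * t
                  + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t * t
                  + (3.0 * p1 - p0 - 3.0 * p2 + p3) * t * t * t)

def segment(p):
    """Split p in <0, 3> into (index of first control point, local t)."""
    i = max(0, min(int(p), 2))
    return i, p - i

def bicubic_point(ctrl, u, v):
    """Evaluate the 6x6 control grid at (u, v), both in <0, 3>."""
    i, tu = segment(u)
    j, tv = segment(v)
    result = []
    for axis in (0, 1):  # 0 = x coordinate, 1 = y coordinate
        # one point on each of the 4 horizontal contours, at parameter u
        column = [cubic(*(ctrl[j + r][i + c][axis] for c in range(4)), tu)
                  for r in range(4)]
        # final cubic in the vertical direction, at parameter v
        result.append(cubic(*column, tv))
    return tuple(result)

def remap(src, ctrl, xs, ys):
    """Build an xs-by-ys target image from src using nearest neighbor sampling."""
    h, w = len(src), len(src[0])
    dst = [[0] * xs for _ in range(ys)]
    for y in range(ys):
        for x in range(xs):
            xi, yi = bicubic_point(ctrl, (3.0 * x) / xs, (3.0 * y) / ys)
            # round to the nearest source pixel and clamp to the image bounds
            xi = min(max(int(round(xi)), 0), w - 1)
            yi = min(max(int(round(yi)), 0), h - 1)
            dst[y][x] = src[yi][xi]
    return dst
```

With this curve, each segment runs between its two middle control points, so u=0 lands on the first real contour point and u=3 on the last one, and the mirrored points only shape the curve at the borders.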
For more info see:
- How can i produce multi point linear interpolation?
- Bicubic interpolation
- OpenCV Birdseye view without loss of data
In case you do not have a grid, use any information that can serve as one. For example, lines of text can be considered contours for this ...