OpenCV's remap() uses a real-valued index grid to sample a grid of values from an image using bilinear interpolation, and returns the grid of samples as a new image.
To be precise, let:
A = an image
X = a grid of real-valued X coords into the image.
Y = a grid of real-valued Y coords into the image.
B = remap(A, X, Y)
Then for all pixel coordinates i, j,
B[i, j] = A(X[i, j], Y[i, j])
where the round-bracket notation A(x, y) denotes using bilinear interpolation to solve for the pixel value of image A at float-valued coords x and y.
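To pin the notation down, here is a minimal pure-Python sketch of A(x, y) and of remap() itself. It assumes images and maps are lists of rows, that x is the column and y the row (as in OpenCV), that the image is at least 2 x 2, and that out-of-range coordinates are clamped to the border. The names bilinear and remap are just for this sketch; a real library will be faster and will handle borders differently.

```python
def bilinear(img, x, y):
    """Sample img (a list of rows) at real-valued column x and row y."""
    h, w = len(img), len(img[0])
    # Clamp to the image so the 2x2 neighbourhood always exists.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0 = min(int(x), w - 2)
    y0 = min(int(y), h - 2)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x0 + 1]
    bottom = (1 - fx) * img[y0 + 1][x0] + fx * img[y0 + 1][x0 + 1]
    return (1 - fy) * top + fy * bottom


def remap(img, map_x, map_y):
    """B[row][col] = A(map_x[row][col], map_y[row][col])."""
    return [
        [bilinear(img, map_x[r][c], map_y[r][c]) for c in range(len(map_x[0]))]
        for r in range(len(map_x))
    ]
```

With an identity grid (map_x[r][c] = c, map_y[r][c] = r), remap() returns the image unchanged, which is a handy sanity check for any candidate inverse.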
My question is: given an index grid X, Y, how can I generate an "inverse grid" X^-1, Y^-1 such that:
X(X^-1[i, j], Y^-1[i, j]) = i
Y(X^-1[i, j], Y^-1[i, j]) = j
and
X^-1(X[i, j], Y[i, j]) = i
Y^-1(X[i, j], Y[i, j]) = j
for all integer pixel coordinates i, j?
FWIW, the image and index maps X and Y are the same shape. However, there is no a priori structure to the index maps X and Y. For example, they're not necessarily affine or rigid transforms. They may even be non-invertible, e.g. if X, Y maps multiple pixels in A to the same exact pixel coordinate in B. I'm looking for ideas for a method that will find a reasonable inverse map if one exists.
The solution need not be OpenCV-based, as I'm not using OpenCV, but another library that has a remap() implementation. While any suggestions are welcome, I'm particularly keen on something that's "mathematically correct", i.e. if my map M is perfectly invertible, the method should find the perfect inverse, within some small margin of machine precision.
If your map is derived from a homography H, you could invert H and directly create the inverse maps with cv::initUndistortRectifyMap(), e.g. in Python:
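A library-free sketch of the same idea, for readers without OpenCV. It does not call initUndistortRectifyMap(); it just evaluates the homography on the pixel grid. The assumption here is that the forward map sends each destination pixel (col, row) to the projective point H * (col, row, 1), so the inverse map comes from applying the same routine to the inverted matrix. The helper names invert_3x3 and maps_from_homography are made up for this example.

```python
def invert_3x3(m):
    """Invert a 3x3 matrix (list of rows) via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("homography is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def maps_from_homography(hom, width, height):
    """Return (map_x, map_y); entry [row][col] is hom applied to (col, row)."""
    map_x, map_y = [], []
    for row in range(height):
        xs, ys = [], []
        for col in range(width):
            u = hom[0][0] * col + hom[0][1] * row + hom[0][2]
            v = hom[1][0] * col + hom[1][1] * row + hom[1][2]
            w = hom[2][0] * col + hom[2][1] * row + hom[2][2]
            xs.append(u / w)  # perspective divide
            ys.append(v / w)
        map_x.append(xs)
        map_y.append(ys)
    return map_x, map_y
```

Usage would be along the lines of inv_x, inv_y = maps_from_homography(invert_3x3(H), width, height), where width and height are the size of the image you want to map back into.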
The OpenCV documentation states about initUndistortRectifyMap():
If you are only given the maps, you have to do it yourself. However, interpolating the new maps' coordinates is not trivial, because the support region for one pixel could be very large.
Here is a simple Python solution which inverts the maps by doing point-to-point mapping. This will probably leave some coordinates unassigned, while others will be updated several times. So there may be holes in the map.
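A minimal sketch of what such a point-to-point inversion can look like, in plain Python rather than NumPy. Assumptions for this sketch: maps are lists of rows, map_x holds columns and map_y holds rows, each forward-map entry is rounded to the nearest pixel, and unassigned entries are left as None so the holes are easy to spot. The function name is hypothetical.

```python
def invert_maps_point_to_point(map_x, map_y, out_width, out_height):
    """Scatter each (col, row) of the forward map to the pixel it points at."""
    inv_x = [[None] * out_width for _ in range(out_height)]
    inv_y = [[None] * out_width for _ in range(out_height)]
    for row in range(len(map_x)):
        for col in range(len(map_x[0])):
            # Nearest integer pixel that this map entry points at.
            sx = int(round(map_x[row][col]))
            sy = int(round(map_y[row][col]))
            if 0 <= sx < out_width and 0 <= sy < out_height:
                # Later writes overwrite earlier ones when targets collide.
                inv_x[sy][sx] = float(col)
                inv_y[sy][sx] = float(row)
    return inv_x, inv_y
```

For a map that shifts everything one pixel to the right, the first column of the inverse stays None: nothing in the forward map points there, which is exactly the kind of hole mentioned above.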
Here is a small Python program demonstrating both approaches:
Well, I just had to solve this remap inversion problem myself, so I'll outline my solution.
Given X, Y for the remap() function that does the following:
B[x, y] = A(X[x, y], Y[x, y])
I computed Xinv, Yinv that can be used by the remap() function to invert the process:
A[x, y] = B(Xinv[x, y], Yinv[x, y])
First I build a KD-Tree for the 2D point set {(X[i,j], Y[i,j])} so I can efficiently find the N nearest neighbors to a given point (x, y). I use Euclidean distance for my distance metric. I found a great C++ header lib for KD-Trees on GitHub.
Then I loop through all the (x, y) values in A's grid and find the N = 5 nearest neighbors {(X[i_k,j_k], Y[i_k,j_k]) | k = 0 .. N-1} in my point set.
If distance d_k == 0 for some k, then Xinv[x,y] = i_k and Yinv[x,y] = j_k; otherwise...
Use Inverse Distance Weighting (IDW) to compute an interpolated value:
w_k = 1 / pow(d_k, p) (I use p = 2)
Xinv[x,y] = (sum_k w_k * i_k)/(sum_k w_k)
Yinv[x,y] = (sum_k w_k * j_k)/(sum_k w_k)
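The steps above can be sketched in a few lines of Python. This is not the C++ implementation described here: to stay dependency-free it replaces the KD-Tree with a brute-force nearest-neighbour search (heapq.nsmallest), so it is only practical for small maps. It also assumes maps are lists of rows with map_x holding columns and map_y holding rows; the function and parameter names are invented for this sketch.

```python
import heapq
import math


def invert_maps_idw(map_x, map_y, out_width, out_height, n_neighbors=5, power=2):
    """Invert (map_x, map_y) by inverse distance weighting over nearest samples."""
    # Every forward-map entry becomes a sample: where it points, and its index.
    samples = [
        (map_x[r][c], map_y[r][c], c, r)
        for r in range(len(map_x))
        for c in range(len(map_x[0]))
    ]
    inv_x = [[0.0] * out_width for _ in range(out_height)]
    inv_y = [[0.0] * out_width for _ in range(out_height)]
    for y in range(out_height):
        for x in range(out_width):
            # Brute-force stand-in for the KD-Tree query: N closest samples.
            nearest = heapq.nsmallest(
                n_neighbors,
                ((math.hypot(sx - x, sy - y), c, r) for sx, sy, c, r in samples),
            )
            exact = [(c, r) for d, c, r in nearest if d == 0.0]
            if exact:
                # A sample lands exactly on this pixel: use its index directly.
                inv_x[y][x], inv_y[y][x] = float(exact[0][0]), float(exact[0][1])
                continue
            weights = [1.0 / d ** power for d, _, _ in nearest]
            total = sum(weights)
            inv_x[y][x] = sum(w * c for w, (_, c, _) in zip(weights, nearest)) / total
            inv_y[y][x] = sum(w * r for w, (_, _, r) in zip(weights, nearest)) / total
    return inv_x, inv_y
```

For real image sizes you would swap the nsmallest call for a KD-Tree query, which is where the run time goes from hopeless to reasonable.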
Note that if B is a W x H image then X and Y are W x H arrays of floats. If A is a w x h image then Xinv and Yinv are w x h arrays of floats. It is important that you are consistent with image and map sizing.
Works like a charm! In my first version I tried brute-forcing the search and I never even waited for it to finish. Once I switched to a KD-Tree, I started to get reasonable run times. If I ever get time I would like to add this to OpenCV.
The second image below uses remap() to remove the lens distortion from the first image. The third image is the result of inverting the process.
There is no standard way to do it with OpenCV.
If you are looking for a complete ready-to-use solution, I am not sure that I can help, but I can at least describe a method that I used some years ago to do this task.
First of all, you should create remapping maps with the same dimensions as your source image. I created maps with larger dimensions for simpler interpolation, and cropped them to the proper size in the final step. Then fill them with values from the existing remapping maps. This is not difficult: iterate over the old maps, and whenever the map coordinates x and y lie within the limits of your image, take their row and column as the new y and x, and store them at column x and row y of the new map. It is a rather simple solution, but it gives a fairly good result. For a perfect one, you should interpolate the old x and y to integer values using your interpolation method and the neighbouring pixels.
After this you should either remap the pixel colors manually, or completely fill your remapping map with pixel coordinates and use the OpenCV version.
You will then face a rather challenging task: interpolating pixels in the empty areas. In other words, you take the distances to the closest non-zero pixel coordinates and mix the color (if you remap colors) or coordinate (if you proceed with the full map computation) fractions according to these distances. This is actually not so difficult for linear interpolation, and you can even look at the remap() implementation on OpenCV's GitHub page. For NN interpolation it will be much simpler: just take the color/coordinate of the nearest neighbour.
The final task is extrapolating the areas outside the borders of the remapped pixel area. The algorithm from OpenCV can also be used as a reference here.
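As a rough illustration of the nearest-neighbour variant of the hole filling, here is a small Python sketch. It assumes the partially filled inverse map is a list of rows in which unassigned entries are marked None (a convention chosen for this example, not something OpenCV does), and it searches by brute force, so it is only meant for small maps. The function name is hypothetical.

```python
def fill_holes_nearest(inv_map):
    """Replace None entries with the value of the nearest assigned entry."""
    known = [
        (r, c, v)
        for r, row in enumerate(inv_map)
        for c, v in enumerate(row)
        if v is not None
    ]
    if not known:
        raise ValueError("map has no assigned entries to copy from")
    filled = []
    for r, row in enumerate(inv_map):
        out = []
        for c, v in enumerate(row):
            if v is None:
                # Squared distance is enough for picking the closest entry.
                _, v = min(((kr - r) ** 2 + (kc - c) ** 2, kv) for kr, kc, kv in known)
            out.append(v)
        filled.append(out)
    return filled
```

Because it copies from the nearest assigned entry wherever that is, the same routine also covers the border extrapolation in a crude way; the linear version would blend several neighbours by distance instead of copying one.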
OP here. I think I've found an answer. I haven't implemented it yet, and if someone comes up with a less fiddly solution (or finds something wrong with this one), I'll choose their answer instead.
Problem statement
Let A be the source image, B be the destination image, and M be the mapping from A's coords to B's coords, i.e.:
...where square braces indicate array lookup with integer indices, and circular braces indicate bilinear interpolation lookup with floating-point indices. We restate the above using the more economical notation:
We wish to find an inverse mapping N that maps B back to A as best as is possible:
The problem can be stated without reference to A or B:
...where ||*|| indicates the Frobenius norm, and I_n is the identity map with the same dimensions as N, i.e. a map where:
Naive solution
If M's values are all integers, and M is an isomorphism, then you can construct N directly as:
Or in our simplified notation:
...where I_m is the identity map with the same dimensions as M.
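One way to read the naive construction, as a Python sketch: scatter the identity map through M. The storage layout is an assumption made for this example (M as a list of rows, each entry an integer (i, j) pair of A-coordinates, with N the same shape), and the function name is hypothetical. It raises as soon as M turns out not to be a bijection.

```python
def invert_integer_map(m):
    """Given m[k][l] = (i, j) with integer values, return n with n[i][j] = (k, l)."""
    rows, cols = len(m), len(m[0])
    n = [[None] * cols for _ in range(rows)]
    for k in range(rows):
        for l in range(cols):
            i, j = m[k][l]
            if not (0 <= i < rows and 0 <= j < cols) or n[i][j] is not None:
                raise ValueError("map is not a bijection on the pixel grid")
            # Scatter the identity map: the pixel that M sends to (i, j) is (k, l).
            n[i][j] = (k, l)
    return n
```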
There are two problems:
Solution
Construct empty N as a 3D tensor of floats:
For each coordinate [i, j] in A's coordinate space, do:
The potentially expensive step here would be the search in step 1 for the 2x2 grid of A-coordinates in M that encircles [i, j]. A brute-force search would make this whole algorithm O(n*m) where n is the number of pixels in A, and m the number of pixels in B.
To reduce this to O(n), one could instead run a scanline algorithm within each A-coordinate quadrilateral to identify all the integer-valued coordinates [i, j] it contains. This could be precomputed as a hashmap that maps integer-valued A coords [i, j] to the upper-left corner of its encircling quadrilateral's B coords [k, l].
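Here is a rough Python sketch of the per-quadrilateral idea, under several assumptions of mine. M is stored as two lists of rows (map_x, map_y), with x as the column. Each 2x2 cell of M is treated as a bilinear patch. Instead of a true scanline fill, the integer points in each patch's bounding box are tested one by one. Inside a patch, the bilinear blend is inverted with Newton's method, which is my own choice of solver. Pixels that no patch covers stay None, and all names are hypothetical.

```python
import math


def inverse_bilinear(p, q00, q10, q01, q11, max_iter=20, tol=1e-9):
    """Solve (1-s)(1-t)*q00 + s(1-t)*q10 + (1-s)t*q01 + s*t*q11 = p for (s, t).

    Newton's method from the cell centre; returns None if the cell is
    degenerate or the iteration does not converge.
    """
    s, t = 0.5, 0.5
    for _ in range(max_iter):
        # Residual of the bilinear blend at the current (s, t).
        fx, fy = (
            (1 - s) * (1 - t) * q00[a] + s * (1 - t) * q10[a]
            + (1 - s) * t * q01[a] + s * t * q11[a] - p[a]
            for a in (0, 1)
        )
        if abs(fx) < tol and abs(fy) < tol:
            return s, t
        # Partial derivatives of the blend with respect to s and t.
        dxs, dys = ((1 - t) * (q10[a] - q00[a]) + t * (q11[a] - q01[a]) for a in (0, 1))
        dxt, dyt = ((1 - s) * (q01[a] - q00[a]) + s * (q11[a] - q10[a]) for a in (0, 1))
        det = dxs * dyt - dxt * dys
        if abs(det) < 1e-15:
            return None
        s -= (fx * dyt - fy * dxt) / det
        t -= (fy * dxs - fx * dys) / det
    return None


def invert_map_by_quads(map_x, map_y, out_width, out_height, eps=1e-9):
    """For every integer point covered by a cell of the map, store the
    fractional (col, row) position in the map that bilinearly samples to it."""
    inv_x = [[None] * out_width for _ in range(out_height)]
    inv_y = [[None] * out_width for _ in range(out_height)]
    for k in range(len(map_x) - 1):
        for l in range(len(map_x[0]) - 1):
            q00 = (map_x[k][l], map_y[k][l])
            q10 = (map_x[k][l + 1], map_y[k][l + 1])
            q01 = (map_x[k + 1][l], map_y[k + 1][l])
            q11 = (map_x[k + 1][l + 1], map_y[k + 1][l + 1])
            xs = [q[0] for q in (q00, q10, q01, q11)]
            ys = [q[1] for q in (q00, q10, q01, q11)]
            # Integer points inside the cell's bounding box, clipped to the output.
            for y in range(max(0, math.ceil(min(ys))), min(out_height - 1, math.floor(max(ys))) + 1):
                for x in range(max(0, math.ceil(min(xs))), min(out_width - 1, math.floor(max(xs))) + 1):
                    st = inverse_bilinear((x, y), q00, q10, q01, q11)
                    if st is None:
                        continue
                    s, t = st
                    if -eps <= s <= 1 + eps and -eps <= t <= 1 + eps:
                        inv_x[y][x] = l + s
                        inv_y[y][x] = k + t
    return inv_x, inv_y
```

Since each cell is inverted as the same bilinear patch that a bilinear remap() interpolates, sampling M at the returned position should give back the integer coordinate up to the Newton tolerance, as long as the patch is not folded over. The total work is roughly proportional to the summed bounding-box areas, which is close to linear for well-behaved maps.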
From what I understand, you have an original image and a transformed image, and you wish to recover the nature of the transform that has been applied without knowing it, assuming it is something sensible, like a rotation or a fish-eye distortion.
What I would try is thresholding both the index image and the plain image to convert them to binary, then trying to identify objects. Most mappings will at least retain connectivity and Euler number, and in most cases the largest object in the index image will still be the largest object in the plain image.
Then take moments for your matched image / indexed pairs and see if you can remove translation, rotation and scaling. That gives you several reverse maps, which you can then try to stitch together. (Hard if the transform is not simple, but the general problem of reconstituting just any transformation cannot be solved).