I read some papers about LSH and I know that it is used for solving the approximate k-NN problem. We can divide the algorithm in two parts:

1. Given a vector in D dimensions (where D is big) of any value, translate it with a set of N (where N << D) hash functions to a binary vector in N dimensions.
2. Using Hamming distance, apply some search technique on the set of binary codes obtained from phase 1 to find the k-NN.

The key point is that computing the Hamming distance for vectors in N dimensions is fast using XOR.
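To make phase 1 concrete, here is a minimal sketch of one common LSH family for this setting, sign random projections (random hyperplanes). The values of D, N and all data below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 128, 16                      # N << D
planes = rng.normal(size=(N, D))    # one random hyperplane per hash function

def lsh_code(v):
    """Phase 1: map a D-dimensional vector to an N-bit binary code."""
    bits = planes @ v > 0           # which side of each hyperplane v falls on
    code = 0
    for b in bits:
        code = (code << 1) | int(b)
    return code

def hamming(a, b):
    """Phase 2 primitive: Hamming distance of two N-bit codes via XOR + popcount."""
    return bin(a ^ b).count("1")

x = rng.normal(size=D)
y = x + 0.01 * rng.normal(size=D)   # small perturbation of x
z = rng.normal(size=D)              # unrelated vector

# Close vectors fall on the same side of most hyperplanes, so
# hamming(lsh_code(x), lsh_code(y)) is typically much smaller than
# hamming(lsh_code(x), lsh_code(z)).
```

The XOR of two N-bit codes is a single machine-word operation, which is exactly what makes phase 2 fast.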
Anyway, I have two questions:
Is point 1 still necessary if we use a binary descriptor, like ORB? Since ORB's descriptors are already binary and we compare them with the Hamming distance, why should we perform the first step?
How does the conversion happen for images described by SIFT? Each SIFT descriptor is a 128-dimensional vector, and each image is described by a set of descriptors. So we have a desc x 128 matrix (where desc is the number of descriptors used), while LSH usually accepts a single vector as input.
1) You can bypass it, but then you will work in D dimensions, not N as you say, where N << D. That means that the algorithm has to adapt to D dimensions as well.

2) No.
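As a sketch of what "bypassing" means in practice: an ORB descriptor is already a 256-bit binary code (32 bytes in OpenCV's layout), so you skip phase 1 and compute Hamming distances directly in D = 256 dimensions. The descriptors below are synthetic random bytes standing in for real ORB output (which would come from OpenCV's ORB detector):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for two ORB descriptors: 32 bytes = 256 bits each.
a = rng.integers(0, 256, size=32, dtype=np.uint8)
b = rng.integers(0, 256, size=32, dtype=np.uint8)

def hamming_bytes(x, y):
    """Hamming distance between two binary descriptors via XOR + popcount."""
    return int(np.unpackbits(np.bitwise_xor(x, y)).sum())

# No hashing phase needed: the descriptor is already a binary code,
# just in D = 256 dimensions instead of a reduced N.
d = hamming_bytes(a, b)
```

The price of skipping phase 1 is that every comparison now touches all D bits instead of N, so the search structure must be built for D-dimensional codes.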
Read about SIFT in OpenCV:
Here is how I have it in my mind; hope that will be enough:

LSH takes as input a pointset of n points, where every point lies in d dimensions. So, a query is a point in d dimensions, and the goal is to find its NN*.

Now every point represents an image descriptor, so we have n images in our dataset. The query, which is also a point (i.e. a vector with d coordinates), represents another image descriptor. We are seeking to match (i.e. to find the Nearest Neighbor of) the query image descriptor with an image descriptor from our dataset.
So the conversion you are talking about is applied to a vector, not a matrix.
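A minimal sketch of that view: every row of every image's desc x 128 matrix becomes one point of the pointset, and a query descriptor is a single 128-dimensional vector. The data and the exact brute-force search below are illustrative stand-ins for what LSH would approximate:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical dataset: 3 images, each with its own desc x 128 matrix
# of SIFT-like descriptors (values are synthetic).
images = [rng.random(size=(int(rng.integers(50, 80)), 128)) for _ in range(3)]

# Flatten into the pointset LSH expects: each descriptor (row) is one
# point in d = 128 dimensions; remember which image it came from.
points = np.vstack(images)
owner = np.concatenate([np.full(len(m), i) for i, m in enumerate(images)])

def nearest(query):
    """Exact NN over the pointset (what LSH would answer approximately)."""
    dists = np.linalg.norm(points - query, axis=1)
    i = int(np.argmin(dists))
    return i, int(owner[i])

# A query descriptor is a single 128-dimensional vector, not a matrix:
q = images[1][0] + 0.001            # near-copy of a known descriptor
idx, img = nearest(q)
```

Matching whole images then becomes a second step on top: query each descriptor of the query image and aggregate which dataset image its nearest neighbors came from.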
Edit:
Moreover, from our paper High-dimensional approximate nearest neighbor: k-d Generalized Randomized Forests, see this in the Experiments section:
*or the Fixed-radius near neighbors problem