I am using the VLFeat implementation of SIFT to compute SIFT descriptors on two sets of images: queries and database images. Given a set of queries, I'd like to obtain the closest descriptors from a big database of descriptors, for which I use vl_ubcmatch.
Given the vl_ubcmatch syntax MATCHES = vl_ubcmatch(DESCR1, DESCR2),
I obtain different results if I input the query descriptors first and the database descriptors as a second parameter or the other way around.
Which is the correct syntax?
1) MATCHES = vl_ubcmatch(QUERY_DESCR,DATABASE_DESCR)
or
2) MATCHES = vl_ubcmatch(DATABASE_DESCR,QUERY_DESCR)
For each descriptor in DESCR1, MATCHES = vl_ubcmatch(DESCR1, DESCR2) searches for the closest descriptor in DESCR2 and adds the pair to the output if the match passes the test (for more details see deltheil's answer).
So I believe MATCHES = vl_ubcmatch(QUERY_DESCR,DATABASE_DESCR)
is the variant you want.
I obtain different results if I input the query descriptors first and the database descriptors as a second parameter or the other way around.
This is because the method uses the ratio test [1] behind the scenes, i.e., it compares the distance of the closest neighbor to that of the second-closest neighbor.
By default, the VLFeat implementation uses a threshold of 1.5, as follows:
if(thresh * (float) best < (float) second_best) {
/* accept the match */
}
This ratio test is not symmetric, which is why you can obtain different sets of matches when you swap the inputs.
If you are not comfortable with this, you can refer to Chapter 9 of Computer Vision Programming using the OpenCV Library, which suggests a pragmatic way to symmetrize the matching, as follows:
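To make the asymmetry concrete, here is a minimal pure-Python sketch of a ratio-test matcher. It is not VLFeat's code: the name ratio_match, the use of squared Euclidean distance, and the toy one-dimensional "descriptors" are assumptions made for illustration. Only the acceptance condition (thresh * best < second_best) is taken from the snippet above.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ratio_match(descr1, descr2, thresh=1.5):
    """For each descriptor in descr1, find its nearest neighbor in descr2 and
    keep the pair (i, j) only if thresh * best < second_best."""
    matches = []
    for i, d1 in enumerate(descr1):
        best = second_best = float("inf")
        best_j = -1
        for j, d2 in enumerate(descr2):
            dist = sq_dist(d1, d2)
            if dist < best:
                # New nearest neighbor; the old one becomes second best.
                second_best, best, best_j = best, dist, j
            elif dist < second_best:
                second_best = dist
        if best_j >= 0 and thresh * best < second_best:
            matches.append((i, best_j))
    return matches


# Hypothetical toy data, chosen only to expose the asymmetry.
queries = [[0.0], [10.0]]
database = [[1.0], [9.0], [9.5]]

forward = ratio_match(queries, database)   # pairs are (query, database)
backward = ratio_match(database, queries)  # pairs are (database, query)
print(forward)
print(backward)
```

In this toy case the outer loop runs over two descriptors in one direction and three in the other, so the two calls cannot even return the same number of candidate matches. Database descriptor 1 is matched to query 1 in the backward call, while query 1 prefers database descriptor 2 in the forward call.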
From these [matching] sets, we will now extract the matches that are in agreement
with both sets. This is the symmetrical matching scheme imposing that,
for a match pair to be accepted, both points must be the best matching
feature of the other.
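The symmetric scheme quoted above can be sketched as an intersection of the two match sets. This is only an illustration in pure Python: the function name symmetric_matches and the index pairs below are hypothetical, and the pairs are assumed to be 0-based (first index, second index) tuples, whereas vl_ubcmatch returns a 2 x N matrix of 1-based indices.

```python
def symmetric_matches(forward, backward):
    """Keep a forward pair (i, j) only if the backward set contains (j, i).

    forward:  pairs (query_index, database_index)
    backward: pairs (database_index, query_index)
    """
    # Flip the backward pairs so both sets use (query, database) order.
    backward_flipped = {(q, d) for d, q in backward}
    return [pair for pair in forward if pair in backward_flipped]


# Hypothetical match lists, e.g. obtained by calling the matcher twice
# with the arguments swapped.
forward = [(0, 0), (1, 2), (2, 3)]
backward = [(0, 0), (2, 1), (3, 4)]

mutual = symmetric_matches(forward, backward)
print(mutual)
```

Here the forward pair (2, 3) is dropped because database descriptor 3 prefers query 4 in the backward direction. The result no longer depends on the argument order, at the cost of running the matcher twice.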
[1] See Section 7.1, Keypoint Matching, in D. Lowe's paper.