The OpenCV reference manual (2.4.x) states that the constructor that initializes MSER requires the following parameters:
delta, min_area, max_area, max_variation, min_diversity, max_evolution, area_threshold, min_margin, edge_blur_size.
I am dealing with grayscale images. What is the use of the parameters "delta", "max_variation" and "min_diversity"? What property of an MSER do these parameters help control?
I have tried hard to find an exact answer to this, and I could only find a little information on the following pages (none of which was particularly useful in telling me what exactly these three parameters control):
1. OpenCV wiki
2. Wikipedia description of MSER
3. MSER questions on Stack Overflow
Please help!
I am going to presume that you know the basics of how MSER feature detection works (if not, see Wikipedia; a short recap follows).
You have two types of MSER regions, positive and negative.
The first type you get by thresholding with all intensities (for grayscale images, 0 to 255). E.g. for a threshold T = 100, all pixels with intensity < 100 are assigned black, or foreground, and all pixels with intensity >= 100 are white, or background.
Now, imagine you're observing a specific pixel p. At some threshold, let's call it T1, it will start belonging to the foreground and stay that way until T = 255. At T1, the pixel will belong to a component CC_T1(p). 5 gray levels later, it will belong to the component CC_{T1+5}(p).
All of these connected components, obtained for all the thresholds, are potential candidates for MSER. (The other type of components is obtained if you reverse my black/foreground and white/background assignments for thresholding.)
Parameters help decide which potential candidates are indeed maximally stable:
delta
For every region, the variation is measured:
V_T = (size(CC_T(p))-size(CC_{T-delta}(p)))/size(CC_{T-delta}(p))
for every possible threshold T. If this variation is a local minimum, that is, V_T < V_{T-1} and V_T < V_{T+1}, the region is maximally stable.
The parameter delta indicates how many gray levels a region needs to stay stable across in order to be considered maximally stable. For a larger delta, you will get fewer regions.
note: In the original paper introducing MSER regions, the actual formula is:
V_T = (size(CC_{T+delta}(p))-size(CC_{T-delta}(p)))/size(CC_T(p))
The OpenCV implementation uses a slightly different formula to speed up the feature extraction.
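To make the role of delta concrete, here is a minimal sketch of the variation formula as I described it for OpenCV, plus the local-minimum test. It assumes you already have the size of the component containing p at every threshold, stored in a list indexed by threshold. The function names and the sizes are made up for illustration; this is not the OpenCV code.

```python
def variation(sizes, t, delta):
    """Variation as described above: (size_T - size_{T-delta}) / size_{T-delta}.

    Returns None where it is undefined (no region yet at T - delta).
    """
    if t - delta < 0 or sizes[t - delta] == 0:
        return None
    return (sizes[t] - sizes[t - delta]) / sizes[t - delta]


def stable_thresholds(sizes, delta):
    """Thresholds at which the variation is a strict local minimum."""
    v = [variation(sizes, t, delta) for t in range(len(sizes))]
    result = []
    for t in range(1, len(sizes) - 1):
        if None in (v[t - 1], v[t], v[t + 1]):
            continue  # cannot compare against an undefined neighbour
        if v[t] < v[t - 1] and v[t] < v[t + 1]:
            result.append(t)
    return result


# Hypothetical sizes of CC_T(p) for thresholds T = 0..8: the region grows
# quickly, plateaus around T = 3..5, then merges with its surroundings.
sizes = [10, 20, 30, 32, 33, 34, 60, 120, 240]
stable = stable_thresholds(sizes, delta=2)
```

In this toy data the region barely changes size across the plateau, so the variation bottoms out there and only that threshold is reported.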
minArea, maxArea
If a region is maximally stable, it can still be rejected if it has fewer than minArea pixels or more than maxArea pixels.
maxVariation
Back to the variation from the delta section (the same function): if a region is maximally stable, it can still be rejected if the region's variation is bigger than maxVariation.
That is, even if the region is "relatively" stable (more stable than the neighbouring regions), it may not be "absolutely" stable enough. For a smaller maxVariation, you will get fewer regions.
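As a rough sketch, the area and maxVariation checks amount to a simple filter applied to regions that already passed the local-minimum test. The function name, the candidate list and the parameter values below are made up for illustration and are not taken from the OpenCV source.

```python
def keep_region(size, variation, min_area, max_area, max_variation):
    """Apply the minArea / maxArea / maxVariation rejection rules described above."""
    if size < min_area or size > max_area:
        return False  # too small or too large
    return variation <= max_variation  # must also be stable in absolute terms


# Hypothetical maximally stable candidates as (size in pixels, variation).
candidates = [(40, 0.05), (500, 0.10), (500, 0.60), (20000, 0.02)]

strict = [c for c in candidates if keep_region(*c, 60, 14400, 0.25)]
loose = [c for c in candidates if keep_region(*c, 60, 14400, 1.0)]
```

Raising max_variation from 0.25 to 1.0 lets the less stable 500-pixel region through, while the area limits still reject the smallest and largest candidates.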
minDiversity
This parameter exists to prune regions that are too similar (e.g. differ by only a few pixels).
For a region CC_T1(p) that is maximally stable, find a region CC_T2(p) which is the "parent maximally stable region". That means T2 > T1, CC_T2(p) is a maximally stable region, and there is no T2 > Tx > T1 such that CC_Tx(p) is maximally stable. Now, compare how much bigger the parent is:
diversity = (size(CC_T2(p)) - size(CC_T1(p))) / size(CC_T1(p))
If this diversity is smaller than minDiversity, remove the region CC_T1(p). For a larger minDiversity, you will get fewer regions.
(For the exact formula for this parameter I had to dig through the program code)
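The pruning rule can be sketched as follows. This follows my description above, not the OpenCV source line by line: it assumes the maximally stable regions containing p are given as a list of sizes ordered by increasing threshold, so each entry's parent is simply the next entry. The function name and the sizes are hypothetical.

```python
def prune_by_diversity(stable_sizes, min_diversity):
    """Drop stable regions that are too similar to their parent stable region.

    stable_sizes: sizes of nested maximally stable regions, ordered by
    increasing threshold (each region's parent is the next entry).
    """
    kept = []
    for i, size in enumerate(stable_sizes):
        if i + 1 < len(stable_sizes):
            parent = stable_sizes[i + 1]
            diversity = (parent - size) / size
            if diversity < min_diversity:
                continue  # parent is barely bigger: treat as a near-duplicate
        kept.append(size)  # the outermost region has no parent and is kept
    return kept


stable_sizes = [100, 105, 200, 210, 500]
kept_default = prune_by_diversity(stable_sizes, 0.2)
kept_strict = prune_by_diversity(stable_sizes, 1.0)
```

With min_diversity = 0.2, the regions of size 100 and 200 are dropped because their parents are only 5% bigger. Raising it to 1.0 also removes the region of size 105.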
I found the answer to my question thanks to this link.
MSERs are obtained by varying the intensity threshold T from 0 to 255. Delta sets the smallest step of that variation. So, size{i} means the size or area of the region with intensity value i in a grayscale image.
Will get back with an explanation of MaxVariation and MinDiversity soon.
Matlab has an almost identical MSER function. In my opinion, the MathWorks documentation explains very well what those parameters are for.
I will copy the definitions of the two inputs you ask about (in Matlab, there are no options for color images):
_delta. Step size between intensity threshold levels, specified as the comma-separated pair consisting of 'ThresholdDelta' and a numeric value in the range (0,100]. This value is expressed as a percentage of the input data type range used in selecting extremal regions while testing for their stability. Decrease this value to return more regions. Typical values range from 0.8 to 4.
_max_variation. Maximum area variation between extremal regions at varying intensity thresholds, specified as the comma-separated pair consisting of 'MaxAreaVariation' and a positive scalar value. Increasing this value returns a greater number of regions, but they may be less stable. Stable regions are very similar in size over varying intensity thresholds. Typical values range from 0.1 to 1.0.
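Since Matlab expresses ThresholdDelta as a percentage while OpenCV's delta counts gray levels, a rough conversion may help when comparing settings. The sketch below assumes an 8-bit image with a data type range of 255 and a plain linear mapping; the helper name is made up, and I have not checked how either library rounds the result.

```python
def threshold_delta_to_gray_levels(percent, data_range=255):
    """Convert a percentage of the data type range into a step in gray levels.

    Assumption: linear mapping over an 8-bit range (0..255).
    """
    return percent / 100.0 * data_range


typical_low = threshold_delta_to_gray_levels(0.8)
typical_high = threshold_delta_to_gray_levels(4)
```

Under that assumption, the typical range of 0.8 to 4 percent corresponds to roughly 2 to 10 gray levels.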
However, as penelope says, the original paper is very useful for a deeper understanding of the full process. I also recommend this reference, a very interesting comparison between well-known feature detectors.