I'm trying to minimize the following sample function:
F(x) = f[0]^2(x[0],...,x[n-1]) + ... + f[m-1]^2(x[0],...,x[n-1])
A normal way to minimize such a function is the Levenberg-Marquardt algorithm. I would like to perform this minimization in C++ and have done some initial tests with Eigen that resulted in the expected solution.
My question is the following:
I'm used to optimization in Python using, e.g., scipy.optimize.fmin_powell. Here the input function parameters are (func, x0, args=(), xtol=0.0001, ftol=0.0001, maxiter=None, maxfun=None, full_output=0, disp=1, retall=0, callback=None, direc=None). So I can define a func(x0), give the x0 vector, and start optimizing; if needed I can change the optimization parameters.
Now the Eigen Levenberg-Marquardt algorithm works in a different way: I need to define a function vector (why?), and furthermore I can't manage to set the optimization parameters.
According to
http://eigen.tuxfamily.org/dox/unsupported/classEigen_1_1LevenbergMarquardt.html
I should be able to use setEpsilon() and the other set functions.
But when I try the following code:
my_functor functor;
Eigen::NumericalDiff<my_functor> numDiff(functor);
Eigen::LevenbergMarquardt<Eigen::NumericalDiff<my_functor>,double> lm(numDiff);
lm.setEpsilon(); //doesn't exist!
So I have two questions:

1) Why is a function vector needed; why wouldn't a function scalar be enough?

2) How do I set the optimization parameters using the set functions?

References where I've searched for an answer:

http://www.ultimatepp.org/reference$Eigen_demo$en-us.html
http://www.alglib.net/optimization/levenbergmarquardt.php
As an alternative you may simply create a new functor like this:
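(A minimal sketch, assuming my_functor is written against the usual Functor<double> boilerplate for Eigen's unsupported NonLinearOptimization module; the name my_functor_w_df is illustrative.)

// Fold the numerical differentiation into the functor type itself,
// so LevenbergMarquardt needs only a single template argument.
struct my_functor_w_df : Eigen::NumericalDiff<my_functor>
{
};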
and then initialize the LevenbergMarquardt instance like this:
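(Continuing the sketch; the Scalar template parameter of the old LevenbergMarquardt class defaults to double, so it can be omitted.)

my_functor_w_df functor;
Eigen::LevenbergMarquardt<my_functor_w_df> lm(functor);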
Personally, I find this approach a bit cleaner.
It seems that the function is more general:
You will need to supply n error values in the fvec. Also, there is no need to square the f-values because it is implicitly assumed that the overall error function is made up of the sum of squares of the fvec components.
So, if you follow these guidelines and change the code to:
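(A sketch of that change, with the toy objective (10*(x0+3))^2 + (x1-5)^2 standing in for the original test function, and assuming the Functor<double> boilerplate shown in the full example further below. Each fvec entry is one unsquared residual.)

struct my_functor : Functor<double>
{
    my_functor() : Functor<double>(2, 2) {}

    // LM minimizes the sum of squares of these residuals.
    int operator()(const Eigen::VectorXd &x, Eigen::VectorXd &fvec) const
    {
        fvec(0) = 10.0 * (x(0) + 3.0);
        fvec(1) = x(1) - 5.0;
        return 0;
    }
};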
It will converge in a ridiculously small number of iterations, like less than 5. I also tried it on a more complex example (the Hahn1 benchmark at http://www.itl.nist.gov/div898/strd/nls/data/hahn1.shtml, with m=7 parameters and n=236 observations) and it converges to the known right solution in only 11 iterations with the numerically computed Jacobian.
So I believe I've found the answers.
1) The function is able to work as a function vector and as a function scalar.
If there are m solvable parameters, a Jacobian matrix of m x m needs to be created or numerically calculated. In order to do the matrix-vector multiplication J(x).transpose() * f(x), the function vector f(x) should have m items. These can be the m different functions, but we can also give f1 the complete function and make the other items 0.
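(For illustration, a sketch of that zero-padding variant, again with the hypothetical toy objective and the Functor<double> boilerplate from the full example below; std::pow needs <cmath>.)

struct my_scalar_functor : Functor<double>
{
    my_scalar_functor() : Functor<double>(2, 2) {}

    // Put the complete scalar function in the first entry, pad the rest with 0.
    int operator()(const Eigen::VectorXd &x, Eigen::VectorXd &fvec) const
    {
        fvec(0) = 10.0 * std::pow(x(0) + 3.0, 2) + std::pow(x(1) - 5.0, 2);
        fvec(1) = 0.0;
        return 0;
    }
};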
2) The parameters can be set and read using, for example:
lm.parameters.maxfev = 2000;
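Other fields of the same parameters struct can be set the same way (field names as I understand them from the old unsupported NonLinearOptimization API, mirroring MINPACK's lmdif):

lm.parameters.xtol = 1.0e-10;   // tolerance on the change in x
lm.parameters.ftol = 1.0e-10;   // tolerance on the reduction of the squared error
lm.parameters.factor = 100.0;   // bound on the initial step length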
Both answers have been tested in the following example code:
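A self-contained sketch along those lines, built on the old unsupported/NonLinearOptimization API (the toy objective, starting point, tolerance values, and printed members like lm.iter are illustrative assumptions, not the original test code):

#include <iostream>
#include <Eigen/Dense>
#include <unsupported/Eigen/NonLinearOptimization>
#include <unsupported/Eigen/NumericalDiff>

// Generic boilerplate expected by Eigen's unsupported NonLinearOptimization module.
template <typename _Scalar, int NX = Eigen::Dynamic, int NY = Eigen::Dynamic>
struct Functor
{
    typedef _Scalar Scalar;
    enum { InputsAtCompileTime = NX, ValuesAtCompileTime = NY };
    typedef Eigen::Matrix<Scalar, InputsAtCompileTime, 1> InputType;
    typedef Eigen::Matrix<Scalar, ValuesAtCompileTime, 1> ValueType;
    typedef Eigen::Matrix<Scalar, ValuesAtCompileTime, InputsAtCompileTime> JacobianType;

    int m_inputs, m_values;

    Functor() : m_inputs(InputsAtCompileTime), m_values(ValuesAtCompileTime) {}
    Functor(int inputs, int values) : m_inputs(inputs), m_values(values) {}

    int inputs() const { return m_inputs; }
    int values() const { return m_values; }
};

// Toy problem (illustrative): minimize (10*(x0+3))^2 + (x1-5)^2, optimum at (-3, 5).
struct my_functor : Functor<double>
{
    my_functor() : Functor<double>(2, 2) {}

    // Fill fvec with the unsquared residuals; LM minimizes their sum of squares.
    int operator()(const Eigen::VectorXd &x, Eigen::VectorXd &fvec) const
    {
        fvec(0) = 10.0 * (x(0) + 3.0);
        fvec(1) = x(1) - 5.0;
        return 0;
    }
};

int main()
{
    Eigen::VectorXd x(2);
    x << 0.0, 0.0;  // starting guess

    my_functor functor;
    Eigen::NumericalDiff<my_functor> numDiff(functor);
    Eigen::LevenbergMarquardt<Eigen::NumericalDiff<my_functor>, double> lm(numDiff);

    // Optimization parameters are set through the public 'parameters' struct.
    lm.parameters.maxfev = 2000;
    lm.parameters.xtol = 1.0e-10;

    int status = lm.minimize(x);  // returns a LevenbergMarquardtSpace::Status code

    std::cout << "status: " << status << std::endl;
    std::cout << "iterations: " << lm.iter << std::endl;
    std::cout << "x: " << x.transpose() << std::endl;
    return 0;
}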
This answer is an extension of two existing answers: 1) I adapted the source code provided by @Deepfreeze to include additional comments and two different test functions. 2) I use the suggestion from @user3361661 to rewrite the objective function in the correct form. As he suggested, it reduced the iteration count on my first test problem from 67 to 4.
The output at the command line from running this test script is: