I have 4 non-linear equations with three unknowns X
, Y
, and Z
that I want to solve for. The equations are of the form:
F(m) = X^2 + a(m)Y^2 + b(m)XYcosZ + c(m)XYsinZ
...where a
, b
and c
are constants which are dependent on each value of F
in the four equations.
What is the best way to go about solving this?
You probably want to be using scipy's nonlinear solvers, they're really easy: http://docs.scipy.org/doc/scipy/reference/optimize.nonlin.html
There are two ways to do this.
Setup
So, as I understand your question, you know F, a, b, and c at 4 different points, and you want to invert for the model parameters X, Y, and Z. We have 3 unknowns and 4 observed data points, so the problem is overdetermined. Therefore, we'll be solving in the least-squares sense.
It's more common to use the opposite terminology in this case, so let's flip your equation around. Instead of:
Let's write:
Where we know
F
,X
,Y
, andZ
at 4 different points (e.g.F_0, F_1, ... F_i
).We're just changing the names of the variables, not the equation itself. (This is more for my ease of thinking than anything else.)
Linear Solution
It's actually possible to linearize this equation. You can easily solve for
a^2
,b^2
,a b cos(c)
, anda b sin(c)
. To make this a bit easier, let's relabel things yet again:Now the equation is a lot simpler:
F_i = d + e X_i + f Y_i + g Z_i
. It's easy to do a least-squares linear inversion ford
,e
,f
, andg
. We can then geta
,b
, andc
from:Okay, let's write this up in matrix form. We're going to translate 4 observations of (the code we'll write will take any number of observations, but let's keep it concrete at the moment):
Into:
Or:
F = G * m
(I'm a geophysist, so we useG
for "Green's Functions" andm
for "Model Parameters". Usually we'd used
for "data" instead ofF
, as well.)In python, this would translate to:
Non-linear Solution
You could also solve this using
scipy.optimize
, as @Joe suggested. The most accessible function inscipy.optimize
isscipy.optimize.curve_fit
which uses a Levenberg-Marquardt method by default.Levenberg-Marquardt is a "hill climbing" algorithm (well, it goes downhill, in this case, but the term is used anyway). In a sense, you make an initial guess of the model parameters (all ones, by default in
scipy.optimize
) and follow the slope ofobserved - predicted
in your parameter space downhill to the bottom.Caveat: Picking the right non-linear inversion method, initial guess, and tuning the parameters of the method is very much a "dark art". You only learn it by doing it, and there are a lot of situations where things won't work properly. Levenberg-Marquardt is a good general method if your parameter space is fairly smooth (this one should be). There are a lot of others (including genetic algorithms, neural nets, etc in addition to more common methods like simulated annealing) that are better in other situations. I'm not going to delve into that part here.
There is one common gotcha that some optimization toolkits try to correct for that
scipy.optimize
doesn't try to handle. If your model parameters have different magnitudes (e.g.a=1, b=1000, c=1e-8
), you'll need to rescale things so that they're similar in magnitude. Otherwisescipy.optimize
's "hill climbing" algorithms (like LM) won't accurately calculate the estimate the local gradient, and will give wildly inaccurate results. For now, I'm assuming thata
,b
, andc
have relatively similar magnitudes. Also, be aware that essentially all non-linear methods require you to make an initial guess, and are sensitive to that guess. I'm leaving it out below (just pass it in as thep0
kwarg tocurve_fit
) because the defaulta, b, c = 1, 1, 1
is a fairly accurate guess fora, b, c = 3, 2, 1
.With the caveats out of the way,
curve_fit
expects to be passed a function, a set of points where the observations were made (as a singlendim x npoints
array), and the observed values.So, if we write the function like this:
We'll need to wrap it to accept slightly different arguments before passing it to
curve_fit
.In a nutshell:
Stand-alone Example of the two methods:
To give you a full implementation, here's an example that