I'm trying to fit a Gaussian for my data (which is already a rough gaussian). I've already taken the advice of those here and tried curve_fit
and leastsq
but I think that I'm missing something more fundamental (in that I have no idea how to use the command).
Here's a look at the script I have so far
import pylab as plb
import matplotlib.pyplot as plt
# Read in data -- first 2 rows are header in this example.
data = plb.loadtxt('part 2.csv', skiprows=2, delimiter=',')
x = data[:,2]
y = data[:,3]
mean = sum(x*y)
sigma = sum(y*(x - mean)**2)
def gauss_function(x, a, x0, sigma):
return a*np.exp(-(x-x0)**2/(2*sigma**2))
popt, pcov = curve_fit(gauss_function, x, y, p0 = [1, mean, sigma])
plt.plot(x, gauss_function(x, *popt), label='fit')
# plot data
plt.plot(x, y,'b')
# Add some axis labels
plt.legend()
plt.title('Fig. 3 - Fit for Time Constant')
plt.xlabel('Time (s)')
plt.ylabel('Voltage (V)')
plt.show()
What I get from this is a gaussian-ish shape which is my original data, and a straight horizontal line.
Also, I'd like to plot my graph using points, instead of having them connected. Any input is appreciated!
Here is corrected code:
result:
You get a horizontal straight line because it did not converge.
Better convergence is attained if the first parameter of the fitting (p0) is put as max(y), 5 in the example, instead of 1.
should be
There is another way of performing the fit, which is by using the 'lmfit' package. It basically uses the cuve_fit but is much better in fitting and offers complex fitting as well. Detailed step by step instructions are given in the below link. http://cars9.uchicago.edu/software/python/lmfit/model.html#model.best_fit
Explanation
You need good starting values such that the
curve_fit
function converges at "good" values. I can not really say why your fit did not converge (even though the definition of your mean is strange - check below) but I will give you a strategy that works for non-normalized Gaussian-functions like your one.Example
The estimated parameters should be close to the final values (use the weighted arithmetic mean - divide by the sum of all values):
I personally prefer using numpy.
Comment on the definition of the mean (including Developer's answer)
Since the reviewers did not like my edit on #Developer's code, I will explain for what case I would suggest an improved code. The mean of developer does not correspond to one of the normal definitions of the mean.
Your definition returns:
Developer's definition returns:
The weighted arithmetic mean:
Similarly you can compare the definitions of standard deviation (
sigma
). Compare with the figure of the resulting fit:Comment for Python 2.x users
In Python 2.x you should additionally use the new division to not run into weird results or convert the the numbers before the division explicitly:
or e.g.
After losing hours trying to find my error, the problem is your formula:
sigma = sum(y*(x-mean)**2)/n this is wrong, the correct formula is the squareroot of this!;
sqrt(sum(y*(x-mean)**2)/n)
Hope this helps