I have some time series data that looks like this:
x <- c(0.5833, 0.95041, 1.722, 3.1928, 3.941, 5.1202, 6.2125, 5.8828,
4.3406, 5.1353, 3.8468, 4.233, 5.8468, 6.1872, 6.1245, 7.6262,
8.6887, 7.7549, 6.9805, 4.3217, 3.0347, 2.4026, 1.9317, 1.7305,
1.665, 1.5655, 1.3758, 1.5472, 1.7839, 1.951, 1.864, 1.6638,
1.5624, 1.4922, 0.9406, 0.84512, 0.48423, 0.3919, 0.30773, 0.29264,
0.19015, 0.13312, 0.25226, 0.29403, 0.23901, 0.000213074755156413,
5.96565965097398e-05, 0.086874, 0.000926808687858284, 0.000904641782399267,
0.000513042259030044, 0.40736, 4.53928073402494e-05, 0.000765719624469057,
0.000717419263673946)
I would like to fit a curve to this data, using mixtures of one to five Gaussians. In Matlab, I could do the following:
fits{1} = fit(1:length(x),x,fittype('gauss1'));
fits{2} = fit(1:length(x),x,fittype('gauss2'));
fits{3} = fit(1:length(x),x,fittype('gauss3'));
... and so on.
In R, I am having difficulty identifying a similar method.
library(mclust)
dat <- data.frame(time = 1:length(x), x = x)
fits <- list()
fits[[1]] <- Mclust(dat, G = 1)
fits[[2]] <- Mclust(dat, G = 2)
fits[[3]] <- Mclust(dat, G = 3)
... but this does not really seem to be doing quite the same thing. For example, I am not sure how to calculate the R^2 between the fit curve and the original data using the Mclust solution.
Is there a simpler alternative in base R to fitting a curve using a mixture of Gaussians?
Function
With the code given below, and with a bit of luck in finding good initial parameters, you should be able to fit Gaussians to your data.
In the function fit_gauss, the aim is to fit y ~ fit_gauss(x), and the number of Gaussians to use is determined by the length of the initial values for the parameters a, b and d, all of which should be of equal length.
I have demonstrated curve-fitting of OP's data with up to three Gaussians.
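The body of fit_gauss is not reproduced above, so here is a minimal sketch of what such a function might look like, not necessarily the original implementation. It assumes a signature fit_gauss(y, x, a, b, d), with a as peak heights, b as peak centres and d as peak widths, and it builds one nls term per Gaussian using hypothetical parameter names a1, b1, d1, and so on. The demo uses synthetic data rather than the OP's series.

```r
# Hypothetical sketch of fit_gauss(); signature and parameter naming are assumptions.
# a = peak heights, b = peak centres, d = peak widths (all of the same length k).
fit_gauss <- function(y, x, a, b, d) {
  stopifnot(length(a) == length(b), length(b) == length(d))
  i <- seq_along(a)
  # Build "y ~ a1 * exp(...) + a2 * exp(...) + ..." with one term per Gaussian.
  terms <- sprintf("a%d * exp(-0.5 * ((x - b%d) / d%d)^2)", i, i, i)
  form  <- as.formula(paste("y ~", paste(terms, collapse = " + ")))
  # Named list of starting values: a1..ak, b1..bk, d1..dk.
  start <- setNames(as.list(c(a, b, d)),
                    c(paste0("a", i), paste0("b", i), paste0("d", i)))
  nls(form, data = data.frame(x = x, y = y), start = start)
}

# Demo on synthetic data (not the OP's series): one noisy Gaussian peak.
set.seed(1)
x_demo <- 1:50
y_demo <- 5 * exp(-0.5 * ((x_demo - 20) / 4)^2) + rnorm(50, sd = 0.1)
fit_demo <- fit_gauss(y_demo, x_demo, a = 4, b = 18, d = 5)
coef(fit_demo)
```

Passing vectors of length two or three for a, b and d would fit two or three Gaussians in the same way, provided the starting values are close enough for nls to converge.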
Specifying Initial Values
This is pretty much the most work I have done with nls (thanks to OP for that), so I am not sure what the best method for selecting the initial values is. Naturally, they depend on the heights of the peaks (a) and on the mean and standard deviation of x around them (b and d).
One option would be, for a given number of Gaussians, to try a number of starting values and keep the one that gives the best fit based on the residual standard error, fit$sigma.
I fiddled a bit to find initial parameters, but I dare say the parameters and the plot for the three-Gaussian model look solid.
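The multi-start idea could be sketched roughly as follows. This is an illustration under assumptions, not the code used for the fits here: try_fit and best_fit are hypothetical helper names, the candidate starting values are made up, and the data are synthetic. Starts for which nls fails to converge are skipped, and the converged fit with the smallest residual standard error is kept.

```r
# Fit a sum of k Gaussians with nls(); returns NULL if nls() raises an error.
try_fit <- function(y, x, a, b, d) {
  i <- seq_along(a)
  terms <- sprintf("a%d * exp(-0.5 * ((x - b%d) / d%d)^2)", i, i, i)
  form  <- as.formula(paste("y ~", paste(terms, collapse = " + ")))
  start <- setNames(as.list(c(a, b, d)),
                    c(paste0("a", i), paste0("b", i), paste0("d", i)))
  tryCatch(nls(form, data = data.frame(x = x, y = y), start = start),
           error = function(e) NULL)
}

# Try every candidate start (each a list with elements a, b, d) and keep the
# converged fit with the smallest residual standard error.
best_fit <- function(y, x, starts) {
  fits <- lapply(starts, function(s) try_fit(y, x, s$a, s$b, s$d))
  fits <- Filter(Negate(is.null), fits)
  if (length(fits) == 0) stop("no starting values led to a converged fit")
  sigmas <- vapply(fits, function(f) summary(f)$sigma, numeric(1))
  fits[[which.min(sigmas)]]
}

# Demo on synthetic data (not the OP's series): two noisy peaks.
set.seed(2)
x_demo <- 1:60
y_demo <- 6 * exp(-0.5 * ((x_demo - 15) / 3)^2) +
  3 * exp(-0.5 * ((x_demo - 40) / 5)^2) + rnorm(60, sd = 0.1)

# Made-up candidate starts, all for a two-Gaussian model.
starts <- list(
  list(a = c(5, 2), b = c(14, 42), d = c(4, 4)),
  list(a = c(4, 4), b = c(10, 45), d = c(6, 6)),
  list(a = c(1, 1), b = c(30, 31), d = c(1, 1))  # deliberately poor start
)
best <- best_fit(y_demo, x_demo, starts)
summary(best)$sigma
```

Comparing residual standard errors is only meaningful between fits with the same number of Gaussians, which is why every candidate here has two components.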
Fitting one, two and three Gaussians to example data
Single Gaussian
Two Gaussians
Three Gaussians
Summary of fit with three Gaussians
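Since the question asks about R^2 and summary() on an nls fit does not report it, here is a minimal sketch of how it might be computed from the residuals. The helper name r_squared is hypothetical and the demo data are synthetic. For a nonlinear model this value is best read as a descriptive measure only.

```r
# R^2 = 1 - RSS / TSS for any fitted model that has a residuals() method.
r_squared <- function(fit, y) {
  rss <- sum(residuals(fit)^2)   # residual sum of squares
  tss <- sum((y - mean(y))^2)    # total sum of squares around the mean
  1 - rss / tss
}

# Demo on synthetic data (not the OP's series): one noisy Gaussian peak.
set.seed(3)
x_demo <- 1:50
y_demo <- 5 * exp(-0.5 * ((x_demo - 20) / 4)^2) + rnorm(50, sd = 0.1)
fit_demo <- nls(y_demo ~ a * exp(-0.5 * ((x_demo - b) / d)^2),
                start = list(a = 4, b = 18, d = 5))
r2_demo <- r_squared(fit_demo, y_demo)
r2_demo
```

The same helper could be applied to a converged three-Gaussian fit by passing that fit together with the original response vector.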