Question:
I have a model of the form y = x + noise. I know the distributions of y and of the noise, and I would like to recover the distribution of x. So I tried to deconvolve the distributions in R. I found two packages (decon and deamer) and expected both methods to produce more or less the same result, but I don't understand why deconvolving with DeconPdf gives me something like a normal distribution while deconvolving with deamerKE gives me a uniform distribution. Here is an example:
library(fitdistrplus) # note: rweibull itself is in base R's stats package
library(decon)        # for DeconPdf
library(deamer)       # for deamerKE

set.seed(12345)
y <- rweibull(10000, shape=5.780094, scale=0.00204918)
noise <- rnorm(10000, mean=0.002385342, sd=0.0004784688)
sdnoise <- sd(noise)

est <- deamerKE(y, noise.type="Gaussian",
                mu=mean(noise), sigma=sdnoise)
plot(est)

estDecon <- DeconPdf(y, sdnoise, error="normal", fft=TRUE)
plot(estDecon)
Edit (in response to Julien Stirnemann):
I am not sure about re-parametrizing. My actual problem is:
I have reaction times (RT), which can theoretically be described as f(RT) = g(discrimination time) + h(selection time), where f, g and h can be transformations of those time values.
I have RT and discrimination-time values in my dataset, and I am interested in the selection time, or maybe h(selection time). With kernel density estimation I found that a Weibull distribution fits the 1/RT values best, while a normal distribution fits 1/(discrimination time) best.
That is why I can write my problem as 1/RT = 1/(discrimination time) + h(selection time), i.e. y = x + noise (where I consider the noise to be 1/(discrimination time)). Simulating those reaction times gave me the following distributions with the following parameters:
y <- rweibull(10000, shape=5.780094, scale=0.00204918)
noise <- rnorm(10000, mean=0.002385342, sd=0.0004784688)
What do you mean by re-parametrizing? Using different values, e.g. for the scale parameter?
Answer 1:
In answer to your last comment: the error shifts the observed values. The signal you wish to deconvolve is somewhere between 0 and ~0.3, I guess. Here is some code using deamer:
library(actuar) # for rinvweibull
library(deamer) # for deamerSE

set.seed(123)
RT <- rinvweibull(30000, shape=5.53861156, scale=488)/1000
RT <- RT[RT < 1.5]
noise <- (1/rnorm(30000, mean=0.0023853421, sd=0.0004784688))/1000
noise <- noise[noise < 1.5]

ST <- deamerSE(RT, errors=noise, from=0, to=0.3)
plot(ST)
This is what you will get using nonparametric deconvolution (regardless of implementation, package, etc.). Just for your information, your signal-to-noise ratio is extremely low: what you observe is actually almost only noise. This strongly impacts the estimation of the density you are interested in, especially with nonparametric methods; it is like trying to find a needle in a haystack. You should reconsider estimating the full density and rather try to obtain only a few quantities of interest.
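The signal-to-noise remark can be checked directly from the simulation parameters given in the question (a quick sketch; the variance decomposition assumes x and noise are independent):

```r
# Quick signal-to-noise check using the question's parameters
# (a sketch; assumes y = x + noise with x independent of noise)
set.seed(12345)
y     <- rweibull(10000, shape = 5.780094, scale = 0.00204918)
noise <- rnorm(10000, mean = 0.002385342, sd = 0.0004784688)
# Under independence, var(x) = var(y) - var(noise)
var_x <- var(y) - var(noise)
var_x / var(noise)  # the implied signal-to-noise ratio
```

With these parameters the difference may even come out negative in a finite sample, i.e. the observed variability is essentially all noise.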
Good luck,
Julien Stirnemann
Answer 2:
There are several problems in your post. First: in nonparametric deconvolution problems you usually do not 'know' the distribution of y. Rather, you have a sample of y that you assume is observed with additive noise; x is unobserved. No assumptions are made on y or x, only on the noise. Your presentation seems to imply that you are considering a parametric problem (for which neither deamer nor decon is of any help). Second: be careful, you are considering non-centered noise, which deamer can deal with but decon cannot.
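One possible workaround for decon's centered-noise assumption (my suggestion, not something from the decon documentation): if the noise mean mu is known, subtracting it from the observations re-centers the noise, since y - mu = x + (noise - mu):

```r
# Sketch: re-center the observations so decon's zero-mean
# error assumption holds (parameters taken from the question)
library(decon)
set.seed(12345)
mu <- 0.002385342; s <- 0.0004784688
x <- rweibull(10000, shape = 5.780094, scale = 0.00204918)
y <- x + rnorm(10000, mean = mu, sd = s)
# y - mu = x + (noise - mu), and noise - mu has mean zero
estDecon <- DeconPdf(y - mu, s, error = "normal", fft = TRUE)
plot(estDecon)
```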
Here is an example of code:
library(decon)  # for DeconPdf
library(deamer) # for deamerKE

set.seed(12345)
shape <- 5; scale <- 1; mu <- 0; sd <- 0.2
x <- rweibull(5000, shape=shape, scale=scale)
noise <- rnorm(5000, mean=mu, sd=sd)
y <- x + noise

curve(dweibull(x, shape, scale), lwd=2, from=0, to=2)
est <- deamerKE(y, noise.type="Gaussian", mu=mu, sigma=sd, from=0, to=2)
lines(est)
estDecon <- DeconPdf(y, sd, error="normal", fft=TRUE)
lines(estDecon, lty=2)
legend('topright', lty=c(1,1,2), lwd=c(2,1,1),
       legend=c("true", "deamerKE", "DeconPdf"))
As you see from the plot, even with centered noise (mu=0 in my example), the estimate is better with deamer; this is because of adaptive estimation. You could probably obtain similar results with decon, but you would have to tune the bandwidth parameter using the functions the package provides for that purpose.
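If I remember the decon API correctly, the package ships data-driven bandwidth selectors (e.g. the bootstrap selector bw.dboot1) whose result can be passed to DeconPdf via its bw argument; treat the selector name here as an assumption to check against ?DeconPdf:

```r
# Sketch of bandwidth tuning with decon (same simulation as above);
# bw.dboot1 is assumed to be decon's bootstrap bandwidth selector
library(decon)
set.seed(12345)
shape <- 5; scale <- 1; s <- 0.2
y <- rweibull(5000, shape = shape, scale = scale) +
  rnorm(5000, mean = 0, sd = s)
h <- bw.dboot1(y, sig = s)  # data-driven bandwidth
estDecon <- DeconPdf(y, s, error = "normal", bw = h, fft = TRUE)
plot(estDecon)
```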
Regarding the parameters you gave, the Fourier transforms are extremely "flat". This makes it very difficult for any general-purpose implementation to select an appropriate bandwidth (either adaptively, as in deamer, or via estimation, as in decon).
Playing around with the bandwidth parameter in DeconPdf doesn't help either, probably because of numerical limits. Your problem would require some fine-tuning of the deamer functions' code to allow exploration of larger collections of models, which would also dramatically increase estimation time. You should rather consider re-parametrizing your problem in some way.
Best,
Julien Stirnemann
Answer 3:
Following your second post: I'm not sure I fully understand your problem. However, from what I understand, there are two possibilities:
1) Without using any transformation functions, selection time = RT - discrimination time. If RT and discrimination time are both observed for each individual in your dataset, selection time is known deterministically, and this has nothing to do with deconvolution.
2) If RT is observed in one i.i.d. sample and discrimination time in another independent sample, then yes, the only way out is to consider deconvolution density estimation. However, although you have made some parametric assumptions using fitting methods, you truly do not know the densities of RT or DT. Considering DT as noise, your problem is:
RT = ST + noise, with an auxiliary sample of i.i.d. noise given by your sample of discrimination times. You wish to estimate the density of ST, which is unobserved. The only package that can perform deconvolution in this situation is deamer (as far as I know), with the deamerSE function. If I stated your problem correctly, you should look at the examples in the manual. I would also recommend using the raw data without transformations (at least in a first analysis).
An example:
deamerSE(RT, errors=DT)
Here again your error is not centered (it is positive), therefore you will have to adjust from and to in order to account for the shift the error has generated; this is also covered in the examples of the deamer manual.
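For example (a sketch reusing the simulated samples from answer 1 as stand-ins for the observed data; the bound arithmetic, shifting the grid back by the error's plausible range, is my own suggestion):

```r
library(actuar) # for rinvweibull
library(deamer) # for deamerSE
set.seed(123)
# simulated stand-ins for the observed samples (values from answer 1)
RT <- rinvweibull(30000, shape = 5.53861156, scale = 488) / 1000
RT <- RT[RT < 1.5]
DT <- (1 / rnorm(30000, mean = 0.0023853421, sd = 0.0004784688)) / 1000
DT <- DT[DT < 1.5]
# the positive error shifts RT to the right, so shift the grid back
est <- deamerSE(RT, errors = DT,
                from = max(0, min(RT) - max(DT)),
                to   = max(RT) - min(DT))
plot(est)
```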
Best,
Julien Stirnemann