Derivative of Kernel Density

2019-07-01 23:21发布

I am using density {stats} to construct a kernel "gaussian' density of a vector of variables. If I use the following example dataset:

    x <- rlogis(1475, location=0, scale=1)  # x is a vector of values - taken from a rlogis just for the purpose of explanation
    d<- density(x=x, kernel="gaussian")

Is there some way to get the first derivative of this density d at each of the n=1475 points

标签: r derivative
2条回答
你好瞎i
2楼-- · 2019-07-01 23:26

The curve of a density estimator is just the sum of all the kernels, in your case a gaussian (divided by the number of points). The derivative of a sum is the sum of the derivatives and the derivative of a constant times a function is that constant times the derivative. So the derivative of the density estimate at a given point will just be the average of the slopes for the 1475 different gaussian curves at that given point. Each gaussian curve will have a mean corresponding to each of the data points and a standard deviation based on the bandwidth. So if you can calculate the slope for a gaussian, then finding the slope for the density estimate is just a mean of the 1475 slopes.

查看更多
一纸荒年 Trace。
3楼-- · 2019-07-01 23:44

Edit #2:

Following up on Greg Snow's excellent suggestion to use the analytical expression for the derivative of a Gaussian, and our conversation following his post, this will get you the exact slope at each of those points:

s <- d$bw; 
slope2 <- sapply(x, function(X) {mean(dnorm(x - X, mean = 0, sd = s) * (x - X))})
## And then, to compare to the method below, plot the results against one another
plot(slope2 ~ slope)

Edit:

OK, I just reread your question, and see that you wanted slopes at each of the points in the input vector x. Here's one way you might approximate that:

slope <- (diff(d$y)/diff(d$x))[findInterval(x, d$x)]

A possible further refinement would be to find the location of the point within its interval, and then calculate its slope as the weighted average of the slope of the present interval and the interval to its right or left.


I'd approach this by averaging the slopes of the segments just to the right and left of each point. (A bit of special care needs to be taken for the first and last points, which have no segment to their left and right, respectively.)

dy <- diff(d$y)
dx <- diff(d$x)[1]  ## Works b/c density() returns points at equal x-intervals
((c(dy, tail(dy, 1)) + c(head(dy, 1), dy))/2)/dx
查看更多
登录 后发表回答