
Population Variance in R

Posted 2020-06-01 04:23

Question:

How can I calculate the population variance of my data using R?

I read that there is a package called popvar, but I have version 0.99.892 and I can't find the package.

Answer 1:

The var() function in base R calculates the sample variance, which divides the sum of squared deviations by n - 1. The population variance divides by n instead, so the two differ by a factor of (n - 1) / n. You can therefore calculate the population variance as var(myVector) * (n - 1) / n, where n is the length of the vector. Here is an example:

x <- 1:10
var(x) * 9 / 10
[1] 8.25

From the definition of population variance:

sum((x - mean(x))^2) / 10
[1] 8.25 
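
The scaling above hard-codes n = 10. A minimal sketch that takes n from the data instead might look like the following; pop_var_from_sample is a hypothetical helper name chosen for illustration, not a base R function.

```r
# Population variance obtained by rescaling the sample variance.
# pop_var_from_sample is a hypothetical name, not part of base R.
pop_var_from_sample <- function(x) {
  n <- length(x)          # number of observations
  var(x) * (n - 1) / n    # undo the n - 1 denominator used by var()
}

x <- 1:10
pop_var_from_sample(x)
## [1] 8.25
```

Because var() returns NA for a single value, this helper does too.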


Answer 2:

You already have a great answer, but I'd like to show that you can easily write your own convenience functions. It is surprising that base R has no population variance or standard deviation function, although Excel/Calc and other software provide one. Such a function would not be difficult to add: it could be named sdp or sd.p, or be invoked as sd(x, pop = TRUE).

Here is a basic version of population variance with no type-checking:

  x <- 1:10
  varp <- function(x) mean((x-mean(x))^2)
  varp(x)
  ## [1] 8.25
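
If some input checking is wanted, the basic version could be extended along these lines. This is only a sketch: varp_checked and sdp are hypothetical names, and the na.rm argument is an assumption modelled on how var() and sd() behave.

```r
# Population variance with basic input checks.
# varp_checked and sdp are hypothetical names chosen for this sketch.
varp_checked <- function(x, na.rm = FALSE) {
  stopifnot(is.numeric(x))                 # reject non-numeric input
  if (na.rm) x <- x[!is.na(x)]             # optionally drop missing values
  if (length(x) == 0) return(NA_real_)     # nothing to average
  mean((x - mean(x))^2)
}

# Population standard deviation built on top of it.
sdp <- function(x, na.rm = FALSE) sqrt(varp_checked(x, na.rm = na.rm))

varp_checked(1:10)
## [1] 8.25
sdp(c(1, 3, 5, 7, 14))
## [1] 4.472136
```

Returning NA for empty input is a design choice of this sketch; raising an error would be equally reasonable.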

To scale up to many columns at once, if speed is an issue, colSums and/or colMeans can be used (see: https://rdrr.io/r/base/colSums.html).
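
One way this could look for a numeric matrix is sketched below. col_varp is a hypothetical helper name, and the example matrix is made up for illustration.

```r
# Column-wise population variance of a numeric matrix.
# col_varp is a hypothetical name chosen for this sketch.
col_varp <- function(m) {
  centered <- sweep(m, 2, colMeans(m))  # subtract each column's mean
  colMeans(centered^2)                  # average squared deviation per column
}

m <- cbind(a = 1:10, b = c(1, 3, 5, 7, 14, 1, 3, 5, 7, 14))
col_varp(m)
# a: 8.25, b: 20
```

Centering before squaring is used here rather than colMeans(m^2) - colMeans(m)^2, since the latter can lose precision when the means are large relative to the spread.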



Answer 3:

You can find the details of the PopVar package here: https://cran.r-project.org/web/packages/PopVar/index.html. You can install it with the command install.packages("PopVar"). Note that the name is case sensitive (capital P, capital V).



Answer 4:

You can calculate the population variance with the following function:

pvar <- function(x) {
  sum((x - mean(x))^2) / length(x)
}

where x is a numeric vector that holds the data of your population. For example:

> x <- c(1, 3, 5, 7, 14)
> pvar(x)
[1] 20


Tags: r variance