Is there a way to calculate mean and standard deviation for a vector containing samples using Boost?
Or do I have to create an accumulator and feed the vector into it?
I don't know if Boost has more specific functions, but you can do it with the standard library.
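Given std::vector<double> v, the naive way is to compute the mean with std::accumulate and the standard deviation as the square root of the mean of the squares minus the squared mean. This is susceptible to overflow or underflow for huge or tiny values. A slightly better way to calculate the standard deviation is to subtract the mean from each element first and then take the square root of the mean of the squared differences.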
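A minimal sketch of both approaches, assuming a non-empty vector and the population form (dividing by N rather than N - 1). The function names stdev_naive and stdev_two_pass are illustrative, and the plain loop that fills diff is just one way to do the subtraction:

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Naive approach: sqrt(mean of squares - squared mean).
// Assumes v is non-empty; divides by N (population standard deviation).
double stdev_naive(const std::vector<double>& v) {
    const double sum = std::accumulate(v.begin(), v.end(), 0.0);
    const double mean = sum / v.size();
    // inner_product of v with itself is the sum of squares.
    const double sq_sum = std::inner_product(v.begin(), v.end(), v.begin(), 0.0);
    return std::sqrt(sq_sum / v.size() - mean * mean);
}

// Slightly better: subtract the mean first, then average the squared differences.
double stdev_two_pass(const std::vector<double>& v) {
    const double sum = std::accumulate(v.begin(), v.end(), 0.0);
    const double mean = sum / v.size();
    std::vector<double> diff(v.size());  // temporary vector of deviations
    for (std::size_t i = 0; i < v.size(); ++i) {
        diff[i] = v[i] - mean;
    }
    const double sq_sum = std::inner_product(diff.begin(), diff.end(), diff.begin(), 0.0);
    return std::sqrt(sq_sum / v.size());
}
```

For the samples {2, 4, 4, 4, 5, 5, 7, 9} both functions give 2. They can drift apart when the values are large relative to their spread, because the naive form subtracts two nearly equal numbers.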
UPDATE for C++11:
The call to std::transform can be written using a lambda function instead of std::minus and std::bind2nd (deprecated in C++11 and removed in C++17).
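As a sketch of what the lambda form might look like, assuming a non-empty vector and division by N (the function name stdev_transform_lambda is illustrative):

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

// Two-pass standard deviation where std::transform uses a lambda
// to subtract the mean from each element.
// Assumes v is non-empty; divides by N (population standard deviation).
double stdev_transform_lambda(const std::vector<double>& v) {
    const double mean = std::accumulate(v.begin(), v.end(), 0.0) / v.size();
    std::vector<double> diff(v.size());
    std::transform(v.begin(), v.end(), diff.begin(),
                   [mean](double x) { return x - mean; });
    const double sq_sum = std::inner_product(diff.begin(), diff.end(), diff.begin(), 0.0);
    return std::sqrt(sq_sum / v.size());
}
```

The lambda captures mean by value, so no binder or function object from <functional> is needed.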
A deviation that is the difference between an observed value and the true value of a quantity of interest (such as a population mean) is an error, and a deviation that is the difference between the observed value and an estimate of the true value (such as a sample mean) is a residual. These concepts are applicable for data at the interval and ratio levels of measurement.
My answer is similar to Josh Greifer's but generalised to sample covariance. Sample variance is just sample covariance with the two inputs identical. This includes Bessel's correction.
About 2x faster than the versions mentioned before, mostly because the transform() and inner_product() loops are joined. Sorry about my shortcuts, typedefs and macros: Flo = float, CR = const ref, VFlo = vector. Tested in VS2010.
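One plausible reading of "joined loops" is a single pass that accumulates the sum and the sum of squares together. The sketch below assumes that reading, uses plain types instead of the Flo/VFlo aliases, divides by n - 1, and returns 0 for fewer than two samples. All of those choices are assumptions, as is the name stdev_single_pass:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Single-loop sample standard deviation on floats.
// Accumulates sum and sum of squares in one pass, then applies
// variance = (sq_sum - sum^2 / n) / (n - 1).
float stdev_single_pass(const std::vector<float>& v) {
    const std::size_t n = v.size();
    if (n < 2) {
        return 0.0f;  // assumption: too few samples yields 0
    }
    float sum = 0.0f;
    float sq_sum = 0.0f;
    for (float f : v) {
        sum += f;
        sq_sum += f * f;
    }
    const float nf = static_cast<float>(n);
    const float var = (sq_sum - sum * sum / nf) / (nf - 1.0f);
    // Rounding can push var slightly below zero; clamp before the sqrt.
    return std::sqrt(var > 0.0f ? var : 0.0f);
}
```

This trades numerical robustness for speed: it has the same cancellation risk as the naive formula, and float precision makes that more visible.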
Improving on the answer by musiphil, you can write a standard deviation function without the temporary vector diff, just using a single inner_product call with the C++11 lambda capabilities.
I suspect doing the subtraction multiple times is cheaper than using up additional intermediate storage, and I think it is more readable, but I haven't tested the performance yet.
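A sketch of how that single call might look, assuming a non-empty vector and division by N (use v.size() - 1 for the sample form). The name stdev_inner_product is illustrative:

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Standard deviation without a temporary vector: the six-argument
// std::inner_product takes a custom "sum" and a custom "product".
// Assumes v is non-empty; divides by N (population standard deviation).
double stdev_inner_product(const std::vector<double>& v) {
    const double mean = std::accumulate(v.begin(), v.end(), 0.0) / v.size();
    const double sq_sum = std::inner_product(
        v.begin(), v.end(), v.begin(), 0.0,
        [](double a, double b) { return a + b; },                         // accumulate
        [mean](double a, double b) { return (a - mean) * (b - mean); });  // squared deviation
    return std::sqrt(sq_sum / v.size());
}
```

Both ranges are the same vector, so each "product" is (x - mean) squared, and the subtraction is done twice per element instead of being stored.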
Using accumulators is the way to compute means and standard deviations in Boost.