I am interested in generating an array(or numpy Series) of length N that will exhibit specific autocorrelation at lag 1. Ideally, I want to specify the mean and variance, as well, and have the data drawn from (multi)normal distribution. But most importantly, I want to specify the autocorrelation. How do I do this with numpy, or scikit-learn?
Just to be explicit and precise, this is the autocorrelation I want to control:
numpy.corrcoef(x[0:len(x) - 1], x[1:])[0][1]
If you are interested only in the auto-correlation at lag one, you can generate an auto-regressive process of order one with the parameter equal to the desired auto-correlation; this property is mentioned on the Wikipedia page, but it's not hard to prove it.
Here is some sample code:
The parameter
corr
lets you set the desired auto-correlation at lag one and the optional parameters,mu
andsigma
, let you control the mean and standard deviation of the generated signal.