I used the following to run `forecast::Acf` over about 200 columns. Now I would like to generate a boxplot showing the distribution of correlation values at lags 1 to 36.
## a simple example
d <- data.frame(ts1 = rnorm(100), ts2 = rnorm(100))
acorr <- apply(d, 2, Acf)
What I now want is a boxplot where the x-values are 1, 2 and the y-values are the ACF for `ts1` and `ts2`.
Suppose you have multiple time series stored in a data frame `d` (each column is one series). You can use the following to obtain the ACF up to lag 36 (this only makes sense if `nrow(d)` is much larger than 36):
## for data frame `d`
acfs <- sapply(d, function (u) c(acf(u, lag.max = 36, plot = FALSE)$acf)[-1])
- The built-in function `acf` (from the stats package) is sufficient for the job; set `lag.max = 36` and `plot = FALSE`;
- `acf` returns a list, and we want its `$acf` field. Note that this is a 3D array, so we flatten it into a vector using `c()`;
- The ACF at lag 0 is always 1 and is not of interest, so we drop it with `[-1]`;
- `sapply` returns a matrix, with each column giving the ACF of one series.
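To make the shapes in the points above concrete, here is a minimal sketch on toy data. The three-column data frame and the seed are made up for illustration; only `acf`, `c()` and `sapply` come from the answer.

```r
set.seed(1)  # hypothetical toy data: three series of length 100
d <- data.frame(ts1 = rnorm(100), ts2 = rnorm(100), ts3 = rnorm(100))

## a single series: `$acf` is a 3D array with one row per lag (0 to 36)
a <- acf(d$ts1, lag.max = 36, plot = FALSE)$acf
dim(a)       # 37 1 1
a[1, 1, 1]   # ACF at lag 0, which is 1

## flatten, drop lag 0, and collect one column per series
acfs <- sapply(d, function (u) c(acf(u, lag.max = 36, plot = FALSE)$acf)[-1])
dim(acfs)    # 36 3
```

So row `i` of `acfs` holds the lag-`i` autocorrelation of every series, and each column holds one series.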
In case you have time series stored in a matrix (either an ordinary matrix or one with "mts" class), we use `apply` rather than `sapply`:
## for matrix `d`
acfs <- apply(d, 2L, function (u) c(acf(u, lag.max = 36, plot = FALSE)$acf)[-1])
To produce a boxplot, simply use:
boxplot(acfs)
Why is `$acf` a 3D array? Because the `acf` function can handle multiple time series directly. Try:
## whether `d` is data frame or matrix, it is converted to "mts" inside `acf`
oo <- acf(d, lag.max = 36, plot = FALSE)$acf
The problem is, in this case, cross-correlation (CCF) is also computed.
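If you do go the multivariate route, the autocorrelations should sit on the "diagonal" of the 3D array (element `[lag, i, i]`), with the cross-correlations off the diagonal. The sketch below, on made-up data, pulls out that diagonal and compares it with the column-by-column result. The name `acfs_diag` is only for illustration, and this computes many cross-correlations that are then discarded, so the `apply` version above is the cheaper choice for 200 series.

```r
set.seed(1)  # hypothetical toy data: three series of length 100
d <- matrix(rnorm(300), ncol = 3,
            dimnames = list(NULL, c("ts1", "ts2", "ts3")))

## multivariate call: the array is (lags 0 to 36) x series x series
oo <- acf(d, lag.max = 36, plot = FALSE)$acf
dim(oo)   # 37 3 3

## keep only [lag, i, i] (autocorrelation of series i), dropping lag 0
acfs_diag <- sapply(seq_len(ncol(d)), function (i) oo[-1, i, i])

## column-by-column version from above, for comparison
acfs <- apply(d, 2L, function (u) c(acf(u, lag.max = 36, plot = FALSE)$acf)[-1])
all.equal(acfs_diag, acfs, check.attributes = FALSE)
```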
On the x-axis I want 1-36, not `ts1` and `ts2`. I need the distribution of the ACF at each lag across the time series. If you can fix that, your answer is very good.
Eh? Did I misread your question? Well, in that case, just `boxplot` the transpose of `acfs`:
boxplot(t(acfs))
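For completeness, a small end-to-end sketch with the lags shown on the x-axis. The 20 simulated series, the axis labels and the zero reference line are illustrative additions, not part of the original code. Setting row names on `acfs` means the transposed matrix has columns named 1 to 36, which `boxplot` uses as box labels.

```r
set.seed(1)  # hypothetical toy data: 20 series of length 100
d <- as.data.frame(matrix(rnorm(100 * 20), nrow = 100))

acfs <- sapply(d, function (u) c(acf(u, lag.max = 36, plot = FALSE)$acf)[-1])
rownames(acfs) <- 1:36   # label each row by its lag

## one box per lag; each box summarises that lag across all series
boxplot(t(acfs), xlab = "lag", ylab = "ACF")
abline(h = 0, lty = 2)   # reference line at zero correlation
```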