Produce a boxplot for multiple ACFs

Posted 2019-07-23 15:02

Question:

I used the following to run forecast::Acf over about 200 columns. Now I would like to generate a boxplot showing the distribution of the correlation values at lags 1 to 36.

## a simple example
library(forecast)  # provides Acf
d <- data.frame(ts1 = rnorm(100), ts2 = rnorm(100))
acorr <- apply(d, 2, Acf)

What I now want is a boxplot where x-values are 1,2 and the y-values are ACF for ts1 and ts2.

Answer 1:

Suppose you have multiple time series stored in a data frame d, with each column being one series. You can use the following to obtain the ACF up to lag 36 (this only makes sense if nrow(d) is much larger than 36!):

## for data frame `d`
acfs <- sapply(d, function (u) c(acf(u, lag.max = 36, plot = FALSE)$acf)[-1])
  • The base R function acf is sufficient for the job; set lag.max = 36 and plot = FALSE;
  • acf returns a list (an object of class "acf"), and we want its $acf field. Note that this is a 3D array, so we flatten it into a vector using c();
  • The ACF at lag 0 is always 1 and is not of interest, so we drop it with [-1];
  • sapply returns a matrix in which each column holds the ACF of one series.
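To see what these steps produce, here is a minimal sketch on simulated data. The seed and the two toy series are my own assumptions for illustration, not the asker's data:

```r
set.seed(1)  # arbitrary seed, only so the simulated values are reproducible
d <- data.frame(ts1 = rnorm(100), ts2 = rnorm(100))

## a single series: $acf is an array with dimensions (lags 0..36) x 1 x 1
a <- acf(d$ts1, lag.max = 36, plot = FALSE)
dim(a$acf)
a$acf[1, 1, 1]  # the lag-0 value that [-1] removes

## all series: one row per lag (1..36), one column per series
acfs <- sapply(d, function (u) c(acf(u, lag.max = 36, plot = FALSE)$acf)[-1])
dim(acfs)
colnames(acfs)
```

So each row of acfs corresponds to a lag and each column to a series, which matters when deciding what goes on the x-axis of the boxplot.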

In case you have time series stored in a matrix (either an ordinary matrix or one with "mts" class), we use apply rather than sapply:

## for matrix `d`
acfs <- apply(d, 2L, function (u) c(acf(u, lag.max = 36, plot = FALSE)$acf)[-1])

To produce a boxplot, simply use:

boxplot(acfs)


Why is $acf a 3D array? Because the acf function can handle multiple time series directly. Try:

## whether `d` is data frame or matrix, it is converted to "mts" inside `acf`
oo <- acf(d, lag.max = 36, plot = FALSE)$acf

The problem is that, in this case, the cross-correlation (CCF) between every pair of series is computed as well.
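A small sketch of how that 3D array is laid out, again on simulated toy data (the seed and series are assumptions). The autocorrelations sit on the "diagonal" oo[, i, i], and the off-diagonal slices oo[, i, j] are the cross-correlations:

```r
set.seed(1)  # arbitrary seed for reproducibility
d <- data.frame(ts1 = rnorm(100), ts2 = rnorm(100))

oo <- acf(d, lag.max = 36, plot = FALSE)$acf
dim(oo)  # (lags 0..36) x (number of series) x (number of series)

## keep only the autocorrelations, dropping lag 0
auto <- sapply(seq_len(ncol(d)), function (i) oo[-1, i, i])
colnames(auto) <- names(d)

## this should agree with the column-by-column approach
acfs <- sapply(d, function (u) c(acf(u, lag.max = 36, plot = FALSE)$acf)[-1])
all.equal(auto, acfs)
```

With about 200 series the array would hold 200 x 200 slices, most of them cross-correlations you do not need, which is why looping over columns is the lighter option here.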


On the x-axis I want 1-36, not ts1 and ts2. I need the distribution of the ACF at each lag across the time series. If you can fix that, your answer is very good.

Eh? Did I misread your question? Well, in that case, you just boxplot the transpose of acfs:

boxplot(t(acfs))
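Putting it together, a complete sketch might look like the following. The 20 simulated white-noise series, the seed, and the axis labels are assumptions for illustration; swap in your own data frame for d:

```r
set.seed(1)  # arbitrary seed for reproducibility
## 20 simulated series of length 100, one per column (stand-in for real data)
d <- as.data.frame(matrix(rnorm(100 * 20), nrow = 100))

## rows = lags 1..36, columns = series
acfs <- sapply(d, function (u) c(acf(u, lag.max = 36, plot = FALSE)$acf)[-1])
rownames(acfs) <- 1:36  # these become the box labels after transposing

## boxplot on a matrix draws one box per column, so transpose to get one box per lag
boxplot(t(acfs), xlab = "lag", ylab = "ACF")
abline(h = 0, lty = 2)  # reference line at zero correlation
```

Each box then summarises the ACF values of all series at one lag.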