I have a time-series panel dataset which is structured in the following way:
df <- data.frame(
year = c(2012L, 2013L, 2014L, 2012L, 2013L, 2014L),
id = c(1L, 1L, 1L, 2L, 2L, 2L),
c = c(11L, 13L, 13L, 16L, 15L, 15L)
)
#> year id c
#> 1 2012 1 11
#> 2 2013 1 13
#> 3 2014 1 13
#> 4 2012 2 16
#> 5 2013 2 15
#> 6 2014 2 15
I would like to find the cross-correlation between the values in column c across id numbers, something similar to this:
#> 1 2
#> 1 1 0.8
#> 2 0.8 1
I have been using the dplyr package to find the cross-correlation between two variables in my panel data, but for some reason I can't do the same for the cross-correlation of one variable grouped by id.
If you are already using tidyverse tools, you should try widyr. Its functions reshape to wide, get the correlations, and give you back a tidy data frame again.
(Note: I changed the sample data slightly to match akaDrHouse's answer.)
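A minimal sketch of the widyr approach, assuming `pairwise_cor()` with `id` as the item, `year` as the feature that links observations, and `c` as the value. It uses the `df` from the question as posted rather than the modified sample data, so treat it as an illustration and not the exact code of this answer.

```r
library(widyr)

# Sample data as posted in the question
df <- data.frame(
  year = c(2012L, 2013L, 2014L, 2012L, 2013L, 2014L),
  id   = c(1L, 1L, 1L, 2L, 2L, 2L),
  c    = c(11L, 13L, 13L, 16L, 15L, 15L)
)

# item    = id   (the things to correlate with each other)
# feature = year (what lines the observations up)
# value   = c    (the measurements)
# upper = TRUE keeps both (1, 2) and (2, 1) in the result
res <- pairwise_cor(df, id, year, c, upper = TRUE)
res
```

The result is a tidy data frame with columns `item1`, `item2`, and `correlation`, one row per ordered pair of ids.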
Do you mean something like the following? I used the reshape package to cast based on the value of your id, followed by the cor() function in base R.
@Henrik's comment was much simpler and more elegant, so I am including it here.
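A minimal sketch of the cast-then-`cor()` idea in base R only. It assumes `tapply()` as a stand-in for the reshape package's cast step and is not necessarily what @Henrik's comment proposed.

```r
# Sample data as posted in the question
df <- data.frame(
  year = c(2012L, 2013L, 2014L, 2012L, 2013L, 2014L),
  id   = c(1L, 1L, 1L, 2L, 2L, 2L),
  c    = c(11L, 13L, 13L, 16L, 15L, 15L)
)

# Cast to wide: one row per year, one column per id.
# Each (year, id) cell holds exactly one value, so mean() just returns it.
wide <- tapply(df$c, list(year = df$year, id = df$id), mean)

# cor() on a matrix correlates its columns pairwise,
# giving an id-by-id correlation matrix.
cor_mat <- cor(wide)
cor_mat
```

With the question's data the off-diagonal entries come out as -1, because the two series move in exactly opposite directions. The 0.8 in the question is only an illustration of the desired layout.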