converting a data frame to monthly time series

2020-03-15 05:18发布

问题:

I have a data frame of a monthly data for 100 yrs (1200 data points) with the months in columns and years in the rows. I want to convert it into a monthly time series and I have tried several ways, none of which create the correct "temporal" structure.

The problem lies with R considering the data frame as a 100 observations (years) of 12 variables (the months). Here is a reproducible code for my latest try:

set.seed(12)
dummy.df <- as.data.frame(matrix(round(rnorm(1200),digits=2),nrow=100,ncol=12))
rownames(dummy.df) <- seq(from=1901, to=2000)
colnames(dummy.df) <- c("jan","feb","mar","apr","may","jun","jul","aug","sep","oct","nov","dec")
dummy.df.ts <- ts(as.vector(as.matrix(dummy.df)), start=c(1901,1), end=c(2000,12), frequency=12)

In the "dummy.df.ts" object, the rows and columns are switched and instead of sequential observations in columns, all the januarys februarys etc are stacked together one after the other. How can I get to the correct temporal structure?

An example of my data: these are monthly temperature values from 1901 - 1905

fr.monthly.temp.sample  

     JAN FEB MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT NOV DEC  
1901 2.7 0.4 4.7 10.0 13.0 16.9 19.2 18.3 15.7 10.6 4.9 3.5  
1902 4.1 3.2 7.5 10.3 10.0 15.1 18.2 17.4 15.0 10.2 6.3 3.5  
1903 3.8 5.9 7.6  7.1 12.9 14.9 17.6 17.3 15.5 12.1 6.9 2.7  
1904 3.0 4.6 5.5 10.3 13.6 16.3 20.2 18.5 13.9 11.2 5.4 4.8  
1905 1.7 4.0 7.4  9.3 11.9 16.5 20.0 17.6 14.7  8.4 5.5 3.8  

And by using this ts() call:

fr.monthly.temp.sample.ts <- ts(as.vector(as.matrix(fr.monthly.temp.sample)),                              start=c(1901,1), end=c(1905,12), frequency=12)

This is the output I get for the time series object:

fr.monthly.temp.sample.ts  

      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec  
1901  2.7  4.1  3.8  3.0  1.7  0.4  3.2  5.9  4.6  4.0  4.7  7.5  
1902  7.6  5.5  7.4 10.0 10.3  7.1 10.3  9.3 13.0 10.0 12.9 13.6  
1903 11.9 16.9 15.1 14.9 16.3 16.5 19.2 18.2 17.6 20.2 20.0 18.3  
1904 17.4 17.3 18.5 17.6 15.7 15.0 15.5 13.9 14.7 10.6 10.2 12.1  
1905 11.2  8.4  4.9  6.3  6.9  5.4  5.5  3.5  3.5  2.7  4.8  3.8  

--Note the changed temporal structure (values from the columns are now in the rows..)--

Thanks.

回答1:

Solution 1

You could transpose (function t()) the matrix before vectorizing it:

set.seed(12)
dummy.df <- as.data.frame(matrix(round(rnorm(1200), digits = 2),
                                 nrow = 100, ncol = 12))
rownames(dummy.df) <- seq(1901, 2000)
colnames(dummy.df) <- month.abb
dummy.df.ts <- ts(as.vector(t(as.matrix(dummy.df))), 
                  start=c(1901,1), end=c(2000,12), frequency=12)

Solution 2

You could melt the data, order by date, then apply the ts() function.

Here's the data setup. If your language setting is English you could save some code by using month.abb, but that is not robust to other language locales.

set.seed(12)
dummy.df <- as.data.frame(matrix(round(rnorm(1200),digits=2),nrow=100,ncol=12))
months <- format(seq.Date(as.Date("2013-01-01"), as.Date("2013-12-01"), 
                          by = "month"), format = "%b")
colnames(dummy.df) <- months
dummy.df$Year <- seq(1901, 2000) # set as variable, not as rownames 

Melt the data so you have a data frame with 1200 rows, each representing an observation:

library("reshape2")
dummy.df <- melt(dummy.df, id.vars = "Year")

Order the observations by date:

dummy.df$Date <- as.Date(paste(dummy.df$Year, dummy.df$variable, "01", sep = "-"),
                         format = ("%Y-%b-%d"))
dummy.df <- dummy.df[order(dummy.df$Date), ]

Then you can apply a similar ts() call, with the ts object showing the desired order:

dummy.df.ts <- ts(dummy.df$value, start=c(1901,1), end=c(2000,12), frequency=12)